The Global Mental Health Databank – Practicing Better Science Together

At Sage, we build responsible practices for data sharing in health research. By combining policy and technology, we work to ensure data can be safely used across institutes. We do this because broad data resources and interdisciplinary teams are necessary to understand the complexities of human health, and we need to use this kind of information to improve people’s experiences in the health system.

We have been successful in stimulating collaborative science across teams of researchers. But we recognize that our approaches to data sharing haven’t included everyone who should be part of the conversation. Patient advocates and community-based organizations have asked us for many years why we don’t prioritize participant involvement. Indeed, there is a wide range of research opportunities that are only possible with direct involvement of the individuals contributing the data. We haven’t had a good answer.

Researchers simply can’t keep excluding those who provide the data. While we at Sage are partial to the ethical arguments for this, there are plainly scientific arguments as well. As data collection creeps out of the lab and into our homes, researchers will not recruit and engage diverse cohorts if the people in the study are treated as subjects, not partners.

In our own work we focus on “real-world evidence” – how to collect, govern, and analyze a wide range of extremely personal data about our everyday lives, including medical care, daily habits, self-management practices and lived experience. What is the value proposition that would convince anyone to contribute these kinds of data to research? How broadly are people willing to distribute data collected about their daily lives? How do these tradeoffs look to people who aren’t traditionally asked their opinion about tradeoffs?

The challenge then is to create a data governance system that empowers people to be active partners in managing the way that their data is collected, shared, or used in research.

This is why I am so excited about the new partnership we’ve just formed with Miranda Wolpert and the Mental Health Priority Area at Wellcome Trust. We’re going to build – and test – that data governance system.

The Global Mental Health Databank project seeks to research strategies – active ingredients – that youth around the world can use to self-manage anxiety and depression, to develop a system that guides youth to those strategies that they are most likely to find useful. To be clear, our project is to design the governance for this kind of databank. We’ll be testing out the ways that participants want to govern such a databank – we are not conducting the research on mental health!

Doing this will require active partnership with both researchers and youth to help collect the data needed to answer the question of “what works for whom and why” for mental health management. A data sharing system designed for these purposes must meet the needs both of youth with lived experience in mental health and of mental health researchers. Participants and researchers are often interested in different questions – both of which have value. Just as researcher-led programs have often overlooked participant needs and interests, participant-led programs may miss subtleties of statistics and biology that are important to researchers.

Over the next two years, we will work together with youth and researchers to evaluate the feasibility of successfully implementing a participant-led databank for global mental health research that enables the collection, sharing and use of data from youth across three countries – South Africa, India, and the United Kingdom. We will run this study as an experiment in participant-led data governance designed to address several key questions: What value do youth find in participating in this kind of databank and how does that vary across individuals? Is youth involvement in research impacted by their control over how their data is collected, shared, and used? What levels of oversight do they wish to have? Do these considerations have an impact on what types of data they are willing to contribute? What do they wish to do with these data and what support do they need to achieve their goals? How do the preferences of youth and the preferences of researchers intersect?

I am thankful to be collaborating with a team of amazing researchers with deep expertise in youth mental health: Drs. Zuki Zingela and Melvyn Freeman in South Africa, Dr. Soumitra Pathare in India, and Dr. Tamsin Ford and Dr. Mina Fazel in the UK. These individuals work directly with school-based youth on the management of mental health. Together, we will evaluate the interests of youth in engaging with a databank program of this sort. We are also joined by two research teams from the University of Washington, led by Drs. Pamela Collins and Pat Areán, who have expertise in global mental health collaboratives and in digital mental health assessment, respectively.

Our team is committed to embarking on this journey in partnership with youth. To build a system that meets the needs of youth and researchers, we need both perspectives to be involved right from the beginning. To this end, our first action has been to hire young adults onto the team to co-develop this work with these researchers. They will be supported by a series of panels of youth with lived experience of mental health in each country who can provide a diverse set of perspectives to inform the project.

Stay tuned!

Collaborating with youth is key to studying mental-health management

Sage Bionetworks leads international feasibility study to identify core design components to build the Global Mental Health Databank with youth participants


SEATTLE, Nov. 18, 2020 – Young people around the world commonly experience anxiety and depression, but it can be hard to identify how each person can best manage their own mental health. The Global Mental Health Databank, a feasibility study officially launching today, hopes to change that by enabling youth from the United Kingdom, South Africa, and India to work directly with mental health researchers to better understand how young people can manage their mental health.

Sage Bionetworks is leading an international group of researchers from Oxford University, University of Cambridge, University of Washington, Walter Sisulu University, Higher Health, and the Centre for Mental Health Law & Policy at the Indian Law Society in this effort to shift how a mental health databank could be developed and structured. This project is funded by the mental health area team at the Wellcome Trust as key infrastructure necessary to enable their work to identify the next generation of treatments and approaches to prevent, intervene, manage and stop relapse of anxiety and depression in young people.

This project will work directly with youth and researchers to build the blueprint for a global mental health program that directly collects data and provides insights to youth around the world. We will test how youth wish to interact with and use this system to advance understanding of mental health.

“We are excited to create a system that supports both youth and researchers in understanding mental health management strategies,” said Dr. Lara Mangravite, president of Sage Bionetworks. “We think it’s essential to start by developing a system that empowers young people to directly guide how their data is collected, shared and used.”

Partnering with Youth

Relying on mobile phones and other connected technologies, the study will collect data from youth participants about their lived experience with mental health self-management. Collecting such data, which requires a strong partnership between youth and researchers, will provide insight about how a person’s daily activities and surroundings affect their health and the success of their health-management strategies. For example, can changes in sleep habits, social interactions or financial security help mitigate anxiety?

“We look forward to rich learning as to how to balance the best ways to ensure those banking their data have maximal control and privacy, with the wish to allow diverse scientists to have ready access to data to advance understanding of the active ingredients that help address youth anxiety and depression globally,” said Professor Miranda Wolpert, MBE, Head of the Mental Health Priority Area at Wellcome.

Leveraging existing technologies and expertise, Sage will also be testing the ability to operate a program of this nature at scale with the varied data privacy regulations of participating countries.

“India has the world’s largest population of adolescents and young people. Therefore, it is important that India is a part of such global initiatives to solve global problems affecting all of humanity, especially young people in low- and middle-income countries,” said Dr. Soumitra Pathare, Director, Centre for Mental Health Law & Policy, ILS, Pune, India. “We hope that Indian researchers will also find this project of value in helping them solve these mental health problems in the India context.”

“This work provides an extraordinary opportunity to engage culturally and contextually diverse groups of young people and researchers on critical questions for youth mental health. This is fundamental to strengthening the science of global mental health and ensuring that the solutions are informed by specific needs,” said Dr. Pamela Collins, professor of psychiatry and behavioral sciences, professor of global health, and director of the UW Global Mental Health Program.

Connected Technologies

The team from the University of Washington will bring their deep experience in working with people coping with mental health challenges and how connected technologies can help assess mental health.

“Nearly every young person in the world has access to connected technologies and these technologies provide a window into their social, physical and emotional lives,” said Dr. Patricia Areán, professor of psychiatry and behavioral sciences at the University of Washington School of Medicine. “By banking this data, we hope to provide the opportunity to discover which strategies do and do not help them manage their mental health.”

The project envisions the databank as a platform that connects participants and researchers interested in studying the effects of contextual determinants, experiences, behaviors, and interventions in a real-world setting. To be successful, this platform must be technically feasible, beneficial to both data contributors and researchers, and it must operate under parameters that promote data justice.

“Data analysis should be the bedrock on which policy and practice rest, and it is therefore essential that we as researchers engage in dialogue with young people about how their data could support population mental health and ensure that their concerns about privacy and confidentiality are enshrined in governance procedures,” said Dr. Tamsin Ford, Professor of Child and Adolescent Psychiatry at the University of Cambridge and Honorary Consultant at Cambridge and Peterborough NHS Foundation Trust.

“While this is still at early stages, we hope that this work will ultimately help to practically change lives globally,” said Dr. Melvyn Freeman, of Higher Health. “I am personally hopeful, and quietly confident, that this databank will become the cog around which research into depression and anxiety in young people will turn in the future, and hence make a big difference to youth well-being.”

Media contacts:

Hsiao-Ching Chou

Meera Damji

Craig Brierley; Laura Marshall

Sage Bionetworks: Sage Bionetworks is a nonprofit biomedical research and technology development organization that was founded in Seattle in 2009. Our focus is to develop and apply open practices to data-driven research for the advancement of human health. Our interdisciplinary team of scientists and engineers works together to provide researchers access to technology tools and scientific approaches to share data, benchmark methods, and explore collective insights, all backed by Sage’s gold-standard governance protocols and commitment to user-centered design. Sage is a 501(c)(3) and is supported through a portfolio of competitive research grants, commercial partnerships, and philanthropic contributions.

Centre for Mental Health Law & Policy, ILS, Pune, India: The Centre for Mental Health Law & Policy aims to protect and promote the rights of persons with psychosocial disabilities using a rights-based approach to mental health through law and policy reform; implementation research; community capacity building; strategic litigation & strengthening public mental health systems, peer support; youth mental health and training & education. CMHLP works with different stakeholders including service users, mental health professionals, policymakers, civil society organisations and researchers both nationally and internationally with a specific focus on vulnerable and marginalised populations in low- and middle-income countries (LMICs).

About the University of Cambridge: The mission of the University of Cambridge is to contribute to society through the pursuit of education, learning and research at the highest international levels of excellence. To date, 110 affiliates of the University have won the Nobel Prize. Founded in 1209, the University comprises 31 autonomous Colleges and 150 departments, faculties and institutions. Cambridge is a global university. Its 19,000 student body includes 3,700 international students from 120 countries. Cambridge researchers collaborate with colleagues worldwide, and the University has established larger-scale partnerships in Asia, Africa and America. The University sits at the heart of the ‘Cambridge cluster’, which employs more than 61,000 people and has in excess of £15 billion in turnover generated annually by the 5,000 knowledge-intensive firms in and around the city. The city publishes 316 patents per 100,000 residents.

UW Medicine in Seattle: UW Medicine is one of the top-rated academic medical systems in the world. With a mission to improve the health of the public, UW Medicine educates the next generation of physicians and scientists, leads one of the world’s largest and most comprehensive biomedical research programs, and provides outstanding care to patients from across the globe. The UW School of Medicine is second in the nation in federal research grants and contracts with $930.4 million in total revenue (fiscal year 2019) according to the Association of American Medical Colleges.

Oxford University: Oxford University has been placed number 1 in the Times Higher Education World University Rankings for the third year running, and at the heart of this success is our ground-breaking research and innovation. Oxford is world-famous for research excellence and home to some of the most talented people from across the globe. Our work helps the lives of millions, solving real-world problems through a huge network of partnerships and collaborations. The breadth and interdisciplinary nature of our research sparks imaginative and inventive insights and solutions. Through its research commercialisation arm, Oxford University Innovation, Oxford is the highest university patent filer in the UK and is ranked first in the UK for university spinouts, having created more than 170 new companies since 1988. Over a third of these companies have been created in the past three years.

HIGHER HEALTH: HIGHER HEALTH is an implementing agency of the Department of Higher Education and Training (DHET). HIGHER HEALTH is dedicated to promoting the health and wellbeing of nearly two million students in the post-school education system. Our structures, implementing programmes, campaigns and a wide spectrum of health, wellness and psychosocial services cover over 420 campus sites, and rural, informal and urban settings. Most of the students we work with are in the 15-24 year age group and most come from impoverished backgrounds.

Walter Sisulu University: Walter Sisulu University (WSU) is a university of technology and science in Mthatha, Eastern Cape, South Africa. This institution was founded in 2005 and offers quality education to over 24,000 students across its four campuses.

Uncovering Therapeutic Strategies for Neurofibromatosis Type 1

Drug discovery studies are challenging to conduct for rare diseases because there often isn’t enough relevant biological data. Small, or underpowered, datasets hinder the effectiveness of the statistical methods that researchers typically use to identify potential drug targets and to generate hypotheses for experimentation. But in neurofibromatosis type 1 (NF1) research, there has been an effort among patients, researchers, clinicians, and funding partners to increase the availability and accessibility of data. In a recent article published in the journal Genes, we share how we were able to apply sophisticated computational methods to an aggregated group of small NF1 datasets to generate insights about potential drug targets.

Title: Integrative Analysis Identifies Candidate Tumor Microenvironment and Intracellular Signaling Pathways that Define Tumor Heterogeneity in NF1
Journal: Genes
Authors: Jineta Banerjee, Robert J. Allaway, Jaclyn N Taroni, Aaron Baker, Xiaochun Zhang, Chang In Moon, Christine A. Pratilas, Jaishri O. Blakeley, Justin Guinney, Angela Hirbe, Casey S. Greene, and Sara JC Gosline

NF1 is a rare genetic disorder that affects over 2.5 million people globally. The disease is a result of mutations in the NF1 gene and it can cause heterogeneous tumors, including cutaneous neurofibromas (cNFs), plexiform neurofibromas (pNFs), and malignant peripheral nerve sheath tumors (MPNSTs). While there is an incredible amount of research on NF1, there are very few safe and effective drugs to treat the various types of NF1 tumors. But machine-learning methods can be powerful tools for accelerating drug discovery in NF1.

In the study, we relied on the NF community’s data-sharing efforts to identify important biological signatures in NF1 tumors. To do this, we applied machine learning methods that first learn biological patterns from large collections of data and then look for these patterns in different datasets, such as the ones that exist in NF1. We then characterized these patterns using statistical approaches and systems biology methods and were able to identify enrichment of signals related to immune cells as well as possible drug classes for follow-up in NF, pNF, and MPNST research. We further found that histone deacetylase (HDAC) inhibitors, which have been observed to work well in preclinical models of MPNSTs, may be worth exploring as a potential therapy for cNFs.

Our re-analysis of NF1 data in this study, enabled by access to data generated by the NF community researchers and encouraged by research-forward funding partners like Neurofibromatosis Therapeutic Accelerator Program (NTAP), showcases how shared data from various groups can together power sophisticated analyses which would otherwise not be possible for each of the datasets separately. For rare diseases, this approach is extremely valuable since patient data is sparse and precious. We hope that our efforts and the results showcased in this article will not only inform experimental researchers of probable hypotheses to test, but also encourage them to share their data more readily to power even more sophisticated analyses in the future.

Jineta Banerjee and Robert Allaway are co-lead authors on this study.



Sage Perspective: Retention in Remote Digital Health Studies

Editor’s note: This is a Twitter thread from John Wilbanks, Sage’s chief commons officer.


New from Abhishek Pratap and a few more of us – Indicators of retention in remote digital health studies: a cross-study evaluation of 100,000 participants

A few thoughts on the paper:

  1. Hurrah for data that’s open enough to cross-compare.
  2. When someone shows you overall enrollment in a digital health study, ask about engagement % on day 2. It’s a way better metric.
  3. Over-recruit the under-represented with intent from the start or your sample won’t be anywhere close to diverse enough.
  4. Design your studies for broad, shallow engagement – your protocol and analytics will be better matched.
  5. Pay for participation and clinician involvement make a huge difference. Follow @hollylynchez who writes very clearly on the payment topic.
  6. Clinician engagement is going to need some COI norms because whew it’s easy to see where that can go sideways.
  7. When your study is flattened down to an app on a screen, the competition is savage for attention and you’ll get deleted really quickly if there isn’t some sense of value emerging from the study.
  8. Meta-conclusion: perhaps start with the question: how does this give value to the participant when the app is in airplane mode?
  9. On “pay to participate” – the first time I ever talked to @FearLoathingBTX, he immediately foresaw studies providing a “free” phone for participation, but cutting service off for low engagement. That is, sadly, definitely on track absent some intervention.

Evaluation of Participation in Digital Health Studies

The widespread use of smartphones has offered a valuable opportunity to biomedical researchers. Using mobile apps, scientists are now able to design large-scale health research studies in a cost-effective way and, importantly, gather diverse real-world lived experiences of disease over time by recruiting participants from broader geographic regions – at least that is the hope. The real-world data (RWD) gathered through the health research apps also complements traditional research by capturing disease fluctuations at important moments that are often missed between periodic in-person clinic visits.

Title: Indicators of retention in remote digital health studies: a cross-study evaluation of 100,000 participants
Journal: Nature Digital Medicine
Authors: Abhishek Pratap, Elias Chaibub Neto, Phil Snyder, Carl Stepnowsky, Noémie Elhadad, Daniel Grant, Matthew H. Mohebbi, Sean Mooney, Christine Suver, John Wilbanks, Lara Mangravite, Patrick J. Heagerty, Pat Areán, and Larsson Omberg

In the last five years, several digital health studies, including remote interventions and clinical trials, have been conducted using smartphone technology. Despite the success where researchers were able to enroll thousands of research participants in a short amount of time, participant retention and long-term engagement in fully remote research remain a significant barrier for generating robust real-world evidence from RWD. In the study Indicators of retention in remote digital health studies: a cross-study evaluation of 100,000 participants, published in the journal Nature Digital Medicine on Feb. 17, researchers pooled data from eight digital health studies across nearly 110,000 study participants and discovered key factors that affect participant retention.

To avoid collecting biased real-world data there is an urgent need to assess underlying patterns in people’s participation in fully remote studies. If you can’t measure it, you can’t fix it.

A screenshot of a table that shows data from a collection of 8 digital health studies. The table comes from the paper Indicators of retention in remote digital health studies: a cross-study evaluation of 100,000 participants.

The study compiled user-engagement data from eight digital health studies that targeted different diseases, including asthma, endometriosis, heart disease, depression, sleep health, and neurological diseases. The compilation of individualized user-level engagement data is one of the largest and most diverse user-engagement datasets to date and has been made publicly available for the broad research community. The data analysis surfaced two key results: 1) half of the participants dropped out of studies within the first week, and 2) most studies ultimately weren’t able to recruit demographically or geographically representative participants.

Despite the limitations, several factors, such as partnerships with clinicians and providing research participants fair compensation for their time in the study, could help researchers retain diverse participants in future digital health studies. Unsupervised analysis of engagement data also revealed broadly consistent underlying patterns of participation in remote research. App-usage behavior fell into four clusters, with distinct differences that have semantic and demographic ramifications.

The insights from this research have the potential to inform user enrollment and engagement strategies for improving retention and engagement in future digital health studies.

Voices from the Open Science Movement

Open science is an umbrella term used by many people to represent a diverse set of research methods designed to increase the speed, value, and reproducibility of scientific output. In general, these approaches work to achieve their goals through increased sharing of research assets or transparency in research methods. Our own work in this field promotes the effective integration of computational analysis into the life sciences. The challenge: While the advancements in technology now support the generation and analysis of large-scale biological data from human samples in a way that can meaningfully expand our understanding of human biology, the established processes for implementation and independent evaluation of research outcomes are not particularly well suited to these emerging forms of scientific inquiry.

The scientific method defines a scientist as one who seeks new knowledge by developing and testing hypotheses. In this frame, the scientist is a neutral observer who is equally satisfied when a hypothesis is either proven or disproven. However, as with any application of theory, the practical implementation of the scientific method is impacted by the conditions in which it is applied.

A Different Era

The U.S. scientific system has a well-established set of processes that were developed in post-war America with the intention of advancing the kinds of science of that era. This system promotes the pursuit of scientific inquiry within the context of research universities, using funding from the government and distributing knowledge across the research community through papers published in journals and patents acquired by technology transfer offices. While this system can be highly effective, it also incentivizes scientists in a manner that directly impacts the outputs of their work.

The current scientific system rewards our scientists for new discoveries. This is the criterion that is used to gate their ability to obtain funding, their ability to advance their own careers and those of their colleagues, and, in some cases, their ability to remain employed. For this reason, we sometimes skew our experiments towards those that prove rather than disprove the hypothesis. We fall into self-assessment bias, in which we tend to overvalue the impact and validity of our own outputs.

Now, all is not lost: we have a well-established system of peer review that uses independent evaluation to assess the appropriateness of research conclusions. To this aim, we, as a community, are meant to evaluate the evidence presented, determine the validity of an experiment, and understand how that experiment may support the general hypothesis. The task of turning an individual observation into general knowledge may be led by an individual scientific team, but it is the responsibility of the entire field.

Growing Pains

This system is noble and often quite effective. It’s also been strained by the scale and complexity of the research that is currently being pursued – including the integration of computational sciences into biology. The system is so good at encouraging publication in peer-reviewed journals that more than 400,000 papers on biology were published in 2017. This causes a host of problems.

First, it’s a strain for anyone in the scientific community to balance the time it takes to perform this important task with many other demands. Second, the complexity of our modern experiments is not easily conveyed within the traditional means for scholarly communication, making it difficult for independent scientists to meaningfully evaluate each experiment. Third, the full set of evidence needed to evaluate a general hypothesis is usually spread across a series of communications, making it difficult to perform independent evaluation at the level of that broader hypothesis.

This last point can be particularly problematic as conflicting evidence can arise across papers in a manner that can be difficult to support through comparative evaluation. These issues have exploded into a much-publicized replication crisis, making it hard to translate science into medicine.

Open Methods

So what does this all have to do with open science? The acknowledgement of these imperfections in our current system has led to a desire – across many fronts – for an adapted system that can better solve these problems. Open science contains lots of elements of a new scientific system. For computational research in life sciences, it works on the cloud where we can document our experimental choices with greater granularity. It provides evidence of the scientific process that helps us decide which papers out of that 400,000 to trust – the ones where we can see the work, and the ones where machines can help us read them.

In our own work, we have seen how the use of open methods can increase the justification of research claims. Working inside a series of scientific programs, we have been able to extract general principles and solutions to support this goal. These are our interventions – ways to support scientists in making real progress towards well-justified research outcomes.

These approaches have been encouraged by the scientists, funders, and policy makers involved in these programs, who are seeking ways to increase the translational impact of their work. We have seen cases across the field where these approaches have allowed exactly that. But these are sometimes at odds with the broader system, causing conflict and reducing their adoption. It may be time to contemplate a more complete, systemic redesign of the life sciences that supports our scientists in their quest for knowledge and that has the potential to directly improve our ability to promote human health.