Open science is an umbrella term for a diverse set of research methods designed to increase the speed, value, and reproducibility of scientific output. In general, these approaches work to achieve their goals through increased sharing of research assets or transparency in research methods. Our own work in this field promotes the effective integration of computational analysis into the life sciences. The challenge: while advances in technology now support the generation and analysis of large-scale biological data from human samples in a way that can meaningfully expand our understanding of human biology, the established processes for implementation and independent evaluation of research outcomes are not particularly well suited to these emerging forms of scientific inquiry.
The scientific method defines a scientist as one who seeks new knowledge by developing and testing hypotheses. In this frame, the scientist is a neutral observer who is equally satisfied when a hypothesis is either proven or disproven. However, as with any application of theory, the practical implementation of the scientific method is impacted by the conditions in which it is applied.
A Different Era
The U.S. scientific system has a well-established set of processes that were developed in post-war America with the intention of advancing the kinds of science of that era. This system promotes the pursuit of scientific inquiry within the context of research universities, using funding from the government and distributing knowledge across the research community through papers published in journals and patents acquired by technology transfer offices. While this system can be highly effective, it also incentivizes scientists in a manner that directly impacts the outputs of their work.
The current scientific system rewards our scientists for new discoveries. This is the criterion that is used to gate their ability to obtain funding, their ability to advance their own careers and those of their colleagues, and, in some cases, their ability to remain employed. For this reason, we sometimes skew our experiments towards those that prove rather than disprove the hypothesis. We fall into self-assessment bias, in which we tend to overvalue the impact and validity of our own outputs.
Now, all is not lost: we have a well-established system of peer review that uses independent evaluation to assess the appropriateness of research conclusions. To this end, we as a community are meant to evaluate the evidence presented, determine the validity of an experiment, and understand how that experiment may support the general hypothesis. The task of turning an individual observation into general knowledge may be led by an individual scientific team, but it is the responsibility of the entire field.
This system is noble and often quite effective. It’s also been strained by the scale and complexity of the research that is currently being pursued, including the integration of computational sciences into biology. The system is so good at encouraging publication in peer-reviewed journals that more than 400,000 papers on biology were published in 2017. This causes a host of problems.
First, it’s a strain for anyone in the scientific community to balance the time it takes to review papers with their many other demands. Second, the complexity of our modern experiments is not easily conveyed within the traditional means of scholarly communication, making it difficult for independent scientists to meaningfully evaluate each experiment. Third, the full set of evidence needed to evaluate a general hypothesis is usually spread across a series of communications, making it difficult to perform independent evaluation at the level of that broader hypothesis.
This last point can be particularly problematic, as conflicting evidence can arise across papers in a manner that is difficult to resolve through comparative evaluation. These issues have exploded into a much-publicized replication crisis, making it hard to translate science into medicine.
So what does this all have to do with open science? The acknowledgement of these imperfections in our current system has led to a desire, across many fronts, for an adapted system that can better solve these problems. Open science contains many elements of a new scientific system. For computational research in the life sciences, it works in the cloud, where we can document our experimental choices with greater granularity. It provides evidence of the scientific process that helps us decide which papers out of that 400,000 to trust: the ones where we can see the work, and the ones where machines can help us read them.
In our own work, we have seen how the use of open methods can increase the justification of research claims. Working inside a series of scientific programs, we have been able to extract general principles and solutions to support this goal. These are our interventions – ways to support scientists in making real progress towards well-justified research outcomes.
These approaches have been encouraged by the scientists, funders, and policy makers involved in these programs, who are seeking ways to increase the translational impact of their work. We have seen cases across the field where these approaches have allowed exactly that. But these approaches are sometimes at odds with the broader system, which causes conflict and reduces their adoption. It may be time to contemplate a more complete, systemic redesign of the life sciences that supports our scientists in their quest for knowledge and that has the potential to directly improve our ability to promote human health.