By Titus Brown
In my view, open science fundamentally depends on tools, infrastructure and practice for making the research process more open, transparent, and reproducible. Any progress on the open tooling and open practices front (almost) invariably redounds to the larger benefit of open science, and thus science more generally.
So at the recent Critical Assessment of Open Science (CAOS) meeting in New Orleans, I found myself a bit frustrated by the overall mood of doom and gloom. Sure, open science thinking has thus far failed to magically transform the scientific enterprise into a wonderland of openness and collaboration; the negatives of openness are becoming clearer as we explore them; and existing closed systems are surprisingly robust and adaptable in practice. But I think there’s lots of good news, too.
The good news is that, in the last 10 years, we have seen tremendous adoption of openness in scientific communities. For example, the widespread adoption of Jupyter and R notebook technologies means that data analysis workflows are being made explicit in a way that many can understand, share, and remix. Moreover, these open technologies are being incorporated into essentially every data science stack everywhere. Preprints in biology have taken off and there’s no going back. The majority of tools for bioinformatics are now open source. Sites like GitHub, Zenodo, Figshare, and the Open Science Framework make it trivial to share content, mint DOIs, and openly integrate digital artifacts into the literature. The rise of cloud means that, increasingly, workflows are portable between groups. FAIR data principles have taken off. And the Carpentries training community has spread like wildfire and teaches, as one of its underlying philosophies, more effective sharing through all of the above mechanisms.
But we don’t really stop and celebrate these wins in the open science community, because we’re relentlessly focused on the next steps. There’s plenty more to be done, and there are many disappointments and challenges, even with the successful approaches. The academic focus on unsolved problems prevents us from properly celebrating the amazing achievements that we’ve already got in the bag.
So, stop and smell the roses! Sit back and appreciate our wins, over a beverage of your choice, in a comfortable community space. And start every presentation and workshop with an optimistic statement about what has already worked. I’m not sure how else to best celebrate, but please consider this a call for suggestions.
With full awareness of the irony, I would like to now ask: what’s next? For me, one of the main challenges moving forward is how to more effectively spread the practices above. Scientific practice tends to shift slowly, for good and bad reasons. Can we accelerate adoption of open practices that demonstrably work?
To a large extent, I think adoption of more open practices is just going to happen: data science is an increasingly large, intrinsic part of science, and notebooks make too much sense to ignore. Preprints and open source are, likewise, deeply embedded in some fields and we just need to wait for the obstacles to retire. Sharing mechanisms aren’t going away. Cloud isn’t going away. FAIR is seeing adoption by funding agencies. And the training done by the Carpentries (and friends) seems increasingly likely to become embedded in undergraduate training, because it’s how data science is done.
But there are a lot of methodologies and practices that take a bit of work. For example, at a recent SIAM CSE minisymposium, many of the talks focused on the better ways we already have of working on and with software: we have good techniques for building and supporting software via community engagement, successful business models for long-term research software support, peer code review techniques that work, robust software citation mechanisms, and good continuous integration systems, with improvements on the way.
The main remaining challenge (in my view) is that of adoption: The future is already here – it’s just not evenly distributed. And distributing skills more evenly is hard, as is adapting them to the on-the-ground needs of each scientific community. In my experience, the most effective way of doing this is by fostering the organic development of communities of practice that adopt and solidify good practice, ultimately making this practice normative within their enclosing scientific communities.
So, what are my main takeaways? I’ll stick with three:
- Open has been really successful in ways that, 10 years ago, we would have found hard to believe. Celebrate!
- The leading edge of “open” has identified lots of good and effective practice. We should figure out how to spread and solidify this practice broadly, and not just work on the next exciting unsolved problem.
- It’s all about communities of practice, maaaan! Invest now! And let’s talk about how to make them more inclusive and welcoming!
Thanks to the CAOS organizers for running a great meeting, to the minisymposium speakers for their great talks, and especially to Daina Bouquin for the enthusiastic discussion about making good software citation behavior normative.
Dr. C. Titus Brown is an Associate Professor at the University of California, Davis. He runs the Data Intensive Biology Lab at UC Davis, where his team tackles questions surrounding biological data analysis, data integration, and data sharing.
About this series: In February 2019, Sage Bionetworks hosted a workshop called Critical Assessment of Open Science (CAOS). This series of blog posts by some of the participants delves into a few of the themes that drove discussions – and debates – during the workshop.
Voices from this series:
Introduction by Lara Mangravite and John Wilbanks
Voices From the Open Science Movement by Lara Mangravite and John Wilbanks
Recognizing the Successes of Open Science by Titus Brown
Do Scientists Reuse Open Data? by Irene Pasquetto
Open Science: To What End? by Cyndi Grossman
On the Ethics of Open Science by Nancy Kass
Bringing Open Science to Neuroinformatics by Helena Ledmyr
From Open Systems to Trusted Systems: New Approaches to Data Commons by Michael Kellen
How Data Commons Can Support Open Science by Robert L. Grossman
Reproducibility in Computational Analysis by Geraldine Van der Auwera