Sage Bionetworks is recruiting a Data Curator to enable our open science initiatives. Sage is a non-profit research organization based in Seattle, WA that seeks to develop predictors of disease and accelerate health research through the creation of open systems, incentives, and standards. We create strategies and platforms that empower researchers to share and interpret data on a colossal scale, crowdsource tests for new hypotheses, and contribute to knowledge through community challenges. Underlying all of these efforts is the need to create globally coherent biological data sets that are Findable, Accessible, Interoperable, and Reusable.
We are seeking someone to maintain existing approaches and develop new ones to support data sharing through multiple open neuroscience data portals. The ideal candidate understands the complexities of combining heterogeneous biological data sources, the difficulties of standardizing metadata, and the value these approaches provide in supporting reproducible research.
Specific responsibilities include:
- Work within a team of biologists, statisticians, and database administrators to compile and format large, high-dimensional data sets.
- Develop new processes for maintaining and growing datasets to meet the strategic needs of the research community.
- Use and develop new tools that build on existing APIs for data standardization.
- Train users and lead outreach efforts with new user groups.
- Streamline processes and implement new methods using scripting and statistical programming.
- Assist in the evaluation and implementation of data and phenotype curation tools.
- Create simple visualizations and dashboards of datasets to help users understand their content and value.
Qualifications:
- Master’s degree in a computational field (e.g. bioinformatics, computational biology, physics), or a Bachelor’s degree in the same with a minimum of 2 years of experience working with high-dimensional data repositories.
- Experience with scripting languages such as Python or statistical programming in R.
- Ability to work independently and in a team setting.
- Understanding of biological data structures and metadata.
- Excellent communication skills.
- Hands-on experience working with clinical data from neurodegeneration studies.
- Experience working with gene expression, genotype, CNV, or other sequence data.
- Proficiency in Linux and Linux-based scripting tools (sed, awk, shell scripts, etc.).
- Experience in building dashboards summarizing data.
- An interest in learning about high-throughput biological technologies.
About Sage Bionetworks:
Sage Bionetworks is a Seattle-based non-profit organization dedicated to advancing biomedical research through the implementation of reproducible, open science. Using cutting-edge machine-learning methodologies in collaboration with scientists around the world, we build robust computational models of disease-related phenotypes through integrative analysis of large-scale genomic, imaging, and mHealth data sets. To enhance collaborative efforts, we use a compute platform (www.synapse.org) for sharing research insights in a transparent, reproducible fashion.
Sage Bionetworks is an Equal Opportunity Employer. We offer competitive compensation and a comprehensive benefits package, including relocation benefits to bring the right talent to our team.