At Sage Bionetworks, we believe that we can learn more by learning from each other. By improving the way scientists collaborate, we help to make science more effective. We partner with researchers, patients, and healthcare innovators to drive collaborative data-driven science to improve health. Making science more open, collaborative, and inclusive ultimately advances biomedicine.
This position will play a critical role in our scientific research communities by supporting biomedical data repositories through data curation and the development of data formats and metadata standards. We are looking for someone who understands the complexities of combining heterogeneous biological data sources, the difficulties of standardizing metadata, and the value these efforts provide in support of reproducible research.
What you’ll be doing:
- Work within a team of research scientists, data curators, and outreach and communication specialists to curate, distribute, and publicize high-dimensional biomedical data sets coming from multiple NIH- and foundation-supported programs.
- Develop or extend existing data models/schemas for projects in coordination with Project Leads.
- Document data ingest processes and curation SOPs. Write data release notes. Produce data reports.
- Contribute to and maintain data portal content.
- Develop and document metadata standards for sharing datasets under the FAIR guiding principles.
- Maintain research collections for scientific communities and implement new processes for data curation using scripting and statistical programming.
- Manage access to sensitive datasets in collaboration with the Sage governance team.
We’d love to hear from you if you:
- Have a master’s degree in a computational field, library science, or data management, OR a bachelor’s degree in one of the above areas with 3+ years of relevant work experience. A minimum of 2 years working with high-dimensional data repositories is preferred.
- Have experience working with common biomedical data types. Experience with gene expression and other omics data is preferred. Experience with biomedical imaging data and/or clinical data is considered a plus.
- Have an interest in learning about new high-throughput biological technologies.
- Are proficient in a scripting language such as Python or R.
- Have a basic understanding of SQL.
- Are highly organized and have great attention to detail.
- Are able to work individually and within a team.
- Are passionate about open science and collaboration.
In light of recent concerns about COVID-19, all interviews will be conducted remotely, and most positions will be remote through at least June 30, 2021. The option to work on-site at our Seattle office prior to June 30 will be considered upon request.