At Sage Bionetworks, we believe that we can learn more by learning from each other. We develop and apply open practices to data-driven research for the advancement of human health. We are working to establish actionable biomedical observations through the reliable analysis and responsible sharing of representative data. By improving the way scientists collaborate and by increasing the reliability of research, we will improve human health.
This position will play a critical role in our scientific research communities by supporting biomedical data repositories through data curation and the development of data formats and metadata standards. We are looking for someone who understands the complexities of combining heterogeneous biological data sources, the difficulties of standardizing metadata, and the value these efforts provide in support of reproducible research.
What you’ll be doing:
- Work within a team of research scientists, data curators, and outreach and communication specialists to curate, distribute, and publicize high-dimensional biomedical data sets coming from multiple NIH-supported programs.
- Develop and document metadata standards for sharing datasets under the FAIR guiding principles.
- Develop or extend existing data models/schemas for projects in coordination with Project Leads.
- Document data ingest processes and curation SOPs. Write data release notes. Produce data reports.
- Maintain research collections for disease communities, such as the AD Knowledge Portal (adknowledgeportal.org).
- Streamline and implement new processes for data curation using scripting and statistical programming.
- Manage access to sensitive datasets in collaboration with the Sage governance team.
We’d love to hear from you if you:
- Have a master’s degree in a computational field, library science, or data management, OR a bachelor’s degree in one of the above areas with 3+ years of relevant work experience. A minimum of 2 years working with high-dimensional data repositories is preferred.
- Have experience working with common biomedical data types derived from disease model systems and patients. Experience with gene expression and other omics data is preferred.
- Have an interest in learning about new high-throughput biological technologies.
- Are proficient in a scripting language such as Python or R.
- Have a basic understanding of SQL.
- Are highly organized and have great attention to detail.
- Are able to work individually and within a team.
- Are passionate about open science and collaboration.
In light of recent concerns about COVID-19, all interviews will be conducted remotely, and most positions will be remote through at least June 30, 2021. The option to work on-site at our Seattle office prior to June 30 will be considered upon request.