Position Overview: Sage Bionetworks is seeking a Data Engineer to join the systems biology team. At Sage, we develop and test social, technical, and scientific solutions to foster open practices and enable collaborative research. The systems biology group focuses on data-rich research collaborations in which teams of scientists generate and analyze billions of measurements on thousands of research samples. The broader goal is to advance scientific research by establishing community consensus through data and knowledge sharing. The Data Engineer's primary focus will be to build tools that organize data streams from multiple external teams studying different aspects of Alzheimer's disease, along with systems that make these data streams easy to analyze using workflow and containerization technologies. The Data Engineer will be a technical lead who proactively identifies and outlines technical needs across multiple projects and drives the development of solutions by defining requirements and overseeing multiple project roadmaps.
Specific Responsibilities include:
- Manage the distribution of knowledge and data by defining and overseeing the development of data and visualization web portals.
- Work with Sage scientists and engineers to develop common strategies and processes for the hosting, curation, and annotation of large clinical and genomic data sets.
- Work with the Synapse engineering team to lead development of analysis workflows for scientific computing on terabytes of genomic data using cloud infrastructure.
- Design and develop programmatic solutions to ingest data, manage metadata, and improve data discoverability.
- Build on our existing tools and APIs to help scientists perform computational research.
Qualifications:
- MS or PhD degree in a computational field.
- 2+ years of experience with data processing.
- 1+ years of experience with cluster or cloud computing.
- High-level knowledge of R, Python, or MATLAB is a must.
- Comfortable using Linux.
- Experience querying and using databases is a plus.
- Experience with bioinformatics techniques, specifically genomic data generation and analysis, preferred.
- Effective and efficient communication skills for diverse audiences.
- Experience with workflow technologies is a plus.
Sage offers competitive compensation and a comprehensive benefits package.
To apply, please submit a CV and cover letter.
About Sage Bionetworks: Sage Bionetworks is a non-profit organization dedicated to advancing biomedical research through the implementation of reproducible, open science. In collaboration with scientists around the world, we build robust computational models of disease-related phenotypes through integrative analysis of large-scale genomic, imaging, and mHealth data sets. To enhance collaborative efforts, we leverage a compute platform (www.synapse.org) for sharing research insights in a transparent, reproducible fashion.