Computational Oncology Group

Data Engineer – Informatics Workflows

Sage Bionetworks, Seattle WA

At Sage Bionetworks, we believe that we can learn more by learning from each other. By improving the way scientists collaborate, we help to make science more effective. We partner with researchers, patients, and healthcare innovators to drive collaborative data-driven science to improve health. Making science more open, collaborative, and inclusive ultimately advances biomedicine. 

Do you have expertise in data integration and a passion for mission-driven work? Do you want to be an important contributor to a team that includes computational biologists, software engineers, and data curators? If so, you could be our next Data Engineer. 

As part of the Computational Oncology team, you’ll work with scientists and developers to manage genomics and informatics pipelines to support high-throughput data ingress and standardization for cancer and clinical research. These workflows will be used in cloud-based environments to process data contributed by large external research consortia. Results will be aggregated and shared through publications, portals, and interactive applications, to accelerate discovery in the field.

What you’ll be doing:

  • Implement software tools to enable automated ingestion of data and resources into central data repositories.
  • Build genomic, proteomic, metabolomic, and other “omic” analysis pipelines using cloud-based platforms.
  • Standardize and deploy pipelines as reproducible workflows for scientific compute on terabytes of data using cloud infrastructure. 
  • Develop scalable and secure solutions for distributed workflow execution.
  • Write documentation and provide training for researchers in the use of workflows.

We’d love to hear from you if: 

  • You have a PhD in Computer Science, Bioinformatics, Statistics, Computer Engineering, or a related computational field, or an MS and 3+ years of relevant job experience.
  • You’re enthusiastic about open science, collaboration, and reproducible research.
  • You have experience developing and deploying genomics pipelines in cloud-based environments (preferably AWS).
  • You have experience processing and managing large datasets, especially those generated by genomics and next-generation sequencing technologies.
  • You’re proficient in scripting and experienced with package development in R or Python.
  • You can work in Unix environments and with Unix-based scripting tools (sed, awk, grep, shell scripts, etc.).
  • You’re proficient in container technologies such as Docker.
  • You’re familiar with community standard or domain-specific workflow specifications such as Common Workflow Language (CWL), Workflow Definition Language (WDL), Nextflow, Snakemake, Galaxy, or Apache Airflow.
  • You have experience with collaborative development and version control systems (e.g. git).
  • You have experience with software development life cycles and familiarity with continuous integration (CI), continuous delivery (CD), and testing frameworks.

About Sage Bionetworks

Sage Bionetworks is a nonprofit biomedical research and technology development organization that was founded in Seattle in 2009. Our focus is to develop and apply open practices to data-driven research for the advancement of human health. Data-driven research has become an important component of biomedicine, but it’s not always easy to understand how to apply computational approaches appropriately or how to interpret their results. Sage believes open practices can help. Our interdisciplinary team of scientists and engineers works together to give researchers access to technology tools and scientific approaches for sharing data, benchmarking methods, and exploring collective insights, all backed by Sage’s gold-standard governance protocols and commitment to user-centered design. Sage is supported through a portfolio of competitive research grants, commercial partnerships, and philanthropic contributions.

Sage embraces diversity, equity and inclusion. We offer a comprehensive benefits package, including relocation benefits, to bring the right talent to the team. We are based in Seattle, WA, and collaborate broadly throughout the world.
