At Sage Bionetworks, we believe that we can learn more by learning from each other. By improving the way scientists collaborate, we help to make science more effective. We partner with researchers, patients, and healthcare innovators to drive collaborative data-driven science to improve health. Making science more open, collaborative, and inclusive ultimately advances biomedicine.
Sage Bionetworks is currently recruiting a computational biologist with a background in applying machine learning techniques to real-world data (RWD). This position will support cancer data sharing projects by working with partners to test machine learning methods, such as natural language processing, that create structured clinical datasets from electronic health record (EHR) data. The position will work with external bioinformaticians, engineers, and other subject matter experts to implement these methods with multiple external data partners. The work is inherently collaborative; the position will work closely with scientists and engineers inside Sage and with external researchers.
What you’ll be doing:
- Develop the strategy for using machine learning methods to expedite the creation of structured clinical data from EHRs. Work with software engineers to design systems that can be deployed across heterogeneous sites. Monitor these systems and collect performance metrics.
- Collaborate with external partners to develop and apply methods for federated machine learning on partitioned clinical data, including secure sharing of trained models.
- Develop and execute quality assessment routines for phenotype data from translational, clinical trial, and RWD datasets contributed by external collaborators. Establish procedures for execution of quality routines in production.
- Summarize research findings in reports, manuscripts, and presentations to collaborators and funders. Generate preliminary data to support grant applications.
- Collaborate with in-house data curation experts to select and apply data standards to consortia datasets.
- Work with Sage’s software engineering team to define use cases and requirements for new features in Synapse.
We’d love to hear from you if you have:
- A PhD in computational biology, bioinformatics, statistics, mathematics, or a related quantitative discipline, OR a master's degree in one of the above areas and 7+ years of relevant work experience.
- A publication track record demonstrating work in machine learning applications on clinical data.
- Experience using natural language processing methods on clinical/EHR data, or other relevant data.
- Experience with real-world or trials data and clinical models such as OHDSI/OMOP and CDISC.
- Demonstrated ability to address a defined problem or hypothesis (biological or otherwise) creatively and with minimal supervision.
- Programming competence in R and/or Python.
- Collaboration, teamwork, presentation, and communication skills.
- Familiarity with cancer research.
- Experience analyzing high-dimensional biological assay data, such as gene expression, whole exome sequencing, or CyTOF.
- Software development skills, including experience with version control software (e.g., Git).
- Experience with cloud computing (especially AWS) and containerization approaches (principally Docker).
In light of recent concerns about COVID-19, all interviews will be conducted remotely, and most positions will be remote through at least June 30, 2021. The option to work on-site at our Seattle office before June 30 will be considered upon request.