Postdoc: Biomedical Data Science, Genomic Medicine, Machine Learning [Pejaver Lab]

Multiple postdoctoral positions are available in the Pejaver Lab. We are an interdisciplinary group with diverse but interlinked research interests related to biomedical data science and machine learning methodology development. These positions are funded through flexible but stable mechanisms and, thus, can potentially be extended in the longer term. Depending on the candidate’s interests, there will be opportunities to work on internally funded, curiosity-driven projects as well as structured, grant-funded projects.

The positions will be situated at the Institute for Genomic Health (IGH) at the Icahn School of Medicine at Mount Sinai in Manhattan, New York, NY, USA. IGH aims to close the gap between genomic research and healthcare through discoveries and algorithms focused on diverse populations. Data resources include (a) access to >8 million patient records in the Mount Sinai Data Warehouse and (b) the BioMe Biobank Program, with >30,000 patients with whole exome sequencing data linked to longitudinal clinical data. Mount Sinai also houses Minerva, a world-class high-performance computing resource, with specifications that include over 2 petaflops of compute power and nearly 90 GPU cores, among others.

We are broadly seeking postdocs with research interests and complementary expertise related to our three focus areas:

1. Variant and genome interpretation

We have a longstanding interest in predicting the functional and phenotypic impact of genetic variants and in developing disease risk scores using genomic data. A key aspect of our work is the integration and predictive modeling of protein function, allowing for mechanistic “explanations” of how a variant may lead to disease. We have ongoing and planned collaborations with the ClinGen Consortium, to standardize the use of predictive methods such as ours in clinical variant interpretation, and with the Impact of Genomic Variation on Function (IGVF) Consortium, to expand this work to non-coding variation.

Candidate requirements include:

  • PhD in Bioinformatics, Computational Biology, Genomics, Molecular Biology, or a related discipline
  • Expertise in functional annotation of human genomes and proficiency in extracting information from biological knowledgebases such as UniProt and Ensembl
  • Experience developing efficient pipelines to process large genomic data sets in high-performance computing environments
  • Working knowledge of statistical testing and data exploration techniques

Desired qualifications include:

  • Experience applying machine learning methods to genomics and molecular data sets
  • Experience with functional genomics consortia data resources such as GTEx and ENCODE
  • Familiarity with large genomics data resources such as the 1000 Genomes Project, UK Biobank, ClinVar and gnomAD, among others
  • Working knowledge of modern web service and/or software implementation

2. Deep phenotyping using electronic health records (EHRs)

The Pejaver Lab is interested in the extraction of genetic disease-related information from EHRs and its integration with genomic, molecular and other clinical data sets to build better cohorts for variant discovery and aid in the timely diagnosis of patients in the clinic. We are currently funded to develop data science methods to identify patients with rare genetic diseases from their health records, particularly exploiting patterns in clinical notes and their odyssey through the health care system. We also collaborate with other labs at IGH to develop phenotypic risk scores using information from the Mount Sinai BioMe biobank and EHRs.

Candidate requirements include:

  • PhD or MD-PhD in Biomedical Informatics, Biomedical Data Science, Clinical Research Informatics, or a related discipline
  • Expertise in integrating genomic and health record data sets and proficiency in common data models such as OMOP and i2b2
  • Working knowledge of statistical testing and data exploration techniques
  • Experience working in high-performance computing environments

Desired qualifications include:

  • Familiarity with standardized vocabularies and ontologies such as UMLS and HPO
  • Working knowledge of natural language processing techniques, including transformers and foundation models
  • Working knowledge of modern web service and/or software implementation
  • Familiarity with HIPAA and data governance in large healthcare systems

3. Applied machine learning for biomedical data sets

The applications for which we develop methods naturally require innovative problem formulations, novel learning algorithms and customized objective functions. We also particularly emphasize end-to-end implementation of our methods, with the aim of increased adoption and improved decision support in research and clinical settings. Topics of interest include similarity learning, interpretable machine learning, multi-task learning, and structured output learning. We also collaborate with the ClinGen Consortium and the Critical Assessment of Genome Interpretation (CAGI) on developing novel metrics to evaluate computational methods on real-world biomedical data sets.

Candidate requirements include:

  • PhD in Computer Science, Informatics, Data Science, or a related discipline
  • Expertise in machine learning methodology development and proficiency in MLOps
  • Demonstrable interest in applying and evaluating computational methods for biomedical data sets, particularly genomics, longitudinal and/or text data
  • Experience with data pre-processing and cleaning techniques

Desired qualifications include:

  • Experience with deep learning frameworks such as TensorFlow and PyTorch
  • Proficiency in natural language processing techniques, including transformers and foundation models
  • Working knowledge of probabilistic modeling, statistical testing and/or data exploration techniques
  • Basic understanding of concepts in molecular biology or biomedical informatics

Instructions for all positions

Candidates must have strong communication skills, a commitment to methodological rigor, and the ability to work creatively and collaboratively. Please send inquiries by email, with “Postdoc position” in the subject line, along with the following materials:

  • A complete CV
  • A cover letter describing which focus area you would be most suited for and how your training and expertise relate to our research interests
  • Contact information for at least 2 references

Information on the Postdoctoral Training Program at Mount Sinai

Incoming postdoctoral fellows at the Icahn School of Medicine at Mount Sinai are eligible for affordable Mount Sinai Housing, within walking distance of the medical school and a wide range of amenities, and for visa sponsorship on a case-by-case basis.

About our organization

The Icahn School of Medicine at Mount Sinai is internationally recognized as a leader in groundbreaking clinical and basic science research and is known for its innovative approach to medical education. With a faculty of more than 3,400 in 38 clinical and basic science departments and centers, Mount Sinai ranks among the top 20 medical schools in receipt of National Institutes of Health grants. In its 2015 “America’s Best Graduate Schools” issue, U.S. News & World Report ranks the Icahn School of Medicine 14th out of 130 medical schools nationwide. Mount Sinai Medical Center is an equal opportunity/affirmative action employer. We recognize the power and importance of a diverse employee population and strongly encourage applicants with various experiences and backgrounds. Mount Sinai Medical Center–An EEO/AA-D/V Employer.