
Manifold representations and active learning for 21st century biology

Organization: MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Location: CAMBRIDGE, UNITED STATES
Posted: 1 Jun 2021
Deadline: 31 May 2026

NIH; US Federal; Research Grant; FY2025; Active Learning; Algorithm Design; Algorithmic Design; Algorithmic Engineering; Algorithms; Big Data; BigData; Biological; Biology; Biotech; Biotechnology; Body Tissues; CRISPR; CRISPR/Cas system; Cell Body; Cells; Cellular Expansion; Cellular Growth; Cellular biology; Clustered Regularly Interspaced Short Palindromic Repeats; Communities; Complex; Computational algorithm; Computer Software Tools; Computer software; Cooperative Learning; DNA mutation; Data; Data Set; Development; Dimensions; Disease; Disorder; Experiential Learning; Experimental Designs; Feedback; Functional RNA; Genes; Genetic; Genetic Change; Genetic defect; Genetic mutation; Genomics; Heterogeneity; High-Throughput Nucleotide Sequencing; High-Throughput Sequencing; Individual; Machine Learning; Measures; Memory; Modeling; Mutation; Noncoding RNA; Nontranslated RNA; Outcome; Pathologic; Performance; Phenotype; Property; Proteomics; Secure; Software; Software Tools; Spatial Design; State Interests; Systems Biology; Time; Tissues; Translating; Uncertainty; Untranslated RNA; Validation; Work; algorithm engineering; algorithmic composition; biologic; cell biology; cell growth; computational resources; computer algorithm; computing resources; design; designing; developmental; doubt; experiment; experimental research; experimental study; experiments; genome mutation; genome profiling; genomic data; genomic dataset; genomic profiling; high dimensionality; improved; insight; interest; machine based learning; machine learning based framework; machine learning framework; multi-modality; multimodality; multiomics; multiple omics; noncoding; novel; panomics; precision medicine; precision-based medicine; small molecule; software toolkit; spatial RNA sequencing; spatial gene expression analysis; spatial gene expression profiling; spatial resolved transcriptome sequencing; spatial transcriptome analysis; spatial transcriptome profiling; spatial transcriptome sequencing; spatial transcriptomics; spatially resolved transcriptomics; spatio transcriptomics; structural biology; transcriptomics; validations


Full Description

Project Summary
With the rise of high-throughput sequencing and multiplexed biotechnologies enabling single-cell multi-omics
and massively parallel CRISPR experiments, the biomedical community is generating a monumental amount of
data. These data promise to reveal new biology and drive personal and precision medicine. However, the sheer
volume of genomic data is overwhelming current computational resources, requiring prohibitively high compute
time, memory usage, and storage. My lab has been at the forefront of solving big data challenges in genomics,
designing novel algorithms that enable efficient and secure analyses that were previously computationally
infeasible, and that reveal novel structural, cellular, and systems biology. Drawing upon our expertise in
developing scalable and insightful algorithms for analyzing genomic, transcriptomic, and proteomic data, we aim
to tackle two key data-driven challenges facing the biological community: 1) efficient, accurate, and robust
characterization of tissues at the single-cell level, and 2) translating high-throughput datasets into biological
discoveries via machine learning-based prediction. To solve the first challenge, we will leverage our discovery
that seemingly high-dimensional sequencing data often lie on low-dimensional manifolds that capture the
underlying biological state of interest. We will design algorithms that generate these compact, meaningful
manifold representations of single-cell omics datasets. This will enable a number of key applications including
characterizing co-expression and gene-modules that define healthy and pathologic cell states; integrating
multi-modal single-cell omics datasets to more richly characterize cellular diversity; and investigating the
mechanisms underlying transcriptomic diversity across tissues and developmental states. To solve the second
challenge, we will take a two-pronged approach. First, we will design novel machine learning frameworks that
provide a measure of confidence when predicting in unfamiliar biological states, enabling prediction that is robust
to “out-of-distribution” (unobserved) examples. We will then work with our experimental collaborators and CROs
to rapidly perform experimental validation of model-based predictions. Finally, we will return the experimental
results to the model to further improve performance. This will enable an “active learning” feedback loop to
efficiently explore a complex biological space for outcomes of interest. We will use this uncertainty-powered
active learning approach to explore several pressing biological concerns such as the identification of small
molecule compounds with enzymatic or whole-cell growth inhibitory properties, efficient design of spatial-
transcriptomic experiments, computationally guided CRISPR perturbation experiments, and identification of
functional non-coding mutations. This project will 1) produce numerous software tools with wide utility that
efficiently analyze massive biological datasets and guide complex experimentation, and 2) reveal biological
insights, especially into biomolecular interactions and cellular heterogeneity.
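
The feedback loop described in the summary (predict with a measure of confidence, validate experimentally, return the results to the model) can be illustrated with a minimal sketch. This is not the lab's method: the bootstrap ensemble of nearest-neighbour predictors, the function names, and the toy `run_experiment` stand-in for a wet-lab assay are all assumptions chosen only to keep the example short and dependency-free.

```python
import random
import statistics

def run_experiment(x):
    """Hypothetical stand-in for an experimental assay (toy function)."""
    return (x - 0.3) ** 2

def nearest_neighbor_predict(train, x):
    """Predict with the label of the closest training point (1-NN)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def ensemble_predict(labeled, x, rng, n_models=20):
    """Return (mean, std) over 1-NN models fit on bootstrap resamples.

    The spread across ensemble members serves as the confidence measure:
    it is large where the labeled data disagree about the outcome.
    """
    preds = []
    for _ in range(n_models):
        resample = [rng.choice(labeled) for _ in labeled]
        preds.append(nearest_neighbor_predict(resample, x))
    return statistics.mean(preds), statistics.pstdev(preds)

def active_learning(pool, labeled, n_rounds, seed=0):
    """Repeatedly test the candidate the ensemble is least sure about."""
    rng = random.Random(seed)
    pool = list(pool)
    labeled = list(labeled)
    queried = []
    for _ in range(n_rounds):
        # Score every untested candidate by ensemble disagreement.
        scores = [ensemble_predict(labeled, x, rng)[1] for x in pool]
        best = max(range(len(pool)), key=scores.__getitem__)
        x = pool.pop(best)
        # "Run the experiment" and return the result to the model.
        labeled.append((x, run_experiment(x)))
        queried.append(x)
    return labeled, queried

# Toy usage: 21 candidate conditions, two of them measured up front.
pool = [i / 20 for i in range(21)]
seed_points = [0.0, 1.0]
initial = [(x, run_experiment(x)) for x in seed_points]
candidates = [x for x in pool if x not in seed_points]
labeled, queried = active_learning(candidates, initial, n_rounds=5)
```

Each round spends one "experiment" on the candidate with the highest predictive spread, so the labeled set grows by one point per round. A real system would replace the toy predictor and the uncertainty score with a model suited to the data.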

Grant Number: 5R35GM141861-05
NIH Institute/Center: NIH
Principal Investigator: BONNIE BERGER
