NeoGx-III: Interpretable Machine Learning with Integrated Analysis of Maternal & Infant Electronic Medical Records for Unbiased Prediction of Need for Genome Sequencing in Level III NICUs
Full Description
ABSTRACT
Prolonged diagnostic delays, or “diagnostic odysseys,” in neonatal intensive care units (NICUs) represent a
significant burden for patients and families while posing challenges for clinicians, particularly when genome
sequencing (GS) is delayed or omitted. Up to 20% of critically ill neonates may have a genetic disease, yet many
diagnoses are made only after extended uncertainty, leading to worse outcomes, longer hospital stays, and higher
healthcare costs. These issues are especially pronounced in underserved populations, such as racial and ethnic
minorities, who face barriers to GS due to healthcare disparities, further compounding diagnostic delays and
worsening outcomes. Our long-term goal is to eliminate health disparities in genetic testing, ensuring that no child
with a genetic disease—regardless of racial, ethnic, or socioeconomic background—experiences a prolonged
diagnostic odyssey. The overall objective of this application is to develop a machine learning (ML)-based approach
that reduces health disparities by objectively identifying neonates from underserved populations who require
genomic testing, using documented clinical data to mitigate provider- and system-driven biases that often contribute
to unequal access to genetic services. Our central hypothesis is that the combined analysis of maternal and infant
health records will enable efficient identification of neonates in Level III NICUs likely to benefit from early GS,
facilitating faster and more targeted diagnosis of genetic diseases. To test this hypothesis, our specific aim is to develop
and evaluate an interpretable ML model that leverages both structured and unstructured data from neonatal and
maternal electronic health records (EHRs) to systematically identify neonates most likely to benefit from early-life
GS. The ML model will integrate data from clinical notes—encoded as Human Phenotype Ontology terms—and
structured data elements such as ICD codes (mapped to PheCodes), laboratory results, clinical characteristics (e.g.,
gestational age, birth weight), neonatal critical care management (e.g., intubation, medications), and relevant
maternal factors (e.g., maternal age, parity, prenatal care). Developed within a privacy-preserving environment, the
model will be designed to integrate seamlessly into existing clinical workflows and EHR systems to provide clinicians
with real-time decision support. By developing an ML model that integrates maternal and infant health data, this project
introduces an innovative, data-driven approach to identifying at-risk neonates while minimizing human bias. The
rationale is that early detection of genetic diseases, prompted by predictive analytics, will enable timely interventions,
reduce health disparities, and improve outcomes in all populations, not just those with ready access to Level IV
NICUs. This aligns with funding opportunity PAR-21-255 and helps the NHGRI advance its mission by addressing
critical gaps in neonatal genomic medicine and reducing diagnostic disparities. Our team’s unique expertise in
neonatal genomics, ML, and clinical decision support positions us to implement this transformative approach
successfully, ultimately improving health outcomes and reducing healthcare costs for vulnerable neonates.
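
As a rough illustration of the data integration described in the abstract, the sketch below combines HPO terms from clinical notes, PheCodes, and a few structured neonatal and maternal fields into one feature dictionary, then scores it with a transparent linear model so each feature's contribution can be inspected. This is not the project's method: the `NeonateRecord` fields, the function names, the example identifiers, and all weights are hypothetical placeholders with no clinical meaning.

```python
from dataclasses import dataclass, field


@dataclass
class NeonateRecord:
    """Hypothetical container for one infant's combined maternal and infant data."""
    hpo_terms: set[str] = field(default_factory=set)   # from clinical notes
    phecodes: set[str] = field(default_factory=set)    # mapped from ICD codes
    gestational_age_weeks: float = 40.0
    birth_weight_g: float = 3500.0
    intubated: bool = False
    maternal_age_years: float = 30.0


def build_features(rec: NeonateRecord) -> dict[str, float]:
    """Flatten a record into named features (one-hot for codes, raw for numerics)."""
    features: dict[str, float] = {}
    for term in sorted(rec.hpo_terms):
        features[f"hpo:{term}"] = 1.0
    for code in sorted(rec.phecodes):
        features[f"phecode:{code}"] = 1.0
    features["gestational_age_weeks"] = rec.gestational_age_weeks
    features["birth_weight_g"] = rec.birth_weight_g
    features["intubated"] = 1.0 if rec.intubated else 0.0
    features["maternal_age_years"] = rec.maternal_age_years
    return features


def explain_score(
    features: dict[str, float],
    weights: dict[str, float],
    bias: float = 0.0,
) -> tuple[float, dict[str, float]]:
    """Return a linear score plus the nonzero per-feature contributions behind it."""
    contributions = {
        name: weights.get(name, 0.0) * value for name, value in features.items()
    }
    contributions = {name: c for name, c in contributions.items() if c != 0.0}
    return bias + sum(contributions.values()), contributions


# Placeholder weights for illustration only; a real model would learn these.
example_weights = {"hpo:HP:0001250": 0.5, "phecode:345": 0.25, "intubated": 0.25}
example_record = NeonateRecord(
    hpo_terms={"HP:0001250"}, phecodes={"345"}, intubated=True
)
score, reasons = explain_score(build_features(example_record), example_weights)
```

A linear score is used here only because its per-feature contributions are directly readable; the abstract does not specify which interpretable model family the project will use.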
Grant Number: 1R21HD119885-01
NIH Institute/Center: NIH
Principal Investigator: Bimal Chaudhari