Natural language processing and machine learning for development of a Fontan Failure risk prediction model from electronic health records
Full Description
PROJECT SUMMARY/ABSTRACT
Fontan palliation for rare single ventricle heart defects is lifesaving but creates deranged cardiovascular
physiology with eventual premature multi-organ circulatory failure. Circulatory failure after Fontan palliation may
be related to a number of physiologic states that may change over time, are associated with variable prognoses,
and require development of physiology-specific treatment options. Fontan research is limited by heterogeneity
of native anatomy, post-Fontan anatomy, physiologic states, and small sample sizes due to rarity. Adverse
outcomes in the Fontan population begin in childhood, are common and diverse, often affecting multiple organ
systems. We have previously described four Fontan Failure physiologic phenotypes: (1) Systolic Heart
Failure, (2) Diastolic Heart Failure, (3) Hepatic and Pulmonary phenotype (normal cardiac output), and (4)
Lymphatic Abnormalities. Despite the broad range of complications, treatments for Fontan patients are generally
consensus based and may not address the underlying physiologic derangement. Heart transplantation can be
lifesaving for this population; however, heart transplantation creates a different disease state with its own related
late morbidity and mortality, and optimal timing is unknown. Using two electronic health records systems
(pediatric and adult) including free text notes, for a diverse population with Fontan anatomy across the age
spectrum, we propose to use natural language processing (NLP) and machine learning (ML) techniques to
improve detection of multi-organ comorbid conditions in this population to define anatomic and physiologic
phenotypes, and to develop an annualized risk score applicable across age, sex, race, and ethnicity. Our
proposed work builds on a rigorous pilot study in which we developed an NLP-based ML model for automatically
identifying Fontan patients from two hospital systems representing a racially diverse cohort across the lifespan.
Our pilot system performed significantly better than ICD code-based classification of Fontan
cases. In the proposed work, we will (i) advance the state of the art in biomedical NLP to improve the automatic
classification of Fontan phenotypes in the cohort so that it is closer to human-level performance; (ii) develop a
generalizable and interpretable pipeline so that NLP/ML outputs can be traced by domain experts from the final
decision back to the initial data point; and (iii) implement data-driven methods to develop a risk prediction model for
adverse outcomes in Fontan patients. Our innovative approach can facilitate the development of
physiology-based treatments and risk stratification for advanced therapies. Public, open-source release of the code
associated with our technological innovations will benefit the research community as a whole by accelerating rare
disease research at lower cost and with greater inclusivity.
Grant Number: 1R21HL181630-01
NIH Institute/Center: NIH
Principal Investigator: Wendy Book