DOGSLED: Data, Ontologies, and Graphs Supporting Learning and Enhanced Discovery
Full Description
Abstract
The NCATS Biomedical Data Translator (“Translator”) aims to augment human reasoning and
accelerate scientific discovery through a federated system that integrates a broad range of
biomedical data and knowledge, and reasons over them to answer translational science
questions. During the Development phase (Phase II), the Translator program successfully
implemented a system capable of answering certain types of clinical and translational questions.
We propose advancements to make Translator an even more effective and compelling resource
that will attract a broad and deep community of biomedical researchers.
To achieve this transformation, we propose DOGSLED (Data, Ontologies, and Graphs to
Support Learning and Enhance Discovery). DOGSLED will build on the best elements of the
Phase II system—many of which were developed by members of our team—while improving
breadth, integration, efficiency, explainability, usability, and sustainability.
During Phase II of Translator, as members of the Ranking Agent, Exposures Provider, and
Standards and Reference Implementation (SRI) teams, we worked with the Translator
Consortium to build and integrate the ARAGORN Reasoning Agent, the ICEES Knowledge
Provider, the Node Normalizer, and the Biolink Model. Building on that work, the DOGSLED
team will collaborate with other proposed teams such as DOGSURF and ARAX-MGKG2, should
they be awarded funding, to advance Translator to the next level, catalyzing user uptake and
satisfaction.
Our planned improvements center on performance, functionality, and transparency. Aim 1
(Create a Performant, Scalable, Reproducible Translator) involves improving reliability and
performance by centralizing and unifying data ingest, data processing, and deployment in an
integrated infrastructure component called BioPack. In addition to improving the efficiency of the
system itself, this work will streamline and standardize the development process, reducing
demands on future developers and making Translator more sustainable and extensible. To
realize Aim 2 (Expand the Functionality of Translator), we will support new query types,
leverage underutilized Knowledge Providers (KPs), ingest or make better use of new and existing biomedical and
clinical knowledge sources, and improve reasoning approaches. We will leverage large
language models to enable users to add their own data in the form of publications and other
text-based information as well as to query Translator using natural language. To achieve Aim 3
(Make Translator Fully Transparent to Users), we will track provenance at every stage, from
initial data ingest all the way to ranked, evidence-supported answers to user queries. This will
feed into improvements in answer scoring and will enable the system to provide better
explanations to users. These advances will significantly expand the range of queries that users
will be able to ask of the system, build confidence in the answers, improve system performance,
and position Translator to keep pace with future developments in biomedical science.
In concert with a multi-pronged user engagement and outreach strategy inspired by other
successful consortia, the DOGSLED team will greatly expand Translator’s user base and help
the program move toward its vision of Translator as a transformative scientific discovery tool
used by a growing number of researchers.
Grant Number: 1OT2TR005712-01
NIH Institute/Center: NIH
Principal Investigator: Christopher Bizon