A cell-type specific atlas of TF-element connectivity across human tissues
Full Description
PROJECT SUMMARY
This proposal builds upon the Common Fund data resources GTEx and HuBMAP by
integrating the single-cell/single-nucleus ATAC-seq (sc/snATAC-seq) data across these
resources and deploying recently developed deep learning methods to establish a cell
type-specific atlas of TF-element predictions.
Across our tissues and cell types, in health and disease, our genomes selectively
activate and reorganize genes and cis-regulatory elements (CREs) to define diverse cell types.
To accomplish this, hundreds of transcription factors (TFs) organize to determine the activity of
millions of CREs, which in turn regulate the expression of ~25,000 genes. The vast majority of
GWAS variants associated with common diseases and traits lie in CREs; thus, a compelling
hypothesis is that these variants disrupt binding of regulatory proteins. A grand challenge in
biology is therefore to identify the precise genomic locations of these regulatory proteins across
all CREs in all cell types in an effort to understand the function of non-coding genetic variation.
The GTEx and HuBMAP Common Fund projects have generated a critical mass of
sn/scATAC-seq datasets (~177 to date) across many different human donors and tissues.
These data comprise an incredibly valuable resource of single cell data across human biology.
We recently developed PRINT, a deep learning model that uses ATAC-seq to more accurately
reveal multiscale footprints of regulatory proteins on DNA (Hu et al. bioRxiv). ATAC-seq
provides a measure of open chromatin. PRINT therefore enables the prediction of binding of
regulatory proteins, such as TFs, within regions of open chromatin.
We will reprocess and harmonize the ~177 GTEx and HuBMAP sn/scATAC datasets,
and supplement these data with 375 ENCODE sn/scATAC datasets from human cell types and
tissues. We will deploy PRINT to predict TF footprints in the GTEx, HuBMAP and ENCODE
single cell ATAC data. We will utilize existing innovative deep learning models to annotate these
footprints. Taken together, these analyses will enable us to characterize TF binding in human
tissues and cell types, and resolve changes in CRE activity and TF binding across different cell
types and differentiation trajectories in vivo. All data, code, and model predictions will be made
available via the CFDE portal. We expect that these annotations will underpin CFDE user efforts
to develop hypotheses regarding, and ultimately annotate, the function of genetic variants.
Grant Number: 1R03OD039985-01
NIH Institute/Center: NIH
Principal Investigator: Jason Buenrostro