SciDAP: Scientific Data Analysis Platform
Full Description
The recent proliferation of next-generation sequencing (NGS)-based methods for the analysis of expression,
chromatin and protein-DNA interactions has created tremendous opportunities for gaining insights into biology,
health, and disease. However, analysis of the data requires computational expertise that many biologists do not
possess. Hence, when dealing with genomics data, the majority of biologists require the help of bioinformaticians
even for simple tasks. This places these exciting methods beyond the reach of the majority of life scientists.
This Phase II proposal from DATIRIUM, LLC, a start-up from Cincinnati, OH, follows a Phase I project that
resulted in the development of a prototype (MVP) of SciDAP (Scientific Data Analysis Platform), a novel,
user-friendly multi-omics data analysis platform that allows biologists to analyze their data and enables
collaboration with bioinformaticians. The current Phase II proposal describes a plan to continue SciDAP development.
The key problem in creating user-friendly data analysis packages is the difficulty of adding new pipelines or
modifying existing ones: because the pipeline and the user interface are tightly coupled, any such change requires
modifications at every level of the software. Unfortunately, the same limitation exists for all user-friendly
bioinformatics tools. Given that there are more than 150 NGS-based methods and many ways to process the data,
this explains why a universal and user-friendly data analysis platform does not yet exist.
We hypothesized that we can create a data analysis platform that is both universal and user-friendly by
including interface instructions in the computational pipelines. The platform will use these instructions to
create a graphical interface. Specifically, we are using containerized pipelines written in the Common Workflow
Language (CWL), which makes our pipelines both portable and reproducible. On top of CWL, Datirium developed a
system of CWL extensions that makes it possible to describe inputs and output visualizations within the CWL
workflows. Importantly, our platform will increase the rigor of computational analysis by (i) making the analysis
reproducible and auditable by bioinformaticians through CWL pipeline portability and the recording of each step
of the analysis as Research Objects; (ii) enabling collaboration between experimentalists and computational
biologists by providing bioinformaticians with a way to direct analysis flow and biologists with the convenience
of a GUI; and (iii) including out-of-the-box pipelines with optimized parameters and actionable QC metrics that
flag possible issues.
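
The idea of deriving the interface from instructions carried by the pipeline itself can be illustrated with a
minimal Python sketch. It is not SciDAP's implementation and does not use the actual Datirium CWL extension
syntax: the `ui` key, the widget names, and the `build_form` helper are all hypothetical, chosen only to show
how a form could be generated from a workflow description without hand-written interface code.

```python
from dataclasses import dataclass

# Hypothetical workflow description. The "ui" hints are invented for this
# illustration and are NOT the real SciDAP/Datirium CWL extension fields.
WORKFLOW = {
    "inputs": {
        "fastq_file": {"type": "File",
                       "ui": {"label": "FASTQ file", "widget": "file_picker"}},
        "threads": {"type": "int", "default": 4,
                    "ui": {"label": "Threads", "widget": "number"}},
        "genome_index": {"type": "Directory"},  # no hint: fall back to defaults
    }
}

# Fallback widget for each input type when the pipeline gives no hint.
DEFAULT_WIDGETS = {"File": "file_picker", "Directory": "folder_picker",
                   "int": "number", "string": "text"}


@dataclass
class FormField:
    name: str
    label: str
    widget: str
    default: object = None


def build_form(workflow):
    """Turn the workflow's input declarations into a list of form fields."""
    fields = []
    for name, spec in workflow["inputs"].items():
        hint = spec.get("ui", {})
        fields.append(FormField(
            name=name,
            label=hint.get("label", name),
            widget=hint.get("widget", DEFAULT_WIDGETS.get(spec["type"], "text")),
            default=spec.get("default"),
        ))
    return fields


form = build_form(WORKFLOW)
for field in form:
    print(field)
```

In this sketch, adding a new pipeline only means supplying a new description; the form-building code stays
unchanged, which is the decoupling of pipeline and interface that the paragraph above describes.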
In the first aim of this proposal, we will develop a version of SciDAP for use on academic clusters and
commercial clouds. In the second aim, in collaboration with Dr. Salomonis at CCHMC, we will adopt pipelines
for miRNA, WGS/WXS and scMultiome data analysis. In the third aim, we will develop improvements to the SciDAP
interface that will increase SciDAP flexibility and usability for bioinformaticians and experimentalists.
Successful completion of this project will provide the research community with a cutting-edge, flexible and
biologist-friendly data analysis platform.
Grant Number: 5R42HG011219-03
NIH Institute/Center: NIH
Principal Investigator: Artem Barski