CAREER: Towards Trustworthy Analytics

Organization: University of Illinois at Chicago
Location: Chicago, United States
Posted: 1 Aug 2025
Deadline: 30 Apr 2027
NSF, US Federal, Research Grant, Science Foundation, IL

Full Description

Tools that create visualizations of data are increasingly important for discovery and decision-making in a range of domains, from science and engineering to commerce. Data analysts use these tools to rapidly slice and dice their data, often inspecting a large number of visualizations in the process. Though useful for exploration, these visualizations can also expose random data fluctuations, which could be mistaken for real patterns. If analysts are not careful in interpreting these apparent patterns, they could inadvertently make false discoveries or incorrect decisions. The goal of this research is to reduce the risk from spurious patterns arising in interactive data analyses. The project comprises three stages: (1) developing techniques for capturing analysts' beliefs, expectations, and intentions as they conduct visual analysis; (2) using this data to develop algorithms that forecast the reliability of emerging visualizations; and (3) evaluating strategies for communicating the risk of false patterns. The resulting techniques will be validated and incorporated in tools for detecting RNA modifications from noisy sequencing data, in collaboration with bioinformatics researchers. The expected impact of this project is to aid analysts in assessing the reliability of insights, while guarding against visualizations that seem convincing but are likely to be misleading. This in turn could broaden the adoption of visual analytics tools, increase confidence in conclusions, and potentially reduce the incidence of false discovery. As part of this research, the team will develop interactive educational materials for training students in reliable data-driven inference. These learning modules will be disseminated in a format that allows data science instructors to customize them for inclusion in existing curricula. Lastly, the project will provide opportunities for graduate research training and incorporate K-12 outreach activities that introduce young learners to data science.

The project comprises three main activities. (1) The project will prototype techniques to incrementally elicit analysts' beliefs and prior knowledge as they make sense of data. The elicited knowledge will then be used to distinguish among a gamut of intentions, from planned analyses with substantive hypotheses to purely exploratory actions with minimal expectations. (2) The project will next develop a model to predict the reliability of apparent patterns and insights unearthed at different points in the analysis cycle. To build this model, the research team will use a variety of features, including the specificity of analyst intents, the degree to which their expectations are borne out in the data, and their behavior and interactions with visualizations. The elicitation techniques and the insight reliability model will then be refined in a series of visual analysis studies and through crowdsourced experiments, in which participants' declared priors and discoveries are used to improve the accuracy of the model in forecasting spurious patterns. (3) Lastly, the project will identify and characterize strategies for communicating the risk of spurious insights to analysts in real time. In particular, the team will evaluate techniques for directly visualizing risk indicators, as well as indirect methods whereby the visual encodings of the data are adjusted according to the predicted risk. The developed interventions will be evaluated both in experiments and in a bioinformatics application, to assess whether they reduce the rate of false discovery. The expected results include new methods for eliciting analyst beliefs, techniques to forecast and communicate the trustworthiness of insights, and instructional materials for teaching robust data analytic practices. The products will be disseminated in publications and in the form of open-source software and learning modules.


This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Award Number: 2547020
Principal Investigator: Khairi Reda

Funds Obligated: $222,663

State: IL
