CRII: III: Compression-Aware Computing for Sustainable Machine Learning Model Inference
Full Description
This project aims to advance the efficiency of machine learning model inference by developing a compression-aware computing framework. As large machine learning models become increasingly essential across science and industry, their substantial resource demands, particularly for memory, computation, and energy, have limited their deployment to expensive, specialized hardware. Lossy compression techniques, such as sparsification and quantization, have emerged as solutions for running these models on consumer devices, such as laptops and smartphones. However, these methods often reduce predictive accuracy and require extensive tuning. This project proposes an approach to address this accuracy-efficiency trade-off by building machine learning models that are self-aware under lossy compression. By enabling models to detect and adapt to their own compression, this project will unlock pathways for cost-effective machine learning model inference.
Technically, the project consists of two core research objectives. The first focuses on developing machine learning models capable of self-awareness in response to lossy compression. This includes enabling models to determine whether they have been compressed, identify the type of compression used, and localize the affected components. The second objective leverages this self-awareness to recover and enhance model performance without retraining. Techniques include instruction-based recovery, sparse zeroth-order optimization that adjusts a small subset of model parameters, and a collaborative inference framework where multiple compressed models work together. The project will evaluate these methods on real-world tasks such as long-context language modeling and biomedical question answering. By addressing fundamental limitations in compressed machine learning model inference, this project will contribute practical tools for efficient machine learning model deployment.
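To make the idea of sparse zeroth-order optimization more concrete, the following is a minimal sketch and not the project's actual method. It assumes a toy linear model, a uniform rounding quantizer, a hand-picked subset of tunable parameters, and illustrative hyperparameters; all function and variable names are hypothetical. The sketch estimates a descent direction from loss evaluations alone (no backpropagation) and updates only the masked parameters.

```python
import random

def quantize(weights, step=0.5):
    """Toy uniform quantizer (an assumption): round to the nearest multiple of step."""
    return [round(w / step) * step for w in weights]

def loss(weights, data):
    """Mean squared error of a linear model y = w . x."""
    total = 0.0
    for x, y in data:
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(data)

def sparse_zo_step(weights, data, mask, rng, lr=0.02, eps=1e-3):
    """One two-point zeroth-order update restricted to the indices in mask."""
    # Random perturbation direction, nonzero only on the masked coordinates.
    z = [rng.gauss(0.0, 1.0) if i in mask else 0.0 for i in range(len(weights))]
    plus = [w + eps * zi for w, zi in zip(weights, z)]
    minus = [w - eps * zi for w, zi in zip(weights, z)]
    # Central finite difference: estimates the directional derivative along z
    # using two loss evaluations and no gradients.
    g = (loss(plus, data) - loss(minus, data)) / (2 * eps)
    return [w - lr * g * zi for w, zi in zip(weights, z)]

# Hypothetical setup: a small "uncompressed" model and noiseless synthetic data.
rng = random.Random(0)
true_weights = [1.3, -0.7, 0.2, 2.1]
data = []
for _ in range(64):
    x = [rng.gauss(0.0, 1.0) for _ in true_weights]
    y = sum(w * xi for w, xi in zip(true_weights, x))
    data.append((x, y))

# Lossy compression step, then recovery by tuning only two of four weights.
quantized = quantize(true_weights)
mask = {0, 3}  # hand-picked subset of tunable parameters (an assumption)
initial_loss = loss(quantized, data)

tuned = list(quantized)
for _ in range(500):
    tuned = sparse_zo_step(tuned, data, mask, rng)
final_loss = loss(tuned, data)

print("loss after quantization:", initial_loss)
print("loss after sparse zeroth-order tuning:", final_loss)
```

In this toy setting the unmasked weights stay at their quantized values, so only part of the compression error can be recovered. How to choose which parameters to adjust is left open here and is fixed by hand in the sketch.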
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Award Number: 2451398
Principal Investigator: Zhaozhuo Xu
Funds Obligated: $171,387
State: NJ