Consensus and Covariance Proteins: Stability, Cooperativity, Function, & Design
Full Description
PROJECT SUMMARY/ABSTRACT
With the exponential increase in protein sequences, the statistical power of multiple sequence alignments
(MSAs) has been recognized as an important source of information for analysis and design of proteins. For
example, consensus design, where the most frequent residue is selected from each position of an MSA,
has been recognized as generating folded, functional, stabilized proteins. At the same time, covariance
among pairs of residues at different positions has been recognized as having powerful value in predicting
protein structures, and is a major component of the recent successes of deep-learning methods such as
AlphaFold. Despite the power of pairwise residue covariance, these statistics have seen limited use in
design of proteins. Moreover, it is not presently known which properties of proteins (for example, folding,
stability, binding, and catalysis) are affected by the forces that contribute to covariance.
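As an illustration of the consensus idea described above, the following minimal Python sketch picks the most frequent residue at each alignment column. The function name, the gap convention ("-"), and the toy alignment are assumptions made for this example; they are not taken from the project, and practical consensus design typically also involves sequence curation and weighting that are omitted here.

```python
from collections import Counter


def consensus_sequence(msa, gap="-"):
    """Return the most frequent non-gap residue at each column of an aligned MSA.

    Hypothetical helper for illustration only; `msa` is a list of
    equal-length aligned sequences.
    """
    if not msa:
        raise ValueError("empty alignment")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("sequences must be aligned to equal length")

    consensus = []
    for column in zip(*msa):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(res for res in column if res != gap)
        # An all-gap column has no residue to choose, so keep the gap.
        # Ties go to the residue seen first in the column.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


# Toy alignment (invented data, not from the project).
toy_msa = ["MKVLA", "MKILA", "MRVLG", "MKV-A"]
print(consensus_sequence(toy_msa))  # prints MKVLA
```

Because each column is treated independently, this procedure uses only single-site frequencies and ignores any covariance between positions.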
The proposed research will combine consensus design with covariance. Using well-behaved consensus
proteins we designed in the previous funding cycle, we will use two complementary methods to design
proteins with varying amounts of covariance and consensus information. The first uses a statistical
thermodynamic "Potts" formalism to determine coupling biases between residue pairs and separate them
from single-site biases. This separation allows us to adjust the amount of covariance information in our
designs. The second method uses singular value decomposition (SVD) to transform an MSA to a set of
coordinates that separate consensus from covariance. Within this space, sequences fall into well-defined
clusters that have shared conservation and covariance patterns. We will use the coordinate values of these
clusters to design sequences with specific patterns of covariance. Designed proteins will be produced in
the lab, and their stabilities, binding affinities, and enzyme activities will be determined. By projecting Potts
designs into SVD space, we will refine the Potts designs and gain insights into the specific pair correlations
that position each SVD cluster. We will also project extant sequences with known specificities into SVD
space to predict functional features of clusters, which will be tested experimentally.
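The SVD step described above can be sketched roughly as follows, assuming the MSA is one-hot encoded before decomposition. The encoding, the 21-letter alphabet, the function names, and the toy alignment are all assumptions for illustration; the abstract does not specify how the project's method encodes or weights sequences.

```python
import numpy as np

# Assumed alphabet: 20 standard amino acids plus a gap character.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot_msa(msa, alphabet=ALPHABET):
    """Encode an aligned MSA as a (sequences x positions*alphabet) 0/1 matrix."""
    index = {res: i for i, res in enumerate(alphabet)}
    n_seq, length = len(msa), len(msa[0])
    encoded = np.zeros((n_seq, length * len(alphabet)))
    for s, seq in enumerate(msa):
        for p, res in enumerate(seq):
            encoded[s, p * len(alphabet) + index[res]] = 1.0
    return encoded


def svd_coordinates(msa, n_components=2):
    """Project each sequence onto the leading singular vectors of the encoded MSA."""
    X = one_hot_msa(msa)
    # Thin SVD: X = U @ diag(S) @ Vt, singular values in descending order.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # Rows are per-sequence coordinates; columns are SVD components.
    return U[:, :n_components] * S[:n_components]


# Toy alignment with two invented subfamilies.
toy_msa = ["AAAA", "AAAA", "AACC", "AACC"]
coords = svd_coordinates(toy_msa)
print(coords.shape)  # prints (4, 2)
```

In this uncentered toy example, the first component has the same sign for every sequence, reflecting what the sequences share, while the second component has opposite signs for the two subfamilies. The overall sign of each component is arbitrary, so only relative positions are meaningful.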
To identify specific consensus and covariance sequence elements that contribute to stability and activity
patterns, we will make single- and multisite point substitutions that are found in our consensus, Potts, and
SVD designs. These will focus on the non-additivity of consensus stabilization, which was suggested by
results from the previous funding cycle and is likely to be related to covariance. These mutagenesis studies will
also better define the striking stability and activity differences we have seen in preliminary Potts designs.
Overall, the proposed research will better define the roles of covariance in the various properties of proteins,
and will lead to new tools for more precise protein design. Furthermore, we expect to better connect the SVD
method to taxonomy, and help establish it as a mainstream tool for molecular biology research.
Grant Number: 5R01GM068462-20
NIH Institute/Center: NIH
Principal Investigator: DOUGLAS BARRICK