Author
Listed:
- R Sathyapriya
- Jose M Duarte
- Henning Stehr
- Ioannis Filippis
- Michael Lappe
Abstract
The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this “structural essence” has remained elusive so far: no algorithmic strategy has been devised to date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Cα RMSD). It is not only of theoretical interest (i.e., for design of advanced statistical potentials) to identify the number and nature of essential native contacts; such a subset of spatial constraints is also very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test-set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, a series of random subsets (ranging from 10% to 90% of native contacts) is generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed “cone-peeling”, that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 Å Cα RMSD with as few as 8% of the native contacts (Cα-Cα and Cβ-Cβ). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy. This “structural essence” opens new avenues in the fields of structure prediction, empirical potentials and docking.
Author Summary: A protein structure can be visualized as a network of non-covalent contacts existing between amino acids. But not all such contacts are important structural determinants of a protein. We have attempted to identify a subset of amino acid contacts that are essential for reconstructing protein structures. Initially, we sampled contacts at random and tested how well the sampled subsets represent the three-dimensional structure. Further, we also developed an algorithm that selects a subset of amino acid contacts from proteins based on sequence and network properties. The subsets picked by our algorithm represent protein three-dimensional structure better than random subsets, thereby offering direct evidence for the existence of a structural essence in protein structures. The identification of such structure-defining subsets finds application in experimental and computational protein structure determination.
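The random reference sets mentioned in the abstract (subsets covering 10% to 90% of the native contacts) can be illustrated with a small sketch. This is not the authors' pipeline: the function names `native_contacts` and `random_subset`, the 8 Å distance cutoff, and the toy coordinates are all assumptions made for illustration, and neither the distance-geometry reconstruction nor the cone-peeling selection is implemented here.

```python
import math
import random


def native_contacts(coords, cutoff=8.0, min_seq_sep=1):
    """Return residue index pairs (i, j), i < j, whose atoms lie within `cutoff` angstroms.

    `coords` holds one (x, y, z) tuple per residue, e.g. C-alpha positions.
    The 8.0 cutoff is an assumed value, not one stated in the abstract.
    """
    contacts = []
    for i in range(len(coords)):
        for j in range(i + min_seq_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.append((i, j))
    return contacts


def random_subset(contacts, fraction, seed=None):
    """Draw a random subset containing roughly `fraction` of the contacts."""
    rng = random.Random(seed)
    k = round(fraction * len(contacts))
    return sorted(rng.sample(contacts, k))


# Toy input: ten residues on a straight line, 3.8 angstroms apart (hypothetical data).
coords = [(3.8 * i, 0.0, 0.0) for i in range(10)]
contacts = native_contacts(coords)

# One random reference subset per sampling level, from 10% to 90%.
subsets = {pct: random_subset(contacts, pct / 100, seed=0) for pct in range(10, 100, 10)}
for pct, subset in subsets.items():
    print(pct, len(subset))
```

In a full evaluation, each subset would be passed to a reconstruction step and the resulting model compared with the native structure by Cα RMSD; those steps are outside the scope of this sketch.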
Suggested Citation
R Sathyapriya & Jose M Duarte & Henning Stehr & Ioannis Filippis & Michael Lappe, 2009.
"Defining an Essence of Structure Determining Residue Contacts in Proteins,"
PLOS Computational Biology, Public Library of Science, vol. 5(12), pages 1-10, December.
Handle:
RePEc:plo:pcbi00:1000584
DOI: 10.1371/journal.pcbi.1000584