Accurate Structural Correlations from Maximum Likelihood Superpositions

Author

Listed:
  • Douglas L Theobald
  • Deborah S Wuttke

Abstract

The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology.

Biological macromolecules comprise extensive networks of interconnected atoms. These complex coupled networks result in correlated structural dynamics, where atoms and residues move and evolve together as concerted conformational changes. The availability of a wealth of macromolecular structures necessitates the use of robust strategies for analyzing the correlated modes of motion found in molecular ensembles. Current strategies use a combination of least-squares superpositions and statistical analysis of the structural covariance matrix. However, the least-squares treatment implicitly requires that atoms are uncorrelated and that each atom has the same positional uncertainty, two assumptions that are violated in structural ensembles. For example, the atoms in a protein are connected by covalent and non-covalent chemical bonds, resulting in strong correlations. Furthermore, different atoms have different variances, because some atoms are known with less precision or have greater mobility. Using maximum likelihood (ML) analysis, we have developed a technique that is markedly more accurate than the classical least-squares approach by accounting for both correlations and heterogeneous variances. The improved ability to accurately analyze the major modes of dynamic structural correlations will benefit a diverse range of biological disciplines, including nuclear magnetic resonance (NMR) spectroscopy, crystallography, molecular dynamics, and molecular evolution.
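
The distinction between PCA of a covariance matrix and PCA of a correlation matrix can be illustrated with a minimal sketch. This is not the authors' maximum likelihood procedure: it assumes the structures are already superposed, uses an ordinary sample covariance as a stand-in for the ML estimate, and reduces each atom to a single coordinate. The toy ensemble, the function names, and all numerical settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def correlation_from_covariance(cov):
    """Normalize a covariance matrix so that every diagonal entry is 1."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

def principal_modes(matrix, n_modes=1):
    """Return the n_modes largest eigenvalues and their eigenvectors."""
    eigvals, eigvecs = np.linalg.eigh(matrix)    # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_modes]  # reorder to descending
    return eigvals[order], eigvecs[:, order]

# Hypothetical toy ensemble: 200 models, 5 atoms, one coordinate per atom.
# Atoms 0-2 move together (shared motion plus small noise).
# Atoms 3-4 are independent but have much larger variance.
rng = np.random.default_rng(0)
n_models = 200
shared = rng.normal(size=(n_models, 1))
coords = np.empty((n_models, 5))
coords[:, :3] = shared + 0.1 * rng.normal(size=(n_models, 3))
coords[:, 3:] = 5.0 * rng.normal(size=(n_models, 2))

# Sample covariance (assumption: a stand-in for the ML estimate).
cov = np.cov(coords, rowvar=False)
corr = correlation_from_covariance(cov)

cov_vals, cov_modes = principal_modes(cov)
corr_vals, corr_modes = principal_modes(corr)

print("covariance PCA, mode 1 loadings: ", np.round(np.abs(cov_modes[:, 0]), 2))
print("correlation PCA, mode 1 loadings:", np.round(np.abs(corr_modes[:, 0]), 2))
```

In this toy case the first mode of the covariance matrix is dominated by the high-variance atoms 3 and 4, while the first mode of the correlation matrix picks out the concerted motion of atoms 0 to 2. That is the sense in which correlation-based modes are insensitive to heterogeneous variances.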

Suggested Citation

  • Douglas L Theobald & Deborah S Wuttke, 2008. "Accurate Structural Correlations from Maximum Likelihood Superpositions," PLOS Computational Biology, Public Library of Science, vol. 4(2), pages 1-8, February.
  • Handle: RePEc:plo:pcbi00:0040043
    DOI: 10.1371/journal.pcbi.0040043

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0040043
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.0040043&type=printable
    Download Restriction: no



    Citations

    Cited by:

    1. Neema L Salimi & Bosco Ho & David A Agard, 2010. "Unfolding Simulations Reveal the Mechanism of Extreme Unfolding Cooperativity in the Kinetically Stable α-Lytic Protease," PLOS Computational Biology, Public Library of Science, vol. 6(2), pages 1-14, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Farag Shuweihdi & Charles C. Taylor & Arief Gusnanto, 2017. "Classification of form under heterogeneity and non-isotropic errors," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(8), pages 1495-1508, June.
    2. Alshabani, A.K.S. & Dryden, I.L. & Litton, C.D., 2007. "Partial size-and-shape distributions," Journal of Multivariate Analysis, Elsevier, vol. 98(10), pages 1988-2001, November.
    3. Timothy R Lezon & Ivet Bahar, 2010. "Using Entropy Maximization to Understand the Determinants of Structural Dynamics beyond Native Contact Topology," PLOS Computational Biology, Public Library of Science, vol. 6(6), pages 1-12, June.
    4. Gregory D Friedland & Nils-Alexander Lakomek & Christian Griesinger & Jens Meiler & Tanja Kortemme, 2009. "A Correspondence Between Solution-State Dynamics of an Individual Protein and the Sequence and Conformational Diversity of its Family," PLOS Computational Biology, Public Library of Science, vol. 5(5), pages 1-16, May.
    5. Fabian J.E. Telschow & Michael R. Pierrynowski & Stephan F. Huckemann, 2021. "Functional inference on rotational curves under sample‐specific group actions and identification of human gait," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 48(4), pages 1256-1276, December.
    6. Matteo Tiberti & Elena Papaleo & Tone Bengtsen & Wouter Boomsma & Kresten Lindorff-Larsen, 2015. "ENCORE: Software for Quantitative Ensemble Comparison," PLOS Computational Biology, Public Library of Science, vol. 11(10), pages 1-16, October.
    7. Anders S Christensen & Troels E Linnet & Mikael Borg & Wouter Boomsma & Kresten Lindorff-Larsen & Thomas Hamelryck & Jan H Jensen, 2013. "Protein Structure Validation and Refinement Using Amide Proton Chemical Shifts Derived from Quantum Mechanics," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-10, December.
    8. F. Emil Thomasen & Tórur Skaalum & Ashutosh Kumar & Sriraksha Srinivasan & Stefano Vanni & Kresten Lindorff-Larsen, 2024. "Rescaling protein-protein interactions improves Martini 3 for flexible proteins in solution," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    9. Nimmi Das Anthuparambil & Anita Girelli & Sonja Timmermann & Marvin Kowalski & Mohammad Sayed Akhundzadeh & Sebastian Retzbach & Maximilian D. Senft & Michelle Dargasz & Dennis Gutmüller & Anusha Hire, 2023. "Exploring non-equilibrium processes and spatio-temporal scaling laws in heated egg yolk using coherent X-rays," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    10. Dong Long & Rafael Brüschweiler, 2011. "In Silico Elucidation of the Recognition Dynamics of Ubiquitin," PLOS Computational Biology, Public Library of Science, vol. 7(4), pages 1-9, April.
    11. Kresten Lindorff-Larsen & Jesper Ferkinghoff-Borg, 2009. "Similarity Measures for Protein Ensembles," PLOS ONE, Public Library of Science, vol. 4(1), pages 1-13, January.
    12. Kai Wang & Shiyang Long & Pu Tian, 2015. "Hierarchical Conformational Analysis of Native Lysozyme Based on Sub-Millisecond Molecular Dynamics Simulations," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-17, June.
    13. Wouter Boomsma & Jesper Ferkinghoff-Borg & Kresten Lindorff-Larsen, 2014. "Combining Experiments and Simulations Using the Maximum Entropy Principle," PLOS Computational Biology, Public Library of Science, vol. 10(2), pages 1-9, February.
