Authors listed:
- Swati Jain
- Jonathan D Jou
- Ivelin S Georgiev
- Bruce R Donald
Abstract
Protein design algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). To efficiently find the GMEC, protein design algorithms must methodically reduce the conformational search space. By applying distance and energy cutoffs, the protein system to be designed can thus be represented using a sparse residue interaction graph, where the number of interacting residue pairs is less than all pairs of mutable residues, and the corresponding GMEC is called the sparse GMEC. However, ignoring some pairwise residue interactions can lead to a change in the energy, conformation, or sequence of the sparse GMEC vs. the original or the full GMEC. Despite the widespread use of sparse residue interaction graphs in protein design, the above-mentioned effects of their use have not been previously analyzed. To analyze the costs and benefits of designing with sparse residue interaction graphs, we computed the GMECs for 136 different protein design problems both with and without distance and energy cutoffs, and compared their energies, conformations, and sequences. Our analysis shows that the differences between the GMECs depend critically on whether or not the design includes core, boundary, or surface residues. Moreover, neglecting long-range interactions can alter local interactions and introduce large sequence differences, both of which can result in significant structural and functional changes. Designs on proteins with experimentally measured thermostability show it is beneficial to compute both the full and the sparse GMEC accurately and efficiently. To this end, we show that a provable, ensemble-based algorithm can efficiently compute both GMECs by enumerating a small number of conformations, usually fewer than 1000. This provides a novel way to combine sparse residue interaction graphs with provable, ensemble-based algorithms to reap the benefits of sparse residue interaction graphs while avoiding their potential inaccuracies.
Author summary: Computational structure-based protein design algorithms have successfully redesigned proteins to fold and bind target substrates in vitro, and even in vivo. Because the complexity of a computational design increases dramatically with the number of mutable residues, many design algorithms employ cutoffs (distance or energy) to neglect some pairwise residue interactions, thereby reducing the effective search space and computational cost. However, the energies neglected by such cutoffs can add up, which may have nontrivial effects on the designed sequence and its function. To study the effects of using cutoffs on protein design, we computed the optimal sequence both with and without cutoffs, and showed that neglecting long-range interactions can significantly change the computed conformation and sequence. Designs on proteins with experimentally measured thermostability showed the benefits of computing the optimal sequences (and their conformations), both with and without cutoffs, efficiently and accurately. Therefore, we also showed that a provable, ensemble-based algorithm can efficiently compute the optimal conformation and sequence, both with and without applying cutoffs, by enumerating a small number of conformations, usually fewer than 1000. This provides a novel way to combine cutoffs with provable, ensemble-based algorithms to reap the computational efficiency of cutoffs while avoiding their potential inaccuracies.
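To make the idea of a sparse residue interaction graph concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. It assumes each residue has a single representative coordinate and each residue pair a single representative interaction energy, and that a pair is kept only if it passes both a distance cutoff and an energy cutoff. The function names, data layout, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def sparse_interaction_graph(coords, pair_energy, dist_cutoff, energy_cutoff):
    """Return the residue pairs that survive both cutoffs (assumed rule).

    coords:      dict mapping residue name -> (x, y, z) representative position
    pair_energy: dict mapping (res_i, res_j) -> representative pairwise energy
    """
    edges = set()
    for i, j in combinations(sorted(coords), 2):
        close = math.dist(coords[i], coords[j]) <= dist_cutoff
        strong = abs(pair_energy.get((i, j), 0.0)) >= energy_cutoff
        if close and strong:
            edges.add((i, j))
    return edges


def total_pairwise_energy(pair_energy, edges=None):
    """Sum pairwise energies over all pairs, or only over the given edges."""
    pairs = pair_energy.keys() if edges is None else edges
    return sum(pair_energy[p] for p in pairs)


# Hypothetical toy system with three mutable residues.
coords = {"A": (0.0, 0.0, 0.0), "B": (3.0, 0.0, 0.0), "C": (10.0, 0.0, 0.0)}
pair_energy = {("A", "B"): -2.0, ("A", "C"): -0.5, ("B", "C"): -0.1}

edges = sparse_interaction_graph(coords, pair_energy,
                                 dist_cutoff=8.0, energy_cutoff=0.2)
full = total_pairwise_energy(pair_energy)
sparse = total_pairwise_energy(pair_energy, edges)
neglected = full - sparse  # energy ignored by the sparse graph

print(edges, full, sparse, neglected)
```

In this toy case only the A-B pair is retained: A-C fails the distance cutoff and B-C fails the energy cutoff. The neglected terms are individually small but sum to a nonzero difference between the full and sparse energies, which is the kind of accumulated discrepancy the abstract describes.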
Suggested Citation
Swati Jain & Jonathan D Jou & Ivelin S Georgiev & Bruce R Donald, 2017.
"A critical analysis of computational protein design with sparse residue interaction graphs,"
PLOS Computational Biology, Public Library of Science, vol. 13(3), pages 1-30, March.
Handle:
RePEc:plo:pcbi00:1005346
DOI: 10.1371/journal.pcbi.1005346
References listed on IDEAS
- Pablo Gainza & Kyle E Roberts & Bruce R Donald, 2012.
"Protein Design Using Continuous Rotamers,"
PLOS Computational Biology, Public Library of Science, vol. 8(1), pages 1-15, January.
- Kyle E Roberts & Patrick R Cushing & Prisca Boisguerin & Dean R Madden & Bruce R Donald, 2012.
"Computational Design of a PDZ Domain Peptide Inhibitor that Rescues CFTR Activity,"
PLOS Computational Biology, Public Library of Science, vol. 8(4), pages 1-12, April.
- Loren L. Looger & Mary A. Dwyer & James J. Smith & Homme W. Hellinga, 2003.
"Computational design of receptor and sensor proteins with novel functions,"
Nature, Nature, vol. 423(6936), pages 185-190, May.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Anna U Lowegard & Marcel S Frenkel & Graham T Holt & Jonathan D Jou & Adegoke A Ojewole & Bruce R Donald, 2020.
"Novel, provable algorithms for efficient ensemble-based computational protein design and their application to the redesign of the c-Raf-RBD:KRas protein-protein interface,"
PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-27, June.
- Khaled Daqrouq & Rami Alhmouz & Ahmed Balamesh & Adnan Memic, 2015.
"Application of Wavelet Transform for PDZ Domain Classification,"
PLOS ONE, Public Library of Science, vol. 10(4), pages 1-16, April.
- Kyle E Roberts & Patrick R Cushing & Prisca Boisguerin & Dean R Madden & Bruce R Donald, 2012.
"Computational Design of a PDZ Domain Peptide Inhibitor that Rescues CFTR Activity,"
PLOS Computational Biology, Public Library of Science, vol. 8(4), pages 1-12, April.