
The scale-free nature of protein sequence space

Authors
  • Patrick C F Buchholz
  • Catharina Zeil
  • Jürgen Pleiss

Abstract

The sequence space of five protein superfamilies was investigated by constructing sequence networks. The nodes represent individual sequences, and two nodes are connected by an edge if the global sequence identity of two sequences exceeds a threshold. The networks were characterized by their degree distribution (number of nodes with a given number of neighbors) and by their fractal network dimension. Although the five protein families differed in sequence length, fold, and domain arrangement, their network properties were similar. The fractal network dimension Df was distance-dependent: a high dimension for single and double mutants (Df = 4.0), which dropped to Df = 0.7–1.0 at 90% sequence identity, and increased to Df = 3.5–4.5 below 70% sequence identity. The distance dependency of the network dimension is consistent with evolutionary constraints for functional proteins. While random single and double mutations often result in a functional protein, the accumulation of more than ten mutations is dominated by epistasis. The networks of the five protein families were highly inhomogeneous with few highly connected communities ("hub sequences") and a large number of smaller and less connected communities. The degree distributions followed a power-law distribution with similar scaling exponents close to 1. Because the hub sequences have a large number of functional neighbors, they are expected to be robust toward possible deleterious effects of mutations. Because of their robustness, hub sequences have the potential of high innovability, with additional mutations readily inducing new functions. Therefore, they form hotspots of evolution and are promising candidates as starting points for directed evolution experiments in biotechnology.
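The network construction described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the sequences are already aligned to equal length, so that identity is simply the fraction of matching positions; a real analysis would compute global sequence identity from pairwise alignments. The function names, the toy sequences, and the 0.85 threshold are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def sequence_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences.

    Assumption: both sequences have the same length (already aligned).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_network(seqs: dict[str, str], threshold: float) -> dict[str, set[str]]:
    """Connect two nodes if their sequence identity exceeds the threshold."""
    adjacency: dict[str, set[str]] = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if sequence_identity(seqs[u], seqs[v]) > threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def degree_distribution(adjacency: dict[str, set[str]]) -> Counter:
    """Map each degree k to the number of nodes with exactly k neighbors."""
    return Counter(len(neighbors) for neighbors in adjacency.values())


# Toy data (hypothetical): one "hub", three single mutants, one unrelated sequence.
toy_sequences = {
    "hub": "ACDEFGHIKL",
    "m1": "ACDEFGHIKV",
    "m2": "ACDEFGHIRL",
    "m3": "TCDEFGHIKL",
    "far": "MNPQRSTVWY",
}

toy_network = build_network(toy_sequences, threshold=0.85)
toy_degrees = degree_distribution(toy_network)
print(sorted(toy_degrees.items()))  # [(0, 1), (1, 3), (3, 1)]
```

In this toy network the single mutants are each 90% identical to the hub but only 80% identical to one another, so the hub collects three neighbors while every mutant has one. On real superfamily data, the same degree counts would be the input for checking whether the distribution follows a power law.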

Suggested Citation

  • Patrick C F Buchholz & Catharina Zeil & Jürgen Pleiss, 2018. "The scale-free nature of protein sequence space," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-14, August.
  • Handle: RePEc:plo:pone00:0200815
    DOI: 10.1371/journal.pone.0200815

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0200815
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0200815&type=printable
    Download Restriction: no

    Citations

    Cited by:

    1. Zhang, Mengya & Zhang, Gupeng & Liu, Yun & Zhai, Xiaorong & Han, Xinying, 2020. "Scientists’ genders and international academic collaboration: An empirical study of Chinese universities and research institutes," Journal of Informetrics, Elsevier, vol. 14(4).
    2. Marco Orlando & Patrick C F Buchholz & Marina Lotti & Jürgen Pleiss, 2021. "The GH19 Engineering Database: Sequence diversity, substrate scope, and evolution in glycoside hydrolase family 19," PLOS ONE, Public Library of Science, vol. 16(10), pages 1-30, October.
