IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1008841.html

Searching for fat tails in CRISPR-Cas systems: Data analysis and mathematical modeling

Author

Listed:
  • Yekaterina S Pavlova
  • David Paez-Espino
  • Andrew Yu Morozov
  • Ilya S Belalov

Abstract

Understanding CRISPR-Cas systems, the adaptive defence mechanism that about half of bacterial species and most archaea use to neutralise viral attacks, is important for explaining the biodiversity observed in the microbial world as well as for editing animal and plant genomes effectively. The CRISPR-Cas system learns from previous viral infections and integrates small pieces of phage genomes, called spacers, into the microbial genome. The resulting library of spacers collected in CRISPR arrays is then compared with the DNA of potential invaders. One of the most intriguing and least well understood questions about CRISPR-Cas systems is the distribution of spacers across the microbial population. Here, using empirical data, we show that the global distribution of spacer numbers in CRISPR arrays across multiple biomes worldwide typically exhibits scale-invariant power law behaviour, and the standard deviation is greater than the sample mean. We develop a mathematical model of spacer loss and acquisition dynamics which fits observed data from almost four thousand metagenomes well. In analogy to the classical ‘rich-get-richer’ mechanism of power law emergence, the rate of spacer acquisition is proportional to the CRISPR array size, which allows a small proportion of CRISPRs within the population to possess a significant number of spacers. Our study provides an alternative explanation for the rarity of all-resistant super microbes in nature and for why proliferation of phages can be highly successful despite the effectiveness of CRISPR-Cas systems.

Author summary

About half of bacterial species and most archaea are equipped with CRISPR-Cas systems of adaptive immunity to protect them from their natural enemies, bacteriophages. The memory of CRISPR-Cas contains a catalogue of the fingerprints of previously experienced offenders, which is passed down to the bacterial progeny. Microbial resistance to viruses largely depends on the number of records in this CRISPR array. Our analysis combining metagenomics data and mathematical modelling shows that the size of CRISPR arrays in microbial populations generally follows a power law distribution. Power law distributions have been found in many other complex systems (earthquakes, financial markets, animal movement). We argue that our model explains the presence of a power law in CRISPR arrays and the rarity of all-resistant super microbes.
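
The ‘rich-get-richer’ idea in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: it omits spacer loss entirely and instead assumes that each array starts with one spacer, acquires spacers at a rate proportional to its current size, and is sampled at an exponentially distributed age. The function names and the parameters `acq_rate`, `turnover_rate`, and `x_min` are hypothetical, chosen only for illustration. The tail exponent is estimated with the standard approximate maximum-likelihood formula for discrete power laws.

```python
import math
import random


def sample_array_size(acq_rate, turnover_rate, rng):
    """Draw the spacer count of one hypothetical CRISPR array.

    Assumption (not from the paper): the array starts with one spacer,
    gains spacers at total rate acq_rate * size, and is observed at an
    exponentially distributed age with rate turnover_rate.
    """
    age = rng.expovariate(turnover_rate)
    # For a linear pure-birth process started at size 1, the size at a
    # fixed age is geometric on {1, 2, ...} with success probability
    # exp(-acq_rate * age). The floor at 1e-300 guards against underflow.
    p = max(math.exp(-acq_rate * age), 1e-300)
    if p >= 1.0:  # age so small that no acquisition has happened
        return 1
    u = 1.0 - rng.random()  # uniform on (0, 1]
    # Inverse-CDF sampling of the geometric distribution.
    return 1 + int(math.log(u) / math.log1p(-p))


def mean_and_std(sizes):
    """Return the sample mean and (population) standard deviation."""
    n = len(sizes)
    mean = sum(sizes) / n
    var = sum((x - mean) ** 2 for x in sizes) / n
    return mean, math.sqrt(var)


def tail_exponent(sizes, x_min):
    """Approximate MLE of the power-law exponent for sizes >= x_min.

    Uses alpha = 1 + n / sum(ln(x / (x_min - 0.5))), the usual
    approximation for integer-valued data.
    """
    tail = [x for x in sizes if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    return 1.0 + len(tail) / sum(math.log(x / (x_min - 0.5)) for x in tail)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    sizes = [sample_array_size(1.0, 1.5, rng) for _ in range(50_000)]
    mean, std = mean_and_std(sizes)
    print(f"mean={mean:.2f} std={std:.2f} max={max(sizes)}")
    print(f"tail exponent estimate: {tail_exponent(sizes, x_min=10):.2f}")
```

In this toy process the array sizes follow a Yule-Simon distribution whose tail decays as a power law with exponent 1 + `turnover_rate` / `acq_rate` (2.5 for the values above), so the variance is infinite and the sample standard deviation typically exceeds the sample mean, which mirrors the qualitative pattern the abstract reports. Estimates at a finite `x_min` tend to fall somewhat below the asymptotic exponent.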

Suggested Citation

  • Yekaterina S Pavlova & David Paez-Espino & Andrew Yu Morozov & Ilya S Belalov, 2021. "Searching for fat tails in CRISPR-Cas systems: Data analysis and mathematical modeling," PLOS Computational Biology, Public Library of Science, vol. 17(3), pages 1-21, March.
  • Handle: RePEc:plo:pcbi00:1008841
    DOI: 10.1371/journal.pcbi.1008841

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008841
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1008841&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1008841?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Luciano A. Marraffini, 2015. "CRISPR-Cas immunity in prokaryotes," Nature, Nature, vol. 526(7571), pages 55-61, October.
    2. Jeff Alstott & Ed Bullmore & Dietmar Plenz, 2014. "powerlaw: A Python Package for Analysis of Heavy-Tailed Distributions," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-11, January.
    3. Beare, Brendan K & Toda, Alexis Akira, 2020. "On the emergence of a power law in the distribution of COVID-19 cases," University of California at San Diego, Economics Working Paper Series qt9k5027d0, Department of Economics, UC San Diego.
    4. Kun Zhao & Boo Shan Tseng & Bernard Beckerman & Fan Jin & Maxsim L. Gibiansky & Joe J. Harrison & Erik Luijten & Matthew R. Parsek & Gerard C. L. Wong, 2013. "Psl trails guide exploration and microcolony formation in Pseudomonas aeruginosa biofilms," Nature, Nature, vol. 497(7449), pages 388-391, May.
    5. Vuong, Quang H, 1989. "Likelihood Ratio Tests for Model Selection and Non-nested Hypotheses," Econometrica, Econometric Society, vol. 57(2), pages 307-333, March.
    6. Mazhar Adli, 2018. "The CRISPR tool kit for genome editing and beyond," Nature Communications, Nature, vol. 9(1), pages 1-13, December.
    7. David Paez-Espino & Wesley Morovic & Christine L. Sun & Brian C. Thomas & Ken-ichi Ueda & Buffy Stahl & Rodolphe Barrangou & Jillian F. Banfield, 2013. "Strong bias in the bacterial CRISPR elements that confer immunity to phage," Nature Communications, Nature, vol. 4(1), pages 1-7, June.
    8. Alexander P. Hynes & Manuela Villion & Sylvain Moineau, 2014. "Adaptation in bacterial CRISPR-Cas immunity can be driven by defective phages," Nature Communications, Nature, vol. 5(1), pages 1-6, December.
    9. Xavier Gabaix & Parameswaran Gopikrishnan & Vasiliki Plerou & H. Eugene Stanley, 2003. "A theory of power-law distributions in financial market fluctuations," Nature, Nature, vol. 423(6937), pages 267-270, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jovanovic, Franck & Schinckus, Christophe, 2016. "Breaking down the barriers between econophysics and financial economics," International Review of Financial Analysis, Elsevier, vol. 47(C), pages 256-266.
    2. Katahira, Kei & Chen, Yu & Akiyama, Eizo, 2021. "Self-organized Speculation Game for the spontaneous emergence of financial stylized facts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 582(C).
    3. He, Fang & Chen, Xi, 2016. "Credit networks and systemic risk of Chinese local financing platforms: Too central or too big to fail?," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 461(C), pages 158-170.
    4. Jovanovic, Franck & Schinckus, Christophe, 2017. "Econophysics and Financial Economics: An Emerging Dialogue," OUP Catalogue, Oxford University Press, number 9780190205034.
    5. Damien Challet & Nikita Gourianov, 2018. "Dynamical regularities of US equities opening and closing auctions," Post-Print hal-01702726, HAL.
    6. Montebruno, Piero & Bennett, Robert J. & van Lieshout, Carry & Smith, Harry, 2019. "A tale of two tails: Do Power Law and Lognormal models fit firm-size distributions in the mid-Victorian era?," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 858-875.
    7. Christophe Schinckus & Çınla Akdere, 2015. "Towards a New Way of Teaching Statistics in Economics: The Case for Econophysics," Ekonomi-tek - International Economics Journal, Turkish Economic Association, vol. 4(3), pages 89-108, September.
    8. Arthur A. B. Pessa & Matjaz Perc & Haroldo V. Ribeiro, 2023. "Age and market capitalization drive large price variations of cryptocurrencies," Papers 2302.12319, arXiv.org.
    9. Chołoniewski, Jan & Sienkiewicz, Julian & Leban, Gregor & Hołyst, Janusz A., 2019. "Modeling of temporal fluctuation scaling in online news network with independent cascade model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 129-144.
    10. Rashidisabet, Homa & Ajilore, Olusola & Leow, Alex & Demos, Alexander P., 2022. "Revisiting power-law estimation with applications to real-world human typing dynamics," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 599(C).
    11. Katahira, Kei & Chen, Yu & Hashimoto, Gaku & Okuda, Hiroshi, 2019. "Development of an agent-based speculation game for higher reproducibility of financial stylized facts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 524(C), pages 503-518.
    12. Banshal, Sumit Kumar & Gupta, Solanki & Lathabai, Hiran H & Singh, Vivek Kumar, 2022. "Power Laws in altmetrics: An empirical analysis," Journal of Informetrics, Elsevier, vol. 16(3).
    13. Shim, Jaehu, 2016. "Toward a more nuanced understanding of long-tail distributions and their generative process in entrepreneurship," Journal of Business Venturing Insights, Elsevier, vol. 6(C), pages 21-27.
    14. Radu T. Pruna & Maria Polukarov & Nicholas R. Jennings, 2016. "A new structural stochastic volatility model of asset pricing and its stylized facts," Papers 1604.08824, arXiv.org.
    15. Fabrice Gilles & Sabina Issehnane & Florent Sari, 2022. "Using short-term jobs as a way to find a regular job. What kind of role for local context?," TEPP Working Paper 2022-07, TEPP.
    16. Abduraimova, Kumushoy, 2022. "Contagion and tail risk in complex financial networks," Journal of Banking & Finance, Elsevier, vol. 143(C).
    17. Jean-Philippe Bouchaud & Julien Kockelkoren & Marc Potters, 2006. "Random walks, liquidity molasses and critical response in financial markets," Quantitative Finance, Taylor & Francis Journals, vol. 6(2), pages 115-123.
    18. Paulo M. D. C. Parente & Richard J. Smith, 2021. "Quasi‐maximum likelihood and the kernel block bootstrap for nonlinear dynamic models," Journal of Time Series Analysis, Wiley Blackwell, vol. 42(4), pages 377-405, July.
    19. Juan C. Henao-Londono & Sebastian M. Krause & Thomas Guhr, 2021. "Price response functions and spread impact in correlated financial markets," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 94(4), pages 1-20, April.
    20. Cornelia Lawson, 2013. "Academic Inventions Outside the University: Investigating Patent Ownership in the UK," Industry and Innovation, Taylor & Francis Journals, vol. 20(5), pages 385-398, July.
