Printed from https://ideas.repec.org/a/nat/natcom/v15y2024i1d10.1038_s41467-024-53294-2.html

A unified framework to analyze transposable element insertion polymorphisms using graph genomes

Author

Listed:
  • Cristian Groza

    (McGill University)

  • Xun Chen

    (Kyoto University)

  • Travis J. Wheeler

    (University of Arizona)

  • Guillaume Bourque

    (Kyoto University
    McGill University
    Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University)

  • Clément Goubert

    (McGill University
    University of Arizona)

Abstract

Transposable elements are ubiquitous mobile DNA sequences that generate insertion polymorphisms and contribute to genomic diversity. We present GraffiTE, a flexible pipeline to analyze polymorphic mobile element insertions. By integrating state-of-the-art structural variant detection algorithms and graph genomes, GraffiTE identifies polymorphic mobile elements from genomic assemblies or long-read sequencing data, and genotypes these variants using short- or long-read sets. Benchmarking on simulated and real datasets shows high precision and recall. GraffiTE is designed to allow non-expert users to perform comprehensive analyses, including in models with limited transposable element knowledge, and is compatible with various sequencing technologies. Here, we demonstrate the versatility of GraffiTE by analyzing human, Drosophila melanogaster, maize, and Cannabis sativa pangenome data. These analyses reveal the landscapes of polymorphic mobile elements and their frequency variations across individuals, strains, and cultivars.

Suggested Citation

  • Cristian Groza & Xun Chen & Travis J. Wheeler & Guillaume Bourque & Clément Goubert, 2024. "A unified framework to analyze transposable element insertion polymorphisms using graph genomes," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-53294-2
    DOI: 10.1038/s41467-024-53294-2

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-53294-2
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-024-53294-2?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Cristian Groza & Carl Schwendinger-Schreck & Warren A. Cheung & Emily G. Farrow & Isabelle Thiffault & Juniper Lake & William B. Rizzo & Gilad Evrony & Tom Curran & Guillaume Bourque & Tomi Pastinen, 2024. "Pangenome graphs improve the analysis of structural variants in rare genetic diseases," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    2. Agnieszka A. Golicz & Philipp E. Bayer & Guy C. Barker & Patrick P. Edger & HyeRan Kim & Paula A. Martinez & Chon Kit Kenneth Chan & Anita Severn-Ellis & W. Richard McCombie & Isobel A. P. Parkin & An, 2016. "The pangenome of an agronomically important crop plant Brassica oleracea," Nature Communications, Nature, vol. 7(1), pages 1-8, December.
    3. Daniel C. Jeffares & Clemency Jolly & Mimoza Hoti & Doug Speed & Liam Shaw & Charalampos Rallis & Francois Balloux & Christophe Dessimoz & Jürg Bähler & Fritz J. Sedlazeck, 2017. "Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast," Nature Communications, Nature, vol. 8(1), pages 1-11, April.
    4. Gabriel E Rech & María Bogaerts-Márquez & Maite G Barrón & Miriam Merenciano & José Luis Villanueva-Cañas & Vivien Horváth & Anna-Sophie Fiston-Lavier & Isabelle Luyten & Sandeep Venkataram & Hadi Que, 2019. "Stress response, behavior, and development are shaped by transposable element-induced mutations in Drosophila," PLOS Genetics, Public Library of Science, vol. 15(2), pages 1-33, February.
    5. Peter H. Sudmant & Tobias Rausch & Eugene J. Gardner & Robert E. Handsaker & Alexej Abyzov & John Huddleston & Yan Zhang & Kai Ye & Goo Jun & Markus Hsi-Yang Fritz & Miriam K. Konkel & Ankit Malhotra , 2015. "An integrated map of structural variation in 2,504 human genomes," Nature, Nature, vol. 526(7571), pages 75-81, October.
    6. Yichen Henry Liu & Can Luo & Staunton G. Golding & Jacob B. Ioffe & Xin Maizie Zhou, 2024. "Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-22, December.
    7. Wen-Wei Liao & Mobin Asri & Jana Ebler & Daniel Doerr & Marina Haukness & Glenn Hickey & Shuangjia Lu & Julian K. Lucas & Jean Monlong & Haley J. Abel & Silvia Buonaiuto & Xian H. Chang & Haoyu Cheng , 2023. "A draft human pangenome reference," Nature, Nature, vol. 617(7960), pages 312-324, May.
    8. Ting Wang & Lucinda Antonacci-Fulton & Kerstin Howe & Heather A. Lawson & Julian K. Lucas & Adam M. Phillippy & Alice B. Popejoy & Mobin Asri & Caryn Carson & Mark J. P. Chaisson & Xian Chang & Robert, 2022. "The Human Pangenome Project: a global resource to map genomic diversity," Nature, Nature, vol. 604(7906), pages 437-446, April.
    9. Chong Chu & Rebeca Borges-Monroy & Vinayak V. Viswanadham & Soohyun Lee & Heng Li & Eunjung Alice Lee & Peter J. Park, 2021. "Comprehensive identification of transposable element insertions using multiple sequencing technologies," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    10. J. Gower & P. Legendre, 1986. "Metric and Euclidean properties of dissimilarity coefficients," Journal of Classification, Springer;The Classification Society, vol. 3(1), pages 5-48, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Can Luo & Yichen Henry Liu & Xin Maizie Zhou, 2024. "VolcanoSV enables accurate and robust structural variant calling in diploid genomes from single-molecule long read sequencing," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
    2. Xiaoling Tong & Min-Jin Han & Kunpeng Lu & Shuaishuai Tai & Shubo Liang & Yucheng Liu & Hai Hu & Jianghong Shen & Anxing Long & Chengyu Zhan & Xin Ding & Shuo Liu & Qiang Gao & Bili Zhang & Linli Zhou, 2022. "High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    3. Yichen Henry Liu & Can Luo & Staunton G. Golding & Jacob B. Ioffe & Xin Maizie Zhou, 2024. "Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-22, December.
    4. Tuomas Hämälä & Christopher Moore & Laura Cowan & Matthew Carlile & David Gopaulchan & Marie K. Brandrud & Siri Birkeland & Matthew Loose & Filip Kolář & Marcus A. Koch & Levi Yant, 2024. "Impact of whole-genome duplications on structural variant evolution in Cochlearia," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Cristian Groza & Carl Schwendinger-Schreck & Warren A. Cheung & Emily G. Farrow & Isabelle Thiffault & Juniper Lake & William B. Rizzo & Gilad Evrony & Tom Curran & Guillaume Bourque & Tomi Pastinen, 2024. "Pangenome graphs improve the analysis of structural variants in rare genetic diseases," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    6. Tobias T. Schmidt & Carly Tyer & Preeyesh Rughani & Candy Haggblom & Jeffrey R. Jones & Xiaoguang Dai & Kelly A. Frazer & Fred H. Gage & Sissel Juul & Scott Hickey & Jan Karlseder, 2024. "High resolution long-read telomere sequencing reveals dynamic mechanisms in aging and cancer," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    7. Arthur S. Lee & Lauren J. Ayers & Michael Kosicki & Wai-Man Chan & Lydia N. Fozo & Brandon M. Pratt & Thomas E. Collins & Boxun Zhao & Matthew F. Rose & Alba Sanchis-Juan & Jack M. Fu & Isaac Wong & X, 2024. "A cell type-aware framework for nominating non-coding variants in Mendelian regulatory disorders," Nature Communications, Nature, vol. 15(1), pages 1-26, December.
    8. Wolfram Höps & Tobias Rausch & Michael Jendrusch & Jan O. Korbel & Fritz J. Sedlazeck, 2024. "Impact and characterization of serial structural variations across humans and great apes," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    9. Marsha M. Wheeler & Adrienne M. Stilp & Shuquan Rao & Bjarni V. Halldórsson & Doruk Beyter & Jia Wen & Anna V. Mihkaylova & Caitlin P. McHugh & John Lane & Min-Zhi Jiang & Laura M. Raffield & Goo Jun , 2022. "Whole genome sequencing identifies structural variants contributing to hematologic traits in the NHLBI TOPMed program," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    10. Guohuan Su & Adam Mertel & Sébastien Brosse & Justin M. Calabrese, 2023. "Species invasiveness and community invasibility of North American freshwater fish fauna revealed via trait-based analysis," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    11. Sean A. Misek & Aaron Fultineer & Jeremie Kalfon & Javad Noorbakhsh & Isabella Boyle & Priyanka Roy & Joshua Dempster & Lia Petronio & Katherine Huang & Alham Saadat & Thomas Green & Adam Brown & John, 2024. "Germline variation contributes to false negatives in CRISPR-based experiments with varying burden across ancestries," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    12. la Grange, Anthony & le Roux, Niël & Gardner-Lubbe, Sugnet, 2009. "BiplotGUI: Interactive Biplots in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 30(i12).
    13. Michael Brusco & J Dennis Cradit & Douglas Steinley, 2021. "A comparison of 71 binary similarity coefficients: The effect of base rates," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-19, April.
    14. Balepur, Prashant Narayan, 1998. "Impacts of Computer-Mediated Communication on Travel and Communication Patterns: The Davis Community Network Study," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt6cb1f85c, Institute of Transportation Studies, UC Berkeley.
    15. Niemann, Helen & Moehrle, Martin G. & Frischkorn, Jonas, 2017. "Use of a new patent text-mining and visualization method for identifying patenting patterns over time: Concept, method and test application," Technological Forecasting and Social Change, Elsevier, vol. 115(C), pages 210-220.
    16. Michael J. Greenacre & Patrick J. F. Groenen, 2016. "Weighted Euclidean Biplots," Journal of Classification, Springer;The Classification Society, vol. 33(3), pages 442-459, October.
    17. Douglas L. Steinley & M. J. Brusco, 2019. "Using an Iterative Reallocation Partitioning Algorithm to Verify Test Multidimensionality," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 397-413, October.
    18. Matthijs Warrens, 2008. "Bounds of Resemblance Measures for Binary (Presence/Absence) Variables," Journal of Classification, Springer;The Classification Society, vol. 25(2), pages 195-208, November.
    19. Anna Maria D’Arcangelis & Giulia Rotundo, 2016. "Complex Networks in Finance," Lecture Notes in Economics and Mathematical Systems, in: Pasquale Commendatore & Mariano Matilla-García & Luis M. Varela & Jose S. Cánovas (ed.), Complex Networks and Dynamics, pages 209-235, Springer.
    20. Carla Coltharp & Rene P Kessler & Jie Xiao, 2012. "Accurate Construction of Photoactivated Localization Microscopy (PALM) Images for Quantitative Measurements," PLOS ONE, Public Library of Science, vol. 7(12), pages 1-15, December.



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.