
Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly

Author

Listed:
  • Yen-Chun Chen
  • Tsunglin Liu
  • Chun-Hui Yu
  • Tzen-Yuh Chiang
  • Chi-Chuan Hwang

Abstract

Next-generation sequencing (NGS) has revolutionized genome assembly because it offers much higher data throughput at much lower cost than traditional Sanger sequencing. However, NGS poses new computational challenges to de novo genome assembly. Among these challenges, GC bias in NGS data is known to make genome assembly harder, but it is not clear to what extent GC bias affects genome assembly in general. In this work, we conduct a systematic analysis of the effects of GC bias on genome assembly. Our analyses reveal that GC bias lowers assembly completeness only when the degree of GC bias is above a threshold. Under strong GC bias, the resulting assembly fragmentation can be explained by the low read coverage in the GC-poor or GC-rich regions of a genome. This effect is observed for all the assemblers under study. Increasing the total amount of NGS data therefore rescues the assembly fragmentation caused by GC bias. However, the amount of data needed for a full rescue depends on the distribution of GC content. Both low and high coverage depths due to GC bias lower the accuracy of assembly. These findings provide guidance toward better de novo genome assembly in the presence of GC bias.
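
The coverage argument in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' method: the Gaussian-shaped bias curve, the `strength` parameter, the 10x minimum depth, and the eleven GC windows are all assumptions chosen for illustration. It only shows how, under a fixed bias, regions of extreme GC content can fall below a usable depth at a modest average depth and recover when the total amount of data is increased.

```python
import math


def relative_coverage(gc, strength):
    """Hypothetical unimodal bias: coverage peaks at 50% GC and decays toward both extremes."""
    return math.exp(-strength * (gc - 0.5) ** 2)


def fraction_under_covered(gc_windows, mean_depth, strength, min_depth=10.0):
    """Fraction of windows whose expected depth falls below min_depth (an assumed cutoff)."""
    weights = [relative_coverage(gc, strength) for gc in gc_windows]
    mean_weight = sum(weights) / len(weights)
    # Rescale so that the genome-wide average depth equals mean_depth.
    depths = [mean_depth * w / mean_weight for w in weights]
    low = sum(1 for d in depths if d < min_depth)
    return low / len(depths)


# Toy genome: 11 equal-size windows with GC content from 20% to 80% (assumed values).
gc_windows = [0.20 + 0.06 * i for i in range(11)]

for strength in (0.0, 40.0):
    for mean_depth in (30.0, 300.0):
        frac = fraction_under_covered(gc_windows, mean_depth, strength)
        print(f"strength={strength:5.1f} mean_depth={mean_depth:6.1f} "
              f"under-covered fraction={frac:.2f}")
```

In this toy model, the depth needed to lift every window above the cutoff is set by the most extreme GC windows, which mirrors the abstract's point that the amount of data required for a full rescue depends on the distribution of GC content.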

Suggested Citation

  • Yen-Chun Chen & Tsunglin Liu & Chun-Hui Yu & Tzen-Yuh Chiang & Chi-Chuan Hwang, 2013. "Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-20, April.
  • Handle: RePEc:plo:pone00:0062856
    DOI: 10.1371/journal.pone.0062856

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0062856
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0062856&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. David R. Bentley & Shankar Balasubramanian & Harold P. Swerdlow & Geoffrey P. Smith & John Milton & Clive G. Brown & Kevin P. Hall & Dirk J. Evers & Colin L. Barnes & Helen R. Bignell & Jonathan M. Bo, 2008. "Accurate whole human genome sequencing using reversible terminator chemistry," Nature, Nature, vol. 456(7218), pages 53-59, November.
    2. Giuseppe Narzisi & Bud Mishra, 2011. "Comparing De Novo Genome Assembly: The Long and Short of It," PLOS ONE, Public Library of Science, vol. 6(4), pages 1-17, April.
    3. Marcel Margulies & Michael Egholm & William E. Altman & Said Attiya & Joel S. Bader & Lisa A. Bemben & Jan Berka & Michael S. Braverman & Yi-Ju Chen & Zhoutao Chen & Scott B. Dewell & Lei Du & Joseph , 2005. "Genome sequencing in microfabricated high-density picolitre reactors," Nature, Nature, vol. 437(7057), pages 376-380, September.
    4. Daniel R Zerbino & Gayle K McEwen & Elliott H Margulies & Ewan Birney, 2009. "Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler," PLOS ONE, Public Library of Science, vol. 4(12), pages 1-9, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marvin Mundry & Erich Bornberg-Bauer & Michael Sammeth & Philine G D Feulner, 2012. "Evaluating Characteristics of De Novo Assembly Software on 454 Transcriptome Data: A Simulation Approach," PLOS ONE, Public Library of Science, vol. 7(2), pages 1-10, February.
    2. Fiaz Ahmad Sulehri & Saba Sharif, 2022. "The Impact of Firm Sustainability on Firm Growth: Evidence from USA," Journal of Policy Research (JPR), Research Foundation for Humanity (RFH), vol. 8(2), pages 1-15, August.
    3. Patrick Breheny & Prabhakar Chalise & Anthony Batzler & Liewei Wang & Brooke L Fridley, 2012. "Genetic Association Studies of Copy-Number Variation: Should Assignment of Copy Number States Precede Testing?," PLOS ONE, Public Library of Science, vol. 7(4), pages 1-9, April.
    4. Fernando Lopez-Rios & Barbara Angulo & Belen Gomez & Debbie Mair & Rebeca Martinez & Esther Conde & Felice Shieh & Jeffrey Vaks & Rachel Langland & H Jeffrey Lawrence & David Gonzalez de Castro, 2013. "Comparison of Testing Methods for the Detection of BRAF V600E Mutations in Malignant Melanoma: Pre-Approval Validation Study of the Companion Diagnostic Test for Vemurafenib," PLOS ONE, Public Library of Science, vol. 8(1), pages 1-7, January.
    5. Katharina Mir & Klaus Neuhaus & Martin Bossert & Steffen Schober, 2013. "Short Barcodes for Next Generation Sequencing," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-8, December.
    6. David J H F Knapp & Rachel A McGovern & Art F Y Poon & Xiaoyin Zhong & Dennison Chan & Luke C Swenson & Winnie Dong & P Richard Harrigan, 2014. "“Deep” Sequencing Accuracy and Reproducibility Using Roche/454 Technology for Inferring Co-Receptor Usage in HIV-1," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-10, June.
    7. Chongqing Wen & Liyou Wu & Yujia Qin & Joy D Van Nostrand & Daliang Ning & Bo Sun & Kai Xue & Feifei Liu & Ye Deng & Yuting Liang & Jizhong Zhou, 2017. "Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-20, April.
    8. Temitayo A. Olagunju & Benjamin D. Rosen & Holly L. Neibergs & Gabrielle M. Becker & Kimberly M. Davenport & Christine G. Elsik & Tracy S. Hadfield & Sergey Koren & Kristen L. Kuhn & Arang Rhie & Kati, 2024. "Telomere-to-telomere assemblies of cattle and sheep Y-chromosomes uncover divergent structure and gene content," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    9. Jiang Du & Robert D Bjornson & Zhengdong D Zhang & Yong Kong & Michael Snyder & Mark B Gerstein, 2009. "Integrating Sequencing Technologies in Personal Genomics: Optimal Low Cost Reconstruction of Structural Variants," PLOS Computational Biology, Public Library of Science, vol. 5(7), pages 1-15, July.
    10. Maja Olecka & Alena Bömmel & Lena Best & Madlen Haase & Silke Foerste & Konstantin Riege & Thomas Dost & Stefano Flor & Otto W. Witte & Sören Franzenburg & Marco Groth & Björn Eyss & Christoph Kaleta , 2024. "Nonlinear DNA methylation trajectories in aging male mice," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    11. Peri, Alessandro, 2020. "A hardware approach to value function iteration," Journal of Economic Dynamics and Control, Elsevier, vol. 114(C).
    12. Ben Jia & Liming Xuan & Kaiye Cai & Zhiqiang Hu & Liangxiao Ma & Chaochun Wei, 2013. "NeSSM: A Next-Generation Sequencing Simulator for Metagenomics," PLOS ONE, Public Library of Science, vol. 8(10), pages 1-10, October.
    13. Annekatrin Richter & Hanna Mörl & Maria Thielemann & Markus Kleemann & Raphael Geißen & Robert Schwarz & Carolin Albertz & Philipp Koch & Andreas Petzold & Torsten Kroll & Marco Groth & Nils Hartmann , 2025. "The master male sex determinant Gdf6Y of the turquoise killifish arose through allelic neofunctionalization," Nature Communications, Nature, vol. 16(1), pages 1-18, December.
    14. Francesco Vezzi & Giuseppe Narzisi & Bud Mishra, 2012. "Feature-by-Feature – Evaluating De Novo Sequence Assembly," PLOS ONE, Public Library of Science, vol. 7(2), pages 1-12, February.
    15. Vincent A Fusaro & Prasad Patil & Erik Gafni & Dennis P Wall & Peter J Tonellato, 2011. "Biomedical Cloud Computing With Amazon Web Services," PLOS Computational Biology, Public Library of Science, vol. 7(8), pages 1-6, August.
    16. Tetsushi Sadakata & Yo Shinoda & Akira Sato & Hirotoshi Iguchi & Chiaki Ishii & Makoto Matsuo & Ryosuke Yamaga & Teiichi Furuichi, 2013. "Mouse Models of Mutations and Variations in Autism Spectrum Disorder-Associated Genes: Mice Expressing Caps2/Cadps2 Copy Number and Alternative Splicing Variants," IJERPH, MDPI, vol. 10(12), pages 1-19, November.
    17. Giuseppe Narzisi & Bud Mishra, 2011. "Comparing De Novo Genome Assembly: The Long and Short of It," PLOS ONE, Public Library of Science, vol. 6(4), pages 1-17, April.
    18. Dasol Han & Guojing Liu & Yujeong Oh & Seyoun Oh & Seungbok Yang & Lori Mandjikian & Neha Rani & Maria C. Almeida & Kenneth S. Kosik & Jiwon Jang, 2023. "ZBTB12 is a molecular barrier to dedifferentiation in human pluripotent stem cells," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    19. Tsunglin Liu & Cheng-Hung Tsai & Wen-Bin Lee & Jung-Hsien Chiang, 2013. "Optimizing Information in Next-Generation-Sequencing (NGS) Reads for Improving De Novo Genome Assembly," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-16, July.
    20. Jing Tu & Mengqin Duan & Wenli Liu & Na Lu & Yue Zhou & Xiao Sun & Zuhong Lu, 2021. "Direct genome-wide identification of G-quadruplex structures by whole-genome resequencing," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
