
Evaluating Characteristics of De Novo Assembly Software on 454 Transcriptome Data: A Simulation Approach

Author

Listed:
  • Marvin Mundry
  • Erich Bornberg-Bauer
  • Michael Sammeth
  • Philine G D Feulner

Abstract

Background: The quantity of transcriptome data is rapidly increasing for non-model organisms. As sequencing technology advances, focus shifts towards solving bioinformatic challenges, of which sequence read assembly is the first task. Recent studies have compared the performance of different software to establish a best practice for transcriptome assembly. Here, we adapted a simulation approach to evaluate specific features of assembly programs on 454 data. The novelty of our study is that the simulation allows us to calculate a model assembly as a reference point for comparison.

Findings: The simulation approach allows us to compare basic metrics of assemblies computed by different software applications (CAP3, MIRA, Newbler, and Oases) to a known optimal solution. We found that MIRA and CAP3 are conservative in merging reads, which resulted in a comparatively high number of short contigs. In contrast, Newbler more readily merged reads into longer contigs, while Oases produced the overall shortest assembly. Due to the simulation approach, reads could be traced back to their correct placement within the transcriptome. Together with mapping reads onto the assembled contigs, this allowed us to evaluate ambiguity in the assemblies. This analysis further supported the conservative nature of MIRA and CAP3, which resulted in low proportions of chimeric contigs but high redundancy. Newbler produced less redundancy, but its proportion of chimeric contigs was higher.

Conclusion: Our evaluation of four assemblers suggested that MIRA and Newbler slightly outperformed the other programs, while showing contrasting characteristics. Oases did not perform very well on the 454 reads. Our evaluation indicated that the software was either conservative (MIRA) or liberal (Newbler) about merging reads into contigs. This suggested that, in choosing an assembly program, researchers should carefully consider their follow-up analysis and the consequences of the chosen assembly approach.
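
The idea of tracing simulated reads back to their source transcripts and comparing that with where the reads land in the assembly can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `evaluate_assembly`, the input dictionaries, and the two working definitions (a contig is "chimeric" if its reads come from more than one transcript; a transcript is "redundant" if its reads are spread over more than one contig) are assumptions made here for illustration only.

```python
from collections import defaultdict


def evaluate_assembly(read_origin, read_contig):
    """Summarize chimerism and redundancy of an assembly of simulated reads.

    read_origin: read id -> transcript id the read was simulated from
                 (known only because the reads are simulated).
    read_contig: read id -> contig id the read maps to; reads that are
                 missing from this dict are treated as unassembled.
    Both definitions used below are illustrative assumptions.
    """
    transcripts_per_contig = defaultdict(set)
    contigs_per_transcript = defaultdict(set)
    for read, contig in read_contig.items():
        transcript = read_origin[read]
        transcripts_per_contig[contig].add(transcript)
        contigs_per_transcript[transcript].add(contig)

    n_contigs = len(transcripts_per_contig)
    n_transcripts = len(contigs_per_transcript)
    # Chimeric (assumed definition): reads from more than one transcript.
    chimeric = sum(1 for t in transcripts_per_contig.values() if len(t) > 1)
    # Redundant (assumed definition): transcript split over several contigs.
    redundant = sum(1 for c in contigs_per_transcript.values() if len(c) > 1)
    return {
        "chimeric_fraction": chimeric / n_contigs if n_contigs else 0.0,
        "redundant_fraction": redundant / n_transcripts if n_transcripts else 0.0,
    }


# Toy data: six simulated reads from three transcripts (made-up ids).
read_origin = {"r1": "tA", "r2": "tA", "r3": "tA",
               "r4": "tB", "r5": "tB", "r6": "tC"}

# A "conservative" toy assembly keeps reads of tA in separate contigs.
conservative = {"r1": "c1", "r2": "c2", "r3": "c3",
                "r4": "c4", "r5": "c4", "r6": "c5"}

# A "liberal" toy assembly merges reads of tA and tB into one contig.
liberal = {"r1": "c1", "r2": "c1", "r3": "c1",
           "r4": "c1", "r5": "c1", "r6": "c2"}

conservative_stats = evaluate_assembly(read_origin, conservative)
liberal_stats = evaluate_assembly(read_origin, liberal)
print("conservative:", conservative_stats)
print("liberal:", liberal_stats)
```

In this toy example the conservative assembly has no chimeric contigs but splits one of three transcripts across several contigs, while the liberal assembly has no redundancy but one of its two contigs is chimeric. This mirrors the qualitative trade-off described in the abstract, not the study's actual numbers.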

Suggested Citation

  • Marvin Mundry & Erich Bornberg-Bauer & Michael Sammeth & Philine G D Feulner, 2012. "Evaluating Characteristics of De Novo Assembly Software on 454 Transcriptome Data: A Simulation Approach," PLOS ONE, Public Library of Science, vol. 7(2), pages 1-10, February.
  • Handle: RePEc:plo:pone00:0031410
    DOI: 10.1371/journal.pone.0031410

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0031410
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0031410&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0031410?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Citations

    Cited by:

    1. Ben Jia & Liming Xuan & Kaiye Cai & Zhiqiang Hu & Liangxiao Ma & Chaochun Wei, 2013. "NeSSM: A Next-Generation Sequencing Simulator for Metagenomics," PLOS ONE, Public Library of Science, vol. 8(10), pages 1-10, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yen-Chun Chen & Tsunglin Liu & Chun-Hui Yu & Tzen-Yuh Chiang & Chi-Chuan Hwang, 2013. "Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-20, April.
    2. Fiaz Ahmad Sulehri & Saba Sharif, 2022. "The Impact of Firm Sustainability on Firm Growth: Evidence from USA," Journal of Policy Research (JPR), Research Foundation for Humanity (RFH), vol. 8(2), pages 1-15, August.
    3. Patrick Breheny & Prabhakar Chalise & Anthony Batzler & Liewei Wang & Brooke L Fridley, 2012. "Genetic Association Studies of Copy-Number Variation: Should Assignment of Copy Number States Precede Testing?," PLOS ONE, Public Library of Science, vol. 7(4), pages 1-9, April.
    4. Fernando Lopez-Rios & Barbara Angulo & Belen Gomez & Debbie Mair & Rebeca Martinez & Esther Conde & Felice Shieh & Jeffrey Vaks & Rachel Langland & H Jeffrey Lawrence & David Gonzalez de Castro, 2013. "Comparison of Testing Methods for the Detection of BRAF V600E Mutations in Malignant Melanoma: Pre-Approval Validation Study of the Companion Diagnostic Test for Vemurafenib," PLOS ONE, Public Library of Science, vol. 8(1), pages 1-7, January.
    5. Katharina Mir & Klaus Neuhaus & Martin Bossert & Steffen Schober, 2013. "Short Barcodes for Next Generation Sequencing," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-8, December.
    6. David J H F Knapp & Rachel A McGovern & Art F Y Poon & Xiaoyin Zhong & Dennison Chan & Luke C Swenson & Winnie Dong & P Richard Harrigan, 2014. "“Deep” Sequencing Accuracy and Reproducibility Using Roche/454 Technology for Inferring Co-Receptor Usage in HIV-1," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-10, June.
    7. Chongqing Wen & Liyou Wu & Yujia Qin & Joy D Van Nostrand & Daliang Ning & Bo Sun & Kai Xue & Feifei Liu & Ye Deng & Yuting Liang & Jizhong Zhou, 2017. "Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-20, April.
    8. Temitayo A. Olagunju & Benjamin D. Rosen & Holly L. Neibergs & Gabrielle M. Becker & Kimberly M. Davenport & Christine G. Elsik & Tracy S. Hadfield & Sergey Koren & Kristen L. Kuhn & Arang Rhie & Kati, 2024. "Telomere-to-telomere assemblies of cattle and sheep Y-chromosomes uncover divergent structure and gene content," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    9. Jiang Du & Robert D Bjornson & Zhengdong D Zhang & Yong Kong & Michael Snyder & Mark B Gerstein, 2009. "Integrating Sequencing Technologies in Personal Genomics: Optimal Low Cost Reconstruction of Structural Variants," PLOS Computational Biology, Public Library of Science, vol. 5(7), pages 1-15, July.
    10. Maja Olecka & Alena Bömmel & Lena Best & Madlen Haase & Silke Foerste & Konstantin Riege & Thomas Dost & Stefano Flor & Otto W. Witte & Sören Franzenburg & Marco Groth & Björn Eyss & Christoph Kaleta , 2024. "Nonlinear DNA methylation trajectories in aging male mice," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    11. Peri, Alessandro, 2020. "A hardware approach to value function iteration," Journal of Economic Dynamics and Control, Elsevier, vol. 114(C).
    12. Ben Jia & Liming Xuan & Kaiye Cai & Zhiqiang Hu & Liangxiao Ma & Chaochun Wei, 2013. "NeSSM: A Next-Generation Sequencing Simulator for Metagenomics," PLOS ONE, Public Library of Science, vol. 8(10), pages 1-10, October.
    13. Annekatrin Richter & Hanna Mörl & Maria Thielemann & Markus Kleemann & Raphael Geißen & Robert Schwarz & Carolin Albertz & Philipp Koch & Andreas Petzold & Torsten Kroll & Marco Groth & Nils Hartmann , 2025. "The master male sex determinant Gdf6Y of the turquoise killifish arose through allelic neofunctionalization," Nature Communications, Nature, vol. 16(1), pages 1-18, December.
    14. Vincent A Fusaro & Prasad Patil & Erik Gafni & Dennis P Wall & Peter J Tonellato, 2011. "Biomedical Cloud Computing With Amazon Web Services," PLOS Computational Biology, Public Library of Science, vol. 7(8), pages 1-6, August.
    15. Tetsushi Sadakata & Yo Shinoda & Akira Sato & Hirotoshi Iguchi & Chiaki Ishii & Makoto Matsuo & Ryosuke Yamaga & Teiichi Furuichi, 2013. "Mouse Models of Mutations and Variations in Autism Spectrum Disorder-Associated Genes: Mice Expressing Caps2/Cadps2 Copy Number and Alternative Splicing Variants," IJERPH, MDPI, vol. 10(12), pages 1-19, November.
    16. Dasol Han & Guojing Liu & Yujeong Oh & Seyoun Oh & Seungbok Yang & Lori Mandjikian & Neha Rani & Maria C. Almeida & Kenneth S. Kosik & Jiwon Jang, 2023. "ZBTB12 is a molecular barrier to dedifferentiation in human pluripotent stem cells," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    17. Tsunglin Liu & Cheng-Hung Tsai & Wen-Bin Lee & Jung-Hsien Chiang, 2013. "Optimizing Information in Next-Generation-Sequencing (NGS) Reads for Improving De Novo Genome Assembly," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-16, July.
    18. Jing Tu & Mengqin Duan & Wenli Liu & Na Lu & Yue Zhou & Xiao Sun & Zuhong Lu, 2021. "Direct genome-wide identification of G-quadruplex structures by whole-genome resequencing," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
    19. Cirella, Giuseppe T. & Zerbe, Stefan, 2014. "Sustainable Water Management and Wetland Restoration Strategies in Northern China," MPRA Paper 120233, University Library of Munich, Germany.
    20. Zheng Xu & Song Yan & Shuai Yuan & Cong Wu & Sixia Chen & Zifang Guo & Yun Li, 2023. "Efficient Two-Stage Analysis for Complex Trait Association with Arbitrary Depth Sequencing Data," Stats, MDPI, vol. 6(1), pages 1-14, March.
