IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0202355.html
   My bibliography  Save this article

GRASShopPER—An algorithm for de novo assembly based on GPU alignments

Author

Listed:
  • Aleksandra Swiercz
  • Wojciech Frohmberg
  • Michal Kierzynka
  • Pawel Wojciechowski
  • Piotr Zurkowski
  • Jan Badura
  • Artur Laskowski
  • Marta Kasprzak
  • Jacek Blazewicz

Abstract

Next generation sequencers produce billions of short DNA sequences in a massively parallel manner, which causes a great computational challenge in accurately reconstructing a genome sequence de novo using these short sequences. Here, we propose the GRASShopPER assembler, which follows an approach of overlap-layout-consensus. It uses an efficient GPU implementation for the sequence alignment during the graph construction stage and a greedy hyper-heuristic algorithm at the fork detection stage. A two-part fork detection method allows us to identify repeated fragments of a genome and to reconstruct them without misassemblies. The assemblies of data sets of bacteria Candidatus Microthrix, nematode Caenorhabditis elegans, and human chromosome 14 were evaluated with the golden standard tool QUAST. In comparison with other assemblers, GRASShopPER provided contigs that covered the largest part of the genomes and, at the same time, kept good values of other metrics, e.g., NG50 and misassembly rate.

Suggested Citation

  • Aleksandra Swiercz & Wojciech Frohmberg & Michal Kierzynka & Pawel Wojciechowski & Piotr Zurkowski & Jan Badura & Artur Laskowski & Marta Kasprzak & Jacek Blazewicz, 2018. "GRASShopPER—An algorithm for de novo assembly based on GPU alignments," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-23, August.
  • Handle: RePEc:plo:pone00:0202355
    DOI: 10.1371/journal.pone.0202355
    as

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0202355
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0202355&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0202355?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    References listed on IDEAS

    as
    1. Jacek Blazewicz & Edmund Burke & Graham Kendall & Wojciech Mruczkiewicz & Ceyda Oguz & Aleksandra Swiercz, 2013. "A hyper-heuristic approach to sequencing by hybridization of DNA sequences," Annals of Operations Research, Springer, vol. 207(1), pages 27-41, August.
    2. Edmund K Burke & Michel Gendreau & Matthew Hyde & Graham Kendall & Gabriela Ochoa & Ender Özcan & Rong Qu, 2013. "Hyper-heuristics: a survey of the state of the art," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 64(12), pages 1695-1724, December.
    3. Edmund K. Burke & Matthew Hyde & Graham Kendall & Gabriela Ochoa & Ender Özcan & John R. Woodward, 2010. "A Classification of Hyper-heuristic Approaches," International Series in Operations Research & Management Science, in: Michel Gendreau & Jean-Yves Potvin (ed.), Handbook of Metaheuristics, chapter 0, pages 449-468, Springer.
    4. Blazewicz, Jacek & Kasprzak, Marta & Kierzynka, Michal & Frohmberg, Wojciech & Swiercz, Aleksandra & Wojciechowski, Pawel & Zurkowski, Piotr, 2018. "Graph algorithms for DNA sequencing – origins, current models and the future," European Journal of Operational Research, Elsevier, vol. 264(3), pages 799-812.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lale Özbakır & Gökhan Seçme, 2022. "A hyper-heuristic approach for stochastic parallel assembly line balancing problems with equipment costs," Operational Research, Springer, vol. 22(1), pages 577-614, March.
    2. Longlong Leng & Yanwei Zhao & Zheng Wang & Jingling Zhang & Wanliang Wang & Chunmiao Zhang, 2019. "A Novel Hyper-Heuristic for the Biobjective Regional Low-Carbon Location-Routing Problem with Multiple Constraints," Sustainability, MDPI, vol. 11(6), pages 1-31, March.
    3. Johnes, Jill, 2015. "Operational Research in education," European Journal of Operational Research, Elsevier, vol. 243(3), pages 683-696.
    4. Ahmed, Leena & Mumford, Christine & Kheiri, Ahmed, 2019. "Solving urban transit route design problem using selection hyper-heuristics," European Journal of Operational Research, Elsevier, vol. 274(2), pages 545-559.
    5. Olacir R. Castro & Gian Mauricio Fritsche & Aurora Pozo, 2018. "Evaluating selection methods on hyper-heuristic multi-objective particle swarm optimization," Journal of Heuristics, Springer, vol. 24(4), pages 581-616, August.
    6. Kheiri, Ahmed & Özcan, Ender, 2016. "An iterated multi-stage selection hyper-heuristic," European Journal of Operational Research, Elsevier, vol. 250(1), pages 77-90.
    7. Zhang, Yuchang & Bai, Ruibin & Qu, Rong & Tu, Chaofan & Jin, Jiahuan, 2022. "A deep reinforcement learning based hyper-heuristic for combinatorial optimisation with uncertainties," European Journal of Operational Research, Elsevier, vol. 300(2), pages 418-427.
    8. Aleksandra Swiercz & Edmund Burke & Mateusz Cichenski & Grzegorz Pawlak & Sanja Petrovic & Tomasz Zurkowski & Jacek Blazewicz, 2014. "Unified encoding for hyper-heuristics with application to bioinformatics," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 22(3), pages 567-589, September.
    9. Franck Butelle & Laurent Alfandari & Camille Coti & Lucian Finta & Lucas Létocart & Gérard Plateau & Frédéric Roupin & Antoine Rozenknop & Roberto Wolfler Calvo, 2016. "Fast machine reassignment," Annals of Operations Research, Springer, vol. 242(1), pages 133-160, July.
    10. Yanwei Zhao & Longlong Leng & Chunmiao Zhang, 2021. "A novel framework of hyper-heuristic approach and its application in location-routing problem with simultaneous pickup and delivery," Operational Research, Springer, vol. 21(2), pages 1299-1332, June.
    11. Drake, John H. & Kheiri, Ahmed & Özcan, Ender & Burke, Edmund K., 2020. "Recent advances in selection hyper-heuristics," European Journal of Operational Research, Elsevier, vol. 285(2), pages 405-428.
    12. Fabio Caraffini & Giovanni Iacca, 2020. "The SOS Platform: Designing, Tuning and Statistically Benchmarking Optimisation Algorithms," Mathematics, MDPI, vol. 8(5), pages 1-31, May.
    13. Chen, Yujie & Cowling, Peter & Polack, Fiona & Remde, Stephen & Mourdjis, Philip, 2017. "Dynamic optimisation of preventative and corrective maintenance schedules for a large scale urban drainage system," European Journal of Operational Research, Elsevier, vol. 257(2), pages 494-510.
    14. Ahmed Kheiri & Alina G. Dragomir & David Mueller & Joaquim Gromicho & Caroline Jagtenberg & Jelke J. Hoorn, 2019. "Tackling a VRP challenge to redistribute scarce equipment within time windows using metaheuristic algorithms," EURO Journal on Transportation and Logistics, Springer;EURO - The Association of European Operational Research Societies, vol. 8(5), pages 561-595, December.
    15. Andrzej Kozik, 2017. "Handling precedence constraints in scheduling problems by the sequence pair representation," Journal of Combinatorial Optimization, Springer, vol. 33(2), pages 445-472, February.
    16. Marko Ɖurasević & Domagoj Jakobović, 2019. "Creating dispatching rules by simple ensemble combination," Journal of Heuristics, Springer, vol. 25(6), pages 959-1013, December.
    17. Gahm, Christian & Uzunoglu, Aykut & Wahl, Stefan & Ganschinietz, Chantal & Tuma, Axel, 2022. "Applying machine learning for the anticipation of complex nesting solutions in hierarchical production planning," European Journal of Operational Research, Elsevier, vol. 296(3), pages 819-836.
    18. W. B. Yates & E. C. Keedwell, 2019. "An analysis of heuristic subsequences for offline hyper-heuristic learning," Journal of Heuristics, Springer, vol. 25(3), pages 399-430, June.
    19. Derya Deliktaş, 2022. "Self-adaptive memetic algorithms for multi-objective single machine learning-effect scheduling problems with release times," Flexible Services and Manufacturing Journal, Springer, vol. 34(3), pages 748-784, September.
    20. Carlos Contreras Bolton & Gustavo Gatica & Víctor Parada, 2013. "Automatically Generated Algorithms for the Vertex Coloring Problem," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-9, March.

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0202355. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosone (email available below). General contact details of provider: https://journals.plos.org/plosone/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.