IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0018093.html

A Comprehensive Benchmark Study of Multiple Sequence Alignment Methods: Current Challenges and Future Perspectives

Author

Listed:
  • Julie D Thompson
  • Benjamin Linard
  • Odile Lecompte
  • Olivier Poch

Abstract

Multiple comparison or alignment of protein sequences has become a fundamental tool in many different domains in modern molecular biology, from evolutionary studies to prediction of 2D/3D structure, molecular function and inter-molecular interactions etc. By placing the sequence in the framework of the overall family, multiple alignments can be used to identify conserved features and to highlight differences or specificities. In this paper, we describe a comprehensive evaluation of many of the most popular methods for multiple sequence alignment (MSA), based on a new benchmark test set. The benchmark is designed to represent typical problems encountered when aligning the large protein sequence sets that result from today's high throughput biotechnologies. We show that alignment methods have significantly progressed and can now identify most of the shared sequence features that determine the broad molecular function(s) of a protein family, even for divergent sequences. However, we have identified a number of important challenges. First, the locally conserved regions, that reflect functional specificities or that modulate a protein's function in a given cellular context, are less well aligned. Second, motifs in natively disordered regions are often misaligned. Third, the badly predicted or fragmentary protein sequences, which make up a large proportion of today's databases, lead to a significant number of alignment errors. Based on this study, we demonstrate that the existing MSA methods can be exploited in combination to improve alignment accuracy, although novel approaches will still be needed to fully explore the most difficult regions. We then propose knowledge-enabled, dynamic solutions that will hopefully pave the way to enhanced alignment construction and exploitation in future evolutionary systems biology studies.

Suggested Citation

  • Julie D Thompson & Benjamin Linard & Odile Lecompte & Olivier Poch, 2011. "A Comprehensive Benchmark Study of Multiple Sequence Alignment Methods: Current Challenges and Future Perspectives," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-14, March.
  • Handle: RePEc:plo:pone00:0018093
    DOI: 10.1371/journal.pone.0018093

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0018093
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0018093&type=printable
    Download Restriction: no



    Citations


    Cited by:

    1. Saeedeh Akbari Rokn Abadi & Negin Hashemi Dijujin & Somayyeh Koohi, 2021. "Optical pattern generator for efficient bio-data encoding in a photonic sequence comparison architecture," PLOS ONE, Public Library of Science, vol. 16(1), pages 1-27, January.
    2. Amin Hosseininasab & Willem-Jan van Hoeve, 2021. "Exact Multiple Sequence Alignment by Synchronized Decision Diagrams," INFORMS Journal on Computing, INFORMS, vol. 33(2), pages 721-738, May.
    3. Ya-Mei Ding & Xiao-Xu Pang & Yu Cao & Wei-Ping Zhang & Susanne S. Renner & Da-Yong Zhang & Wei-Ning Bai, 2023. "Genome structure-based Juglandaceae phylogenies contradict alignment-based phylogenies and substitution rates vary with DNA repair genes," Nature Communications, Nature, vol. 14(1), pages 1-13, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Helen E. Robertson & Arnau Sebé-Pedrós & Baptiste Saudemont & Yann Loe-Mie & Anne-C. Zakrzewski & Xavier Grau-Bové & Marie-Pierre Mailhe & Philipp Schiffer & Maximilian J. Telford & Heather Marlow, 2024. "Single cell atlas of Xenoturbella bocki highlights limited cell-type complexity," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    2. Matthew Goulty & Gaelle Botton-Amiot & Ezio Rosato & Simon G. Sprecher & Roberto Feuda, 2023. "The monoaminergic system is a bilaterian innovation," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    3. M Antonio Todaro & Tobias Kånneby & Matteo Dal Zotto & Ulf Jondelius, 2011. "Phylogeny of Thaumastodermatidae (Gastrotricha: Macrodasyida) Inferred from Nuclear and Mitochondrial Sequence Data," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-13, March.
    4. Maria E Gallegos & Sanjeev Balakrishnan & Priya Chandramouli & Shaily Arora & Aruna Azameera & Anitha Babushekar & Emilee Bargoma & Abdulmalik Bokhari & Siva Kumari Chava & Pranti Das & Meetali Desai, 2012. "The C. elegans Rab Family: Identification, Classification and Toolkit Construction," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-19, November.
    5. Emese Meglécz & Gabriel Nève & Ed Biffin & Michael G Gardner, 2012. "Breakdown of Phylogenetic Signal: A Survey of Microsatellite Densities in 454 Shotgun Sequences from 154 Non Model Eukaryote Species," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-15, July.
    6. Lauren E. Vandepas & Caroline Stefani & Phillip P. Domeier & Nikki Traylor-Knowles & Frederick W. Goetz & William E. Browne & Adam Lacy-Hulbert, 2024. "Extracellular DNA traps in a ctenophore demonstrate immune cell behaviors in a non-bilaterian," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    7. Bryan Korithoski & Oralia Kolaczkowski & Krishanu Mukherjee & Reema Kola & Chandra Earl & Bryan Kolaczkowski, 2015. "Evolution of a Novel Antiviral Immune-Signaling Interaction by Partial-Gene Duplication," PLOS ONE, Public Library of Science, vol. 10(9), pages 1-26, September.
