Divide and conquer! Data-mining tools and sequential multivariate analysis to search for diagnostic morphological characters within a plant polyploid complex (Veronica subsect. Pentasepalae, Plantaginaceae)

Author

Listed:
  • Noemí López-González
  • Santiago Andrés-Sánchez
  • Blanca M Rojas-Andrés
  • M Montserrat Martínez-Ortega

Abstract

This study exhaustively explores leaf features in search of diagnostic characters to aid classification (assigning cases to groups, i.e. populations to taxa) in a polyploid plant-species complex. A challenging case study was selected: Veronica subsection Pentasepalae, a taxonomically intricate group. The “divide and conquer” approach was implemented: a difficult primary dataset was split into more manageable subsets. Three techniques were explored: two data-mining tools (artificial neural networks and decision trees) and one unsupervised discriminant analysis. However, only the decision trees and the discriminant analysis were ultimately used to select diagnostic traits. A previously established classification hypothesis based on other data sources was used as a starting point. A guided discriminant analysis (i.e. involving manual character selection) was used to produce a grouping scheme fitting this hypothesis so that it could be taken as a reference. Sequential unsupervised multivariate analysis enabled the recognition of all species and infraspecific taxa; however, a suboptimal classification rate was achieved. Decision trees resulted in better classification rates than unsupervised multivariate analysis, but three complete taxa were misidentified (not present in terminal nodes). The variable selection led to a different grouping scheme in the case of decision trees. The resulting groups displayed low misclassification rates when analyzed using artificial neural networks. Both the decision trees and the discriminant analysis are recommended in the search for diagnostic characters. Because artificial neural networks are highly sensitive to the combination of input/output layers, they are proposed as evaluation tools for morphometric studies. The “divide and conquer” principle is a promising strategy and proved successful in the present case study.
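
To make the idea concrete, the following is a minimal sketch of how a decision tree can pick out diagnostic characters by repeatedly splitting a dataset into smaller, purer subsets. It is not the authors' software or workflow. The character names (leaf_length_mm, leaf_width_mm, teeth_count), the taxon labels and all measurements are invented toy values, and the Gini impurity criterion is an assumption chosen for simplicity.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of taxon labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (character_index, threshold, weighted_gini) of the best binary
    split, or None when no character varies within the subset."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds lie midway between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def select_characters(rows, labels, names, depth=0, out=None):
    """Recursively split the data and record (depth, character, threshold)."""
    if out is None:
        out = []
    if len(set(labels)) <= 1:
        return out  # subset already contains a single taxon
    split = best_split(rows, labels)
    if split is None:
        return out  # no character separates the remaining cases
    f, t, _ = split
    out.append((depth, names[f], t))
    pairs = list(zip(rows, labels))
    for part in ([p for p in pairs if p[0][f] <= t],
                 [p for p in pairs if p[0][f] > t]):
        select_characters([r for r, _ in part], [y for _, y in part],
                          names, depth + 1, out)
    return out


# Hypothetical toy data: three invented taxa, two populations each.
names = ["leaf_length_mm", "leaf_width_mm", "teeth_count"]
rows = [(30, 5, 2), (32, 6, 3),      # taxon A
        (31, 12, 2), (29, 13, 3),    # taxon B
        (15, 11, 8), (14, 12, 9)]    # taxon C
labels = ["A", "A", "B", "B", "C", "C"]

selected = select_characters(rows, labels, names)
for depth, name, threshold in selected:
    print("  " * depth + f"{name} <= {threshold}")
```

On this toy data the first split separates taxon C from the rest, and the second split then separates A from B within the remaining subset. This mirrors the sequential logic of tackling a hard problem as a chain of easier ones.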

Suggested Citation

  • Noemí López-González & Santiago Andrés-Sánchez & Blanca M Rojas-Andrés & M Montserrat Martínez-Ortega, 2018. "Divide and conquer! Data-mining tools and sequential multivariate analysis to search for diagnostic morphological characters within a plant polyploid complex (Veronica subsect. Pentasepalae, Plantagin," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-27, June.
  • Handle: RePEc:plo:pone00:0199818
    DOI: 10.1371/journal.pone.0199818

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0199818
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0199818&type=printable
    Download Restriction: no