
Effects of the Training Dataset Characteristics on the Performance of Nine Species Distribution Models: Application to Diabrotica virgifera virgifera

Authors
  • Maxime Dupin
  • Philippe Reynaud
  • Vojtěch Jarošík
  • Richard Baker
  • Sarah Brunel
  • Dominic Eyre
  • Jan Pergl
  • David Makowski

Abstract

Many distribution models developed to predict the presence/absence of invasive alien species need to be fitted to a training dataset before practical use. The training dataset is characterized by the number of recorded presences/absences and by their geographical locations. The aim of this paper is to study the effect of the training dataset characteristics on model performance and to compare the relative importance of three factors influencing model predictive capability: size of the training dataset, stage of the biological invasion, and choice of input variables. Nine models were assessed for their ability to predict the distribution of the western corn rootworm, Diabrotica virgifera virgifera, a major pest of corn in North America that has recently invaded Europe. Twenty-six training datasets of various sizes (from 10 to 428 presence records), corresponding to two different stages of invasion (1955 and 1980) and three sets of input bioclimatic variables (19 variables, six variables selected using information on insect biology, and three linear combinations of the 19 variables derived from Principal Component Analysis), were considered. The models were fitted to each training dataset in turn, and their performance was assessed using independent data from North America and Europe. The models were ranked according to the area under the Receiver Operating Characteristic curve and the likelihood ratio. Model performance was highly sensitive to the geographical area used for calibration; most of the models performed poorly when fitted to a restricted area corresponding to an early stage of the invasion. Our results also showed that Principal Component Analysis was useful in reducing the number of model input variables for the models that performed poorly with 19 input variables. DOMAIN, Environmental Distance, MAXENT, and Envelope Score were the most accurate models, but all the models tested in this study led to a substantial rate of misclassification.
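
The two ranking criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy suitability scores, and the 0.5 threshold are all hypothetical, and because the abstract does not say which likelihood ratio was used, the positive likelihood ratio (sensitivity divided by the false positive rate) is assumed here.

```python
from typing import Sequence


def auc_roc(scores_presence: Sequence[float], scores_absence: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen presence site receives a higher score than a randomly chosen
    absence site (ties count as one half)."""
    if not scores_presence or not scores_absence:
        raise ValueError("need at least one presence and one absence score")
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


def positive_likelihood_ratio(
    scores_presence: Sequence[float],
    scores_absence: Sequence[float],
    threshold: float,
) -> float:
    """Assumed definition: LR+ = sensitivity / (1 - specificity), where a site
    is predicted 'present' when its score is at or above the threshold."""
    true_positives = sum(s >= threshold for s in scores_presence)
    false_positives = sum(s >= threshold for s in scores_absence)
    sensitivity = true_positives / len(scores_presence)
    false_positive_rate = false_positives / len(scores_absence)
    if false_positive_rate == 0:
        return float("inf")  # no absence site was misclassified as presence
    return sensitivity / false_positive_rate


# Hypothetical model outputs at independent evaluation sites.
presence_scores = [0.9, 0.8, 0.6, 0.4]
absence_scores = [0.7, 0.3, 0.2, 0.1]

auc = auc_roc(presence_scores, absence_scores)
lr_plus = positive_likelihood_ratio(presence_scores, absence_scores, threshold=0.5)
print(auc, lr_plus)
```

An AUC of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to a perfect ranking. The likelihood ratio depends on the chosen threshold, whereas the AUC summarizes performance over all thresholds.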

Suggested Citation

  • Maxime Dupin & Philippe Reynaud & Vojtěch Jarošík & Richard Baker & Sarah Brunel & Dominic Eyre & Jan Pergl & David Makowski, 2011. "Effects of the Training Dataset Characteristics on the Performance of Nine Species Distribution Models: Application to Diabrotica virgifera virgifera," PLOS ONE, Public Library of Science, vol. 6(6), pages 1-11, June.
  • Handle: RePEc:plo:pone00:0020957
    DOI: 10.1371/journal.pone.0020957

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0020957
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0020957&type=printable
    Download Restriction: no


    Citations


    Cited by:

    1. John M. Humphreys & Robert B. Srygley & David H. Branson, 2022. "Geographic Variation in Migratory Grasshopper Recruitment under Projected Climate Change," Geographies, MDPI, vol. 2(1), pages 1-19, January.
    2. Abdulwahab, Umarfarooq A. & Hammill, Edd & Hawkins, Charles P., 2022. "Choice of climate data affects the performance and interpretation of species distribution models," Ecological Modelling, Elsevier, vol. 471(C).
    3. Szalai, Márk & Kiss, József & Kövér, Szilvia & Toepfer, Stefan, 2014. "Simulating crop rotation strategies with a spatiotemporal lattice model to improve legislation for the management of the maize pest Diabrotica virgifera virgifera," Agricultural Systems, Elsevier, vol. 124(C), pages 39-50.
