
The effects of model and data complexity on predictions from species distributions models

Author

Listed:
  • García-Callejas, David
  • Araújo, Miguel B.

Abstract

How complex a model needs to be to provide useful predictions is a matter of continuing debate across the environmental sciences. In the species distributions modelling literature, studies have demonstrated that more complex models tend to provide better fits. However, studies have also shown that predictive performance does not always increase with complexity. Testing of species distributions models is challenging because independent data for testing are often lacking, but a more general problem is that model complexity has never been formally described in such studies. Here, we systematically examine the predictive performance of models against data and models of varying complexity. We introduce the concept of computational complexity, widely used in theoretical computer science, to quantify model complexity. In addition, the complexity of species distributional data is characterized by their geometrical properties. Tests involved analysis of the models' ability to predict virtual species distributions in the same region and at the same time as used for training the models, and to project distributions at different times under climate change. Of the eight species distribution models analyzed, five (Random Forest, boosted regression trees, generalized additive models, multivariate adaptive regression splines, MaxEnt) showed similar performance despite differences in computational complexity. The ability of models to forecast distributions under climate change was also not affected by model complexity. In contrast, geometrical characteristics of the data were related to model performance in several ways: complex datasets were consistently more difficult to model, and the complexity of the data was affected by the choice of predictors and the type of data analyzed. Given our definition of complexity, our study contradicts the widely held view that the complexity of species distributions models has significant effects on their predictive ability, while our findings support previous observations that the properties of species distributions data and their relationship with the environment are strong predictors of model success.

Suggested Citation

  • García-Callejas, David & Araújo, Miguel B., 2016. "The effects of model and data complexity on predictions from species distributions models," Ecological Modelling, Elsevier, vol. 326(C), pages 4-12.
  • Handle: RePEc:eee:ecomod:v:326:y:2016:i:c:p:4-12
    DOI: 10.1016/j.ecolmodel.2015.06.002

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0304380015002513
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ecolmodel.2015.06.002?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item



