
A Toolkit for Stability Assessment of Tree-Based Learners

Authors

  • Michel Philipp
  • Achim Zeileis
  • Carolin Strobl

Abstract

Recursive partitioning techniques are well established and frequently applied for exploring unknown structures in complex and possibly high-dimensional data sets. The methods can detect interactions and nonlinear structures in a data-driven way by recursively splitting the predictor space to form homogeneous groups of observations. However, while the resulting trees are easy to interpret, they are also known to be potentially unstable: altering the data slightly can change the variables selected for splitting, the cutpoints, or both. Moreover, the methods do not provide measures of confidence for the selected splits, so users cannot assess the uncertainty of a given fitted tree. We present a toolkit of descriptive measures and graphical illustrations, based on resampling, that can be used to assess the stability of the variable and cutpoint selection in recursive partitioning. The summary measures and graphics available in the toolkit are illustrated using a real-world data set and are implemented in the R package stablelearner.
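
The resampling idea in the abstract can be illustrated with a minimal sketch. It is written in Python with only the standard library and is not the stablelearner API or the authors' implementation. It assumes a simplified setting: only the root split of a regression tree is examined, splits are chosen by minimising squared error, and bootstrap resampling is used. The names best_split and root_split_stability, and the simulated data, are hypothetical.

```python
import random
from collections import Counter


def _sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(rows, targets):
    """Return (variable index, cutpoint) of the single split with the
    smallest summed squared error, or None if no split is possible."""
    best = None  # (sse, variable index, cutpoint)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate cutpoints: midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[j] <= cut]
            right = [y for r, y in zip(rows, targets) if r[j] > cut]
            sse = _sse(left) + _sse(right)
            if best is None or sse < best[0]:
                best = (sse, j, cut)
    return None if best is None else (best[1], best[2])


def root_split_stability(rows, targets, n_resamples=200, seed=0):
    """Refit the root split on bootstrap resamples and record how often
    each variable is selected and which cutpoints are chosen for it."""
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    cutpoints = {}
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        split = best_split([rows[i] for i in idx], [targets[i] for i in idx])
        if split is None:
            continue
        var, cut = split
        counts[var] += 1
        cutpoints.setdefault(var, []).append(cut)
    freqs = {j: counts[j] / n_resamples for j in range(len(rows[0]))}
    return freqs, cutpoints


# Simulated data (an assumption for illustration): the response jumps at
# x0 = 0.5, while x1 is pure noise.
data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random()) for _ in range(80)]
targets = [(1.0 if x0 > 0.5 else 0.0) + data_rng.gauss(0, 0.1) for x0, _ in rows]

freqs, cuts = root_split_stability(rows, targets)
print("selection frequency per variable:", freqs)
print("range of cutpoints for x0:", min(cuts[0]), max(cuts[0]))
```

A selection frequency close to 1 for one variable, together with a narrow range of cutpoints, would suggest a stable root split under these assumptions. Frequencies spread across several variables, or widely scattered cutpoints, would suggest instability.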

Suggested Citation

  • Michel Philipp & Achim Zeileis & Carolin Strobl, 2016. "A Toolkit for Stability Assessment of Tree-Based Learners," Working Papers 2016-11, Faculty of Economics and Statistics, Universität Innsbruck.
  • Handle: RePEc:inn:wpaper:2016-11

    Download full text from publisher

    File URL: https://www2.uibk.ac.at/downloads/c4041030/wpaper/2016-11.pdf
    Download Restriction: no

    More about this item

    Keywords

    stability; recursive partitioning; variable selection; cutpoint selection; decision trees;

    JEL classification:

    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C87 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Econometric Software
