Printed from https://ideas.repec.org/a/eee/jmvana/v146y2016icp72-83.html

On the asymptotics of random forests

Author

Listed:
  • Scornet, Erwan

Abstract

The last decade has witnessed a growing interest in random forest models, which are recognized to exhibit good practical performance, especially in high-dimensional settings. On the theoretical side, however, their predictive power remains largely unexplained, creating a gap between theory and practice. In this paper, we present some asymptotic results on random forests in a regression framework. First, we provide theoretical guarantees linking the finite forests used in practice (with a finite number M of trees) to their asymptotic counterparts (with M = ∞). Using empirical process theory, we prove a uniform central limit theorem for a large class of random forest estimates, which holds in particular for Breiman’s (2001) original forests. Second, we show that infinite forest consistency implies finite forest consistency, and we then state the consistency of several infinite forests. In particular, we prove that q quantile forests (close in spirit to Breiman’s (2001) forests but easier to study) are able to combine inconsistent trees to obtain a final consistent prediction, thus highlighting the benefits of random forests compared to single trees.
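
The link between a finite forest (M trees) and its infinite counterpart (M = ∞) can be illustrated with a small simulation. The sketch below is a toy illustration only and is not the construction analysed in the paper: it assumes one-dimensional data, "trees" that make a single split at a uniformly random point, and an infinite forest approximated by averaging over a fine grid of split points. All function names are hypothetical.

```python
import math
import random
import statistics


def tree_predict(x, data, split):
    """Toy randomized 'tree' (assumption): one split at `split`; predict the
    mean response of the training points on the same side of the split as x."""
    same_side = [y for (xi, y) in data if (xi <= split) == (x <= split)]
    if not same_side:
        # The cell containing x is empty: fall back to the global mean.
        return statistics.fmean(y for _, y in data)
    return statistics.fmean(same_side)


def finite_forest(x, data, M, rng):
    """Average of M trees, each built with an independent uniform split."""
    return statistics.fmean(
        tree_predict(x, data, rng.random()) for _ in range(M)
    )


def infinite_forest(x, data, grid_size=2000):
    """Approximate the expectation over the random split (the M = infinity
    forest) with a midpoint rule on a grid of split points in (0, 1)."""
    return statistics.fmean(
        tree_predict(x, data, (k + 0.5) / grid_size) for k in range(grid_size)
    )


def mean_squared_gap(x, data, M, reps, rng):
    """Monte Carlo estimate of E[(finite forest - infinite forest)^2] at x."""
    target = infinite_forest(x, data)
    return statistics.fmean(
        (finite_forest(x, data, M, rng) - target) ** 2 for _ in range(reps)
    )


# Synthetic regression data (assumption): noisy sine curve on [0, 1].
rng = random.Random(0)
xs = [rng.random() for _ in range(50)]
data = [(xi, math.sin(2 * math.pi * xi) + rng.gauss(0, 0.3)) for xi in xs]

for M in (5, 200):
    print(M, mean_squared_gap(0.3, data, M, reps=50, rng=rng))
```

Because the trees are independent given the data, the finite forest is a Monte Carlo average of the tree predictions, so in this toy setting its squared distance to the infinite forest should shrink roughly like 1/M as trees are added.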

Suggested Citation

  • Scornet, Erwan, 2016. "On the asymptotics of random forests," Journal of Multivariate Analysis, Elsevier, vol. 146(C), pages 72-83.
  • Handle: RePEc:eee:jmvana:v:146:y:2016:i:c:p:72-83
    DOI: 10.1016/j.jmva.2015.06.009

    References listed on IDEAS

    1. Ishwaran, Hemant & Kogalur, Udaya B., 2010. "Consistency of random survival forests," Statistics & Probability Letters, Elsevier, vol. 80(13-14), pages 1056-1064, July.

    Citations

    Cited by:

    1. Zhexiao Lin & Fang Han, 2022. "On regression-adjusted imputation estimators of the average treatment effect," Papers 2212.05424, arXiv.org, revised Jan 2023.
    2. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    3. Lotfi Boudabsa & Damir Filipović, 2022. "Ensemble learning for portfolio valuation and risk management," Papers 2204.05926, arXiv.org.
    4. Raval, Devesh & Rosenbaum, Ted & Wilson, Nathan E., 2021. "How do machine learning algorithms perform in predicting hospital choices? evidence from changing environments," Journal of Health Economics, Elsevier, vol. 78(C).
    5. Ning Guo & Li Xu & Wei Gao & Hongwei Xia & Min Xie & Xiaohan Ren, 2024. "Progress in the Application of Laser-Induced Breakdown Spectroscopy in Coal Quality Analysis," Energies, MDPI, vol. 17(14), pages 1-36, July.
    6. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    7. Ramosaj, Burim & Pauly, Markus, 2019. "Consistent estimation of residual variance with random forest Out-Of-Bag errors," Statistics & Probability Letters, Elsevier, vol. 151(C), pages 49-57.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Susan Athey & Julie Tibshirani & Stefan Wager, 2016. "Generalized Random Forests," Papers 1610.01271, arXiv.org, revised Apr 2018.
    2. Hoora Moradian & Denis Larocque & François Bellavance, 2017. "$L_1$ splitting rules in survival forests," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(4), pages 671-691, October.
    3. Yifei Sun & Sy Han Chiou & Mei‐Cheng Wang, 2020. "ROC‐guided survival trees and ensembles," Biometrics, The International Biometric Society, vol. 76(4), pages 1177-1189, December.
    4. Wenju Mo & Yuqin Ding & Shuai Zhao & Dehong Zou & Xiaowen Ding, 2020. "Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-18, November.
    5. Gérard Biau & Erwan Scornet, 2016. "A random forest guided tour," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(2), pages 197-227, June.
    6. Claudia Bühnemann & Simon Li & Haiyue Yu & Harriet Branford White & Karl L Schäfer & Antonio Llombart-Bosch & Isidro Machado & Piero Picci & Pancras C W Hogendoorn & Nicholas A Athanasou & J Alison No, 2014. "Quantification of the Heterogeneity of Prognostic Cellular Biomarkers in Ewing Sarcoma Using Automated Image and Random Survival Forest Analysis," PLOS ONE, Public Library of Science, vol. 9(9), pages 1-14, September.
