IDEAS home Printed from https://ideas.repec.org/a/bla/jorssb/v80y2018i5p1087-1110.html

Bayesian regression tree ensembles that adapt to smoothness and sparsity

Author

Listed:
  • Antonio R. Linero
  • Yun Yang

Abstract

Ensembles of decision trees are a useful tool for obtaining flexible estimates of regression functions. Examples of these methods include gradient‐boosted decision trees, random forests and Bayesian classification and regression trees. Two potential shortcomings of tree ensembles are their lack of smoothness and their vulnerability to the curse of dimensionality. We show that these issues can be overcome by instead considering sparsity inducing soft decision trees in which the decisions are treated as probabilistic. We implement this in the context of the Bayesian additive regression trees framework and illustrate its promising performance through testing on benchmark data sets. We provide strong theoretical support for our methodology by showing that the posterior distribution concentrates at the minimax rate (up to a logarithmic factor) for sparse functions and functions with additive structures in the high dimensional regime where the dimensionality of the covariate space is allowed to grow nearly exponentially in the sample size. Our method also adapts to the unknown smoothness and sparsity levels, and can be implemented by making minimal modifications to existing Bayesian additive regression tree algorithms.
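
The contrast the abstract draws between ordinary and soft decision trees can be illustrated with a single split. The following is a minimal sketch and not the authors' implementation: the logistic gate, the `bandwidth` parameter and all function names are assumptions made for illustration. A hard split sends a point wholly to one leaf, giving a step function, while a soft split weights the two leaf values by a probability, giving a smooth function.

```python
import numpy as np

def gate(x, cut, bandwidth):
    """Probability of taking the right branch at a soft split.

    The logistic form is an illustrative assumption; a smaller
    bandwidth makes the decision closer to a hard split.
    """
    return 1.0 / (1.0 + np.exp(-(x - cut) / bandwidth))

def soft_stump(x, cut, bandwidth, mu_left, mu_right):
    """One-split soft tree: a probability-weighted mix of the two leaf values."""
    p_right = gate(x, cut, bandwidth)
    return (1.0 - p_right) * mu_left + p_right * mu_right

def hard_stump(x, cut, mu_left, mu_right):
    """One-split ordinary tree: each point falls in exactly one leaf."""
    return np.where(x > cut, mu_right, mu_left)

x = np.linspace(0.0, 1.0, 5)  # 0.0, 0.25, 0.5, 0.75, 1.0
soft = soft_stump(x, cut=0.5, bandwidth=0.1, mu_left=-1.0, mu_right=1.0)
hard = hard_stump(x, cut=0.5, mu_left=-1.0, mu_right=1.0)
# With a very small bandwidth the soft split is nearly a hard split.
nearly_hard = soft_stump(x, cut=0.5, bandwidth=1e-3, mu_left=-1.0, mu_right=1.0)
```

In an ensemble, the prediction would be a sum of many such trees; this sketch shows only the smoothing effect of a single probabilistic decision.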

Suggested Citation

  • Antonio R. Linero & Yun Yang, 2018. "Bayesian regression tree ensembles that adapt to smoothness and sparsity," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(5), pages 1087-1110, November.
  • Handle: RePEc:bla:jorssb:v:80:y:2018:i:5:p:1087-1110
    DOI: 10.1111/rssb.12293

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssb.12293
    Download Restriction: no


    Citations

    Cited by:

    1. Oyebayo Ridwan Olaniran & Ali Rashash R. Alzahrani, 2023. "On the Oracle Properties of Bayesian Random Forest for Sparse High-Dimensional Gaussian Regression," Mathematics, MDPI, vol. 11(24), pages 1-29, December.
    2. Tsionas, Mike, 2022. "Efficiency estimation using probabilistic regression trees with an application to Chilean manufacturing industries," International Journal of Production Economics, Elsevier, vol. 249(C).
    3. Zhang, Yaojun & Ji, Lanpeng & Aivaliotis, Georgios & Taylor, Charles, 2024. "Bayesian CART models for insurance claims frequency," Insurance: Mathematics and Economics, Elsevier, vol. 114(C), pages 108-131.
    4. Falco J. Bargagli-Stoffi & Fabio Incerti & Massimo Riccaboni & Armando Rungi, 2023. "Machine Learning for Zombie Hunting: Predicting Distress from Firms' Accounts and Missing Values," Papers 2306.08165, arXiv.org.
    5. Yaojun Zhang & Lanpeng Ji & Georgios Aivaliotis & Charles Taylor, 2023. "Bayesian CART models for insurance claims frequency," Papers 2303.01923, arXiv.org, revised Dec 2023.
    6. Maia, Mateus & Murphy, Keefe & Parnell, Andrew C., 2024. "GP-BART: A novel Bayesian additive regression trees approach using Gaussian processes," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    7. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Working Papers 23-04, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Nov 2023.
    8. Jaouad Mourtada & Stéphane Gaïffas & Erwan Scornet, 2021. "AMF: Aggregated Mondrian forests for online learning," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 505-533, July.
    9. Piyali Basak & Antonio Linero & Debajyoti Sinha & Stuart Lipsitz, 2022. "Semiparametric analysis of clustered interval‐censored survival data using soft Bayesian additive regression trees (SBART)," Biometrics, The International Biometric Society, vol. 78(3), pages 880-893, September.
    10. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Papers 2311.16333, arXiv.org, revised Apr 2024.
    11. Falco J. Bargagli Stoffi & Kenneth De Beckker & Joana E. Maldonado & Kristof De Witte, 2021. "Assessing Sensitivity of Machine Learning Predictions. A Novel Toolbox with an Application to Financial Literacy," Papers 2102.04382, arXiv.org.
    12. Eoghan O'Neill, 2022. "Type I Tobit Bayesian Additive Regression Trees for Censored Outcome Regression," Papers 2211.07506, arXiv.org, revised Feb 2024.
    13. Lamprinakou, Stamatina & Barahona, Mauricio & Flaxman, Seth & Filippi, Sarah & Gandy, Axel & McCoy, Emma J., 2023. "BART-based inference for Poisson processes," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    14. Yakun Wang & Zeda Li & Scott A. Bruce, 2023. "Adaptive Bayesian sum of trees model for covariate‐dependent spectral analysis," Biometrics, The International Biometric Society, vol. 79(3), pages 1826-1839, September.
