Printed from https://ideas.repec.org/a/bla/jorssb/v83y2021i3p505-533.html

AMF: Aggregated Mondrian forests for online learning

Author

Listed:
  • Jaouad Mourtada
  • Stéphane Gaïffas
  • Erwan Scornet

Abstract

Random forest (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree‐ensemble methods comes from a combination of several characteristics: remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to feature scaling, a reasonable computational cost for training and prediction, and suitability for high‐dimensional settings. The most commonly used RF variants, however, are ‘offline’ algorithms, which require the whole dataset to be available at once. In this paper, we introduce AMF, an online RF algorithm based on Mondrian Forests. Using a variant of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this makes it possible to obtain a truly online, parameter‐free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with several strong baselines on a large number of datasets for multi‐class classification.
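
The exact aggregation over all prunings mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it works on a fixed, hand-built binary tree rather than a Mondrian tree, and the names `Node`, `weight`, `aggregate` and the learning rate `eta` are hypothetical. It also assumes a prior under which every internal node is kept as a leaf with probability 1/2, and exponential weights `exp(-eta * loss)`. Under these assumptions, a context-tree-weighting style recursion gives the same prediction as summing over every pruning explicitly, without enumerating them.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One cell of a fixed binary tree (hypothetical structure for this sketch)."""
    loss: float                      # cumulative loss of this node's own predictor
    pred: float                      # this node's own prediction for the query point
    left: Optional["Node"] = None    # both children are None at a leaf of the full tree
    right: Optional["Node"] = None


def weight(node: Node, eta: float) -> float:
    """Sum, over all prunings of the subtree rooted at `node`, of
    prior(pruning) * exp(-eta * total loss of the pruning's leaves)."""
    own = math.exp(-eta * node.loss)
    if node.left is None:            # leaf of the full tree: only one pruning
        return own
    # Either stop here (prior 1/2) or split and combine the two subtrees (prior 1/2).
    return 0.5 * own + 0.5 * weight(node.left, eta) * weight(node.right, eta)


def aggregate(root: Node, path: List[str], eta: float) -> float:
    """Exponentially weighted average of the predictions of all prunings.

    `path` lists the turns ("L" or "R") from the root to the leaf of the
    full tree that contains the query point.
    """
    nodes = [root]
    for step in path:
        nodes.append(nodes[-1].left if step == "L" else nodes[-1].right)
    pred = nodes[-1].pred            # at the deepest leaf there is nothing to mix
    for node in reversed(nodes[:-1]):
        # Posterior mass of the prunings that stop at `node`.
        stop = 0.5 * math.exp(-eta * node.loss) / weight(node, eta)
        pred = stop * node.pred + (1.0 - stop) * pred
    return pred


# Toy tree: the root splits, its left child splits again, its right child is a leaf.
tree = Node(loss=3.0, pred=0.5,
            left=Node(loss=1.0, pred=0.2,
                      left=Node(loss=0.1, pred=0.0),
                      right=Node(loss=0.4, pred=0.9)),
            right=Node(loss=1.5, pred=0.8))
print(aggregate(tree, ["L", "R"], eta=1.0))
```

For clarity this sketch recomputes `weight` from scratch at every node on the path. An online version would presumably cache the weights and refresh only those along the path of each new sample.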

Suggested Citation

  • Jaouad Mourtada & Stéphane Gaïffas & Erwan Scornet, 2021. "AMF: Aggregated Mondrian forests for online learning," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 505-533, July.
  • Handle: RePEc:bla:jorssb:v:83:y:2021:i:3:p:505-533
    DOI: 10.1111/rssb.12425

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssb.12425
    Download Restriction: no

    File URL: https://libkey.io/10.1111/rssb.12425?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Robin Genuer, 2012. "Variance reduction in purely random forests," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 24(3), pages 543-562.
    2. Taddy, Matthew A. & Gramacy, Robert B. & Polson, Nicholas G., 2011. "Dynamic Trees for Learning and Design," Journal of the American Statistical Association, American Statistical Association, vol. 106(493), pages 109-123.
    3. Antonio R. Linero & Yun Yang, 2018. "Bayesian regression tree ensembles that adapt to smoothness and sparsity," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(5), pages 1087-1110, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Oyebayo Ridwan Olaniran & Ali Rashash R. Alzahrani, 2023. "On the Oracle Properties of Bayesian Random Forest for Sparse High-Dimensional Gaussian Regression," Mathematics, MDPI, vol. 11(24), pages 1-29, December.
    2. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Working Papers 23-04, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Nov 2023.
    3. Moulin, Thibault & Perasso, Antoine & Gillet, François, 2018. "Modelling vegetation dynamics in managed grasslands: Responses to drivers depend on species richness," Ecological Modelling, Elsevier, vol. 374(C), pages 22-36.
    4. Falco J. Bargagli Stoffi & Kenneth De Beckker & Joana E. Maldonado & Kristof De Witte, 2021. "Assessing Sensitivity of Machine Learning Predictions.A Novel Toolbox with an Application to Financial Literacy," Papers 2102.04382, arXiv.org.
    5. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Papers 2311.16333, arXiv.org, revised Apr 2024.
    6. Yakun Wang & Zeda Li & Scott A. Bruce, 2023. "Adaptive Bayesian sum of trees model for covariate‐dependent spectral analysis," Biometrics, The International Biometric Society, vol. 79(3), pages 1826-1839, September.
    7. Gérard Biau & Erwan Scornet, 2016. "A random forest guided tour," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(2), pages 197-227, June.
    8. Eoghan O'Neill, 2022. "Type I Tobit Bayesian Additive Regression Trees for Censored Outcome Regression," Papers 2211.07506, arXiv.org, revised Feb 2024.
    9. Silvia Golia & Luigi Grossi & Matteo Pelagatti, 2022. "Machine Learning Models and Intra-Daily Market Information for the Prediction of Italian Electricity Prices," Forecasting, MDPI, vol. 5(1), pages 1-21, December.
    10. Javier Pórtoles & Camino González & Javier M. Moguerza, 2018. "Electricity Price Forecasting with Dynamic Trees: A Benchmark Against the Random Forest Approach," Energies, MDPI, vol. 11(6), pages 1-21, June.
    11. Yaojun Zhang & Lanpeng Ji & Georgios Aivaliotis & Charles Taylor, 2023. "Bayesian CART models for insurance claims frequency," Papers 2303.01923, arXiv.org, revised Dec 2023.
    12. Charles Audet & Michael Kokkolaras & Sébastien Le Digabel & Bastien Talgorn, 2018. "Order-based error for managing ensembles of surrogates in mesh adaptive direct search," Journal of Global Optimization, Springer, vol. 70(3), pages 645-675, March.
    13. Maia, Mateus & Murphy, Keefe & Parnell, Andrew C., 2024. "GP-BART: A novel Bayesian additive regression trees approach using Gaussian processes," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    14. Ruimeng Hu, 2019. "Deep Learning for Ranking Response Surfaces with Applications to Optimal Stopping Problems," Papers 1901.03478, arXiv.org, revised Mar 2020.
    15. Rogelio Ochoa-Barragán & Aurora del Carmen Munguía-López & José María Ponce-Ortega, 2024. "A hybrid machine learning-mathematical programming optimization approach for municipal solid waste management during the pandemic," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 26(7), pages 17653-17672, July.
    16. Wei-Yin Loh, 2014. "Fifty Years of Classification and Regression Trees," International Statistical Review, International Statistical Institute, vol. 82(3), pages 329-348, December.
    17. Sylvain Arlot & Robin Genuer, 2016. "Comments on: A random forest guided tour," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(2), pages 228-238, June.
    18. Yi Liu & Veronika Ročková & Yuexi Wang, 2021. "Variable selection with ABC Bayesian forests," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 453-481, July.
    19. Piyali Basak & Antonio Linero & Debajyoti Sinha & Stuart Lipsitz, 2022. "Semiparametric analysis of clustered interval‐censored survival data using soft Bayesian additive regression trees (SBART)," Biometrics, The International Biometric Society, vol. 78(3), pages 880-893, September.
    20. Lamprinakou, Stamatina & Barahona, Mauricio & Flaxman, Seth & Filippi, Sarah & Gandy, Axel & McCoy, Emma J., 2023. "BART-based inference for Poisson processes," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
