
Asymptotically Optimal Regression Trees

Author

Listed:

  • Erik Mohlin

Abstract

Regression trees are evaluated with respect to mean squared error (MSE), mean integrated squared error (MISE), and integrated squared error (ISE), as the size of the training sample goes to infinity. The asymptotically MSE- and MISE-minimizing (locally adaptive) regression trees are characterized. Under an optimal tree, MSE is O(n^{-2/3}). The estimator is shown to be asymptotically normally distributed. An estimator for ISE is also proposed, which may be used as a complement to cross-validation in the pruning of trees.
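
The n^{-2/3} rate can be illustrated with the textbook bias-variance heuristic for piecewise-constant fits, which is not the paper's locally adaptive construction. Assuming a smooth one-dimensional regression function on [0, 1], cells of width h give squared bias of order h^2 and variance of order 1/(nh), so choosing about n^{1/3} cells balances the two at order n^{-2/3}. The sketch below is a minimal simulation under those assumptions: an equal-width regressogram, uniform design points, Gaussian noise, and hypothetical helper names (`fit_regressogram`, `predict`, `approx_ise`, `simulate_ise`) that do not come from the paper.

```python
import math
import random


def fit_regressogram(xs, ys, n_cells):
    """Piecewise-constant fit on [0, 1]: average y within each equal-width cell."""
    sums = [0.0] * n_cells
    counts = [0] * n_cells
    for x, y in zip(xs, ys):
        k = min(int(x * n_cells), n_cells - 1)  # cell index; x = 1.0 goes to the last cell
        sums[k] += y
        counts[k] += 1
    # Empty cells fall back to the global mean (an arbitrary choice for this sketch).
    overall = sum(ys) / len(ys)
    return [s / c if c > 0 else overall for s, c in zip(sums, counts)]


def predict(cell_means, x):
    """Return the fitted value of the cell that contains x."""
    n_cells = len(cell_means)
    return cell_means[min(int(x * n_cells), n_cells - 1)]


def approx_ise(cell_means, m, grid_size=2000):
    """Midpoint-rule approximation of the integrated squared error on [0, 1]."""
    total = 0.0
    for i in range(grid_size):
        x = (i + 0.5) / grid_size
        total += (predict(cell_means, x) - m(x)) ** 2
    return total / grid_size


def simulate_ise(n, m, sigma=0.5, seed=0):
    """Draw n noisy observations of m, fit with about n^(1/3) cells, return the ISE."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [m(x) + rng.gauss(0.0, sigma) for x in xs]
    n_cells = max(1, round(n ** (1 / 3)))  # heuristic cell count, assumed for illustration
    return approx_ise(fit_regressogram(xs, ys, n_cells), m)


if __name__ == "__main__":
    def m(x):
        # Assumed true regression function for the simulation.
        return math.sin(2 * math.pi * x)

    for n in (1000, 8000, 64000):
        print(n, simulate_ise(n, m))
```

If the heuristic holds, each eightfold increase in n should cut the printed ISE by roughly a factor of four (8^{2/3} = 4), up to simulation noise. A locally adaptive tree of the kind the abstract describes would instead vary the cell width across the covariate space, which this equal-width sketch does not attempt.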

Suggested Citation

  • Mohlin, Erik, 2018. "Asymptotically Optimal Regression Trees," Working Papers 2018:12, Lund University, Department of Economics.
  • Handle: RePEc:hhs:lunewp:2018_012

    Download full text from publisher

    File URL: https://lucris.lub.lu.se/ws/portalfiles/portal/194854013/WP18_12
    File Function: Full text
    Download Restriction: no

    References listed on IDEAS

    1. Cattaneo, Matias D. & Farrell, Max H., 2013. "Optimal convergence rates, Bahadur representation, and asymptotic normality of partitioning estimators," Journal of Econometrics, Elsevier, vol. 174(2), pages 127-143.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Charlier, Isabelle & Paindaveine, Davy & Saracco, Jérôme, 2015. "Conditional quantile estimation based on optimal quantization: From theory to practice," Computational Statistics & Data Analysis, Elsevier, vol. 91(C), pages 20-39.
    2. Antoine, Bertille & Lavergne, Pascal, 2023. "Identification-robust nonparametric inference in a linear IV model," Journal of Econometrics, Elsevier, vol. 235(1), pages 1-24.
    3. Semenova, Vira, 2023. "Debiased machine learning of set-identified linear models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1725-1746.
    4. Jose Olmo, 2023. "A nonparametric predictive regression model using partitioning estimators based on Taylor expansions," Journal of Time Series Analysis, Wiley Blackwell, vol. 44(3), pages 294-318, May.
    5. Matias D. Cattaneo & Richard K. Crump & Max H. Farrell & Ernst Schaumburg, 2020. "Characteristic-Sorted Portfolios: Estimation and Inference," The Review of Economics and Statistics, MIT Press, vol. 102(3), pages 531-551, July.
    6. Diego Gentile Passaro & Fuhito Kojima & Bobak Pakzad-Hurson, 2023. "Equal Pay for Similar Work," Papers 2306.17111, arXiv.org, revised Dec 2024.
    7. Belloni, Alexandre & Chernozhukov, Victor & Chetverikov, Denis & Kato, Kengo, 2015. "Some new asymptotic theory for least squares series: Pointwise and uniform results," Journal of Econometrics, Elsevier, vol. 186(2), pages 345-366.
    8. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    9. Xiaohong Chen & Timothy M. Christensen, 2013. "Optimal uniform convergence rates for sieve nonparametric instrumental variables regression," CeMMAP working papers CWP56/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    10. Xiaohong Chen & Timothy M. Christensen, 2014. "Optimal uniform convergence rates and asymptotic normality for series estimators under weak dependence and weak conditions," CeMMAP working papers 46/14, Institute for Fiscal Studies.
    11. Belloni, Alexandre & Chernozhukov, Victor & Chetverikov, Denis & Fernández-Val, Iván, 2019. "Conditional quantile processes based on series or many regressors," Journal of Econometrics, Elsevier, vol. 213(1), pages 4-29.
    12. Xiaohong Chen & Timothy M. Christensen, 2013. "Optimal uniform convergence rates for sieve nonparametric instrumental variables regression," CeMMAP working papers 56/13, Institute for Fiscal Studies.
    13. Holland, Ashley D., 2017. "Penalized spline estimation in the partially linear model," Journal of Multivariate Analysis, Elsevier, vol. 153(C), pages 211-235.
    14. Sebastian Calonico & Matias D. Cattaneo & Max H. Farrell, 2018. "On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 767-779, April.
    15. Michael Jansson & Demian Pouzo, 2017. "Towards a General Large Sample Theory for Regularized Estimators," Papers 1712.07248, arXiv.org, revised Jul 2020.
    16. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2013. "On the asymptotic theory for least squares series: pointwise and uniform results," CeMMAP working papers 73/13, Institute for Fiscal Studies.
    17. Matias D. Cattaneo & Fang Han & Zhexiao Lin, 2023. "On Rosenbaum's Rank-based Matching Estimator," Papers 2312.07683, arXiv.org, revised Jan 2024.
    18. Bertille Antoine & Pascal Lavergne, 2020. "Identification-Robust Nonparametric Inference in a Linear IV Model," Discussion Papers dp20-03, Department of Economics, Simon Fraser University.
    19. Fang Han, 2024. "An Introduction to Permutation Processes (version 0.5)," Papers 2407.09664, arXiv.org.
    20. Chen, Xiaohong & Christensen, Timothy M., 2015. "Optimal uniform convergence rates and asymptotic normality for series estimators under weak dependence and weak conditions," Journal of Econometrics, Elsevier, vol. 188(2), pages 447-465.

    More about this item

    Keywords

    Piece-Wise Linear Regression; Partitioning Estimators; Non-Parametric Regression; Categorization; Partition; Prediction Trees; Decision Trees; Regression Trees; Regressogram; Mean Squared Error;

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis

