Ordered Correlation Forest

Author

Listed:

Riccardo Di Francesco
(DEF, University of Rome "Tor Vergata")

Abstract

Empirical studies in various social sciences often involve categorical outcomes with inherent ordering, such as self-evaluations of subjective well-being and self-assessments in health domains. While ordered choice models, such as the ordered logit and ordered probit, are popular tools for analyzing these outcomes, they may impose restrictive parametric and distributional assumptions. This paper introduces a novel estimator, the ordered correlation forest, that can naturally handle non-linearities in the data and does not assume a specific error term distribution. The proposed estimator modifies a standard random forest splitting criterion to build a collection of forests, each estimating the conditional probability of a single class. Under an “honesty” condition, predictions are consistent and asymptotically normal. The weights induced by each forest are used to obtain standard errors for the predicted probabilities and the covariates’ marginal effects. Evidence from synthetic data shows that the proposed estimator achieves better prediction performance than alternative forest-based estimators and can construct valid confidence intervals for the covariates’ marginal effects.
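
The role of the forest weights mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each class-specific forest yields, for a given query point, a vector of non-negative weights over the training observations, so that the predicted probability of class m is the weighted share of training observations whose outcome equals m. The function name, the toy weights, and the final rescaling that makes the probabilities sum to one are assumptions made for this illustration only; the paper's modified splitting criterion and its honesty scheme are not reproduced here.

```python
def predict_class_probabilities(weights_by_class, y_train, n_classes):
    """Turn per-class forest weights into class probabilities at one query point.

    weights_by_class[m - 1][i] is the (hypothetical) weight that the forest
    for class m assigns to training observation i. Classes are coded 1..n_classes.
    """
    raw = []
    for m in range(1, n_classes + 1):
        weights = weights_by_class[m - 1]
        # Weighted average of the indicator 1(Y_i = m).
        raw.append(sum(w for w, y in zip(weights, y_train) if y == m))

    total = sum(raw)
    if total == 0:
        raise ValueError("all raw probabilities are zero; cannot normalize")
    # Illustrative assumption: rescale so the probabilities sum to one.
    return [p / total for p in raw]


# Toy data: five training outcomes on a three-point ordered scale.
y_train = [1, 1, 2, 3, 3]

# Made-up weights, one row per class-specific forest, each row summing to one.
weights_by_class = [
    [0.4, 0.3, 0.1, 0.1, 0.1],  # forest for class 1
    [0.2, 0.2, 0.4, 0.1, 0.1],  # forest for class 2
    [0.1, 0.1, 0.2, 0.3, 0.3],  # forest for class 3
]

probs = predict_class_probabilities(weights_by_class, y_train, n_classes=3)
print([round(p, 3) for p in probs])
```

Because each prediction is a weighted average of outcomes, the same weights can in principle be reused to estimate its variance, which is the idea behind the standard errors described in the abstract.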

Suggested Citation

Riccardo Di Francesco, 2024. "Ordered Correlation Forest," CEIS Research Paper 577, Tor Vergata University, CEIS, revised 06 May 2024.

Handle: RePEc:rtv:ceisrp:577

Other versions of this item:

Riccardo Di Francesco, 2023. "Ordered Correlation Forest," Papers 2309.08755, arXiv.org.

More about this item

Keywords

Ordered non-numeric outcomes; choice probabilities; machine learning

JEL classification:

C14 - Mathematical and Quantitative Methods > Econometric and Statistical Methods and Methodology: General > Semiparametric and Nonparametric Methods: General
C25 - Mathematical and Quantitative Methods > Single Equation Models; Single Variables > Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions; Probabilities
C55 - Mathematical and Quantitative Methods > Econometric Modeling > Large Data Sets: Modeling and Analysis

NEP fields

This paper has been announced in the following NEP Reports:

NEP-BIG-2024-06-10 (Big Data)
NEP-DCM-2024-06-10 (Discrete Choice Models)
