Printed from https://ideas.repec.org/a/eee/csdana/v88y2015icp53-74.html

Unbiased regression trees for longitudinal and clustered data

Author

Listed:
  • Fu, Wei
  • Simonoff, Jeffrey S.

Abstract

A new version of the RE–EM regression tree method for longitudinal and clustered data is presented. The RE–EM tree combines the structure of mixed effects models for longitudinal and clustered data with the flexibility of tree-based estimation methods. It is less sensitive to parametric assumptions and provides improved predictive power compared to linear models with random effects and regression trees without random effects. The previously suggested methodology used the CART tree algorithm for tree building, and therefore inherits the tendency of CART to split on variables with more possible split points at the expense of those with fewer split points. A revised version of the RE–EM regression tree corrects for this bias by using the conditional inference tree as the underlying tree algorithm instead of CART. Simulation studies show that the new version is indeed unbiased, and that it improves on the original RE–EM regression tree in prediction accuracy and in the ability to recover the correct tree structure.
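
The split-selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not implement RE–EM trees or conditional inference trees. It assumes a simplified CART-style criterion (an exhaustive search for the split that minimizes the within-node sum of squared errors) and a pure-noise response, so neither predictor is truly informative. The function names `best_split_sse` and `selection_counts`, the sample sizes, and the seed are illustrative choices.

```python
import random


def best_split_sse(xs, ys):
    """Smallest total within-node SSE over all binary splits of the form x <= c."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_sum = sum(y for _, y in pairs)
    total_sq = sum(y * y for _, y in pairs)
    best = float("inf")
    left_sum = left_sq = 0.0
    for i in range(n - 1):
        x, y = pairs[i]
        left_sum += y
        left_sq += y * y
        if pairs[i + 1][0] == x:
            continue  # no cut point exists between two equal x values
        n_left = i + 1
        n_right = n - n_left
        right_sum = total_sum - left_sum
        right_sq = total_sq - left_sq
        # SSE of a node = sum of squares minus (sum squared / node size)
        sse = (left_sq - left_sum ** 2 / n_left) + (right_sq - right_sum ** 2 / n_right)
        best = min(best, sse)
    return best


def selection_counts(n_obs=100, n_reps=500, seed=0):
    """Count how often each uninformative predictor wins the exhaustive split search."""
    rng = random.Random(seed)
    counts = {"binary": 0, "continuous": 0}
    for _ in range(n_reps):
        y = [rng.gauss(0, 1) for _ in range(n_obs)]        # response is pure noise
        x_bin = [rng.randint(0, 1) for _ in range(n_obs)]  # at most one split point
        x_cont = [rng.random() for _ in range(n_obs)]      # up to n_obs - 1 split points
        winner = "binary" if best_split_sse(x_bin, y) <= best_split_sse(x_cont, y) else "continuous"
        counts[winner] += 1
    return counts


if __name__ == "__main__":
    print(selection_counts())
```

Because both predictors are unrelated to the response, an unbiased procedure would select each about equally often. Under the exhaustive search, the continuous predictor should win far more often simply because it offers more candidate cut points. This is the behavior the revised RE–EM tree is designed to avoid.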

Suggested Citation

  • Fu, Wei & Simonoff, Jeffrey S., 2015. "Unbiased regression trees for longitudinal and clustered data," Computational Statistics & Data Analysis, Elsevier, vol. 88(C), pages 53-74.
  • Handle: RePEc:eee:csdana:v:88:y:2015:i:c:p:53-74
    DOI: 10.1016/j.csda.2015.02.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947315000432
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2015.02.004?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Hajjem, Ahlem & Bellavance, François & Larocque, Denis, 2011. "Mixed effects regression trees for clustered data," Statistics & Probability Letters, Elsevier, vol. 81(4), pages 451-459, April.
    2. Torsten Hothorn & Achim Zeileis, 2014. "partykit: A Modular Toolkit for Recursive Partytioning in R," Working Papers 2014-10, Faculty of Economics and Statistics, Universität Innsbruck.
    3. Dee, Thomas S. & Sela, Rebecca J., 2003. "The fatality effects of highway speed limits by gender and age," Economics Letters, Elsevier, vol. 79(3), pages 401-408, June.

    Citations


    Cited by:

    1. Steffen Nestler & Sarah Humberg, 2022. "A Lasso and a Regression Tree Mixed-Effect Model with Random Effects for the Level, the Residual Variance, and the Autocorrelation," Psychometrika, Springer;The Psychometric Society, vol. 87(2), pages 506-532, June.
    2. Kim, Seheon & Rasouli, Soora & Timmermans, Harry & Yang, Dujuan, 2018. "Estimating panel effects in probabilistic representations of dynamic decision trees using bayesian generalized linear mixture models," Transportation Research Part B: Methodological, Elsevier, vol. 111(C), pages 168-184.
    3. Tsionas, Mike, 2022. "Efficiency estimation using probabilistic regression trees with an application to Chilean manufacturing industries," International Journal of Production Economics, Elsevier, vol. 249(C).
    4. Thomas Bassetti & Raul Caruso & Friedrich Schneider, 2018. "The tree of political violence: a GMERT analysis," Empirical Economics, Springer, vol. 54(2), pages 839-850, March.
    5. Roberta Siciliano & Antonio D’Ambrosio & Massimo Aria & Sonia Amodio, 2017. "Analysis of Web Visit Histories, Part II: Predicting Navigation by Nested STUMP Regression Trees," Journal of Classification, Springer;The Classification Society, vol. 34(3), pages 473-493, October.
    6. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.
    7. Seung Yeoun Choi & Sean Hay Kim, 2022. "Selection of a Transparent Meta-Model Algorithm for Feasibility Analysis Stage of Energy Efficient Building Design: Clustering vs. Tree," Energies, MDPI, vol. 15(18), pages 1-25, September.
    8. Karolis Matikonis & Matthew Gobey, 2024. "Small Business Property Tax Reductions and Firm Productivity," Small Business Economics, Springer, vol. 62(1), pages 307-324, January.
    9. Raval, Devesh & Rosenbaum, Ted & Wilson, Nathan E., 2021. "How do machine learning algorithms perform in predicting hospital choices? evidence from changing environments," Journal of Health Economics, Elsevier, vol. 78(C).
    10. Anna Gottard & Giulia Vannucci & Leonardo Grilli & Carla Rampichini, 2023. "Mixed-effect models with trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 431-461, June.
    11. Manhal Ali & Reza Salehnejad & Mohaimen Mansur, 2018. "Hospital heterogeneity: what drives the quality of health care," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 19(3), pages 385-408, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    2. Zelenkov, Yu. & Solntsev, I., 2022. "Predicting the value of professional sport clubs. A study of European soccer, 2005-2018," Journal of the New Economic Association, New Economic Association, vol. 56(4), pages 28-46.
    3. Grubinger, Thomas & Zeileis, Achim & Pfeiffer, Karl-Peter, 2014. "evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 61(i01).
    4. Schivinski, Bruno, 2021. "Eliciting brand-related social media engagement: A conditional inference tree framework," Journal of Business Research, Elsevier, vol. 130(C), pages 594-602.
    5. Castillo-Manzano, José I. & Castro-Nuño, Mercedes & Pedregal-Tercero, Diego J., 2014. "Temporary speed limit changes: An econometric estimation of the effects of the Spanish Energy Efficiency and Saving Plan," Economic Modelling, Elsevier, vol. 44(S1), pages 68-76.
    6. Daniel Albalate, 2008. "Lowering blood alcohol content levels to save lives: The European experience," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 27(1), pages 20-39.
    7. Gustavsson, Magnus & Osterholm, Par, 2006. "The informational value of unemployment statistics: A note on the time series properties of participation rates," Economics Letters, Elsevier, vol. 92(3), pages 428-433, September.
    8. Messner, Wolfgang, 2024. "Exploring multilevel data with deep learning and XAI: The effect of personal-care advertising spending on subjective happiness," International Business Review, Elsevier, vol. 33(1).
    9. D. Mark Anderson & Benjamin Hansen & Daniel I. Rees, 2013. "Medical Marijuana Laws, Traffic Fatalities, and Alcohol Consumption," Journal of Law and Economics, University of Chicago Press, vol. 56(2), pages 333-369.
    10. Tomasz Melcer & Monika E Danielewska & D Robert Iskander, 2015. "Wavelet Representation of the Corneal Pulse for Detecting Ocular Dicrotism," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-13, April.
    11. Bürgin, Reto & Ritschard, Gilbert, 2015. "Tree-based varying coefficient regression for longitudinal ordinal responses," Computational Statistics & Data Analysis, Elsevier, vol. 86(C), pages 65-80.
    12. Thomas Rusch & Achim Zeileis, 2014. "Discussion," International Statistical Review, International Statistical Institute, vol. 82(3), pages 361-367, December.
    13. Martin Wagner & Achim Zeileis, 2019. "Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach," German Economic Review, Verein für Socialpolitik, vol. 20(1), pages 67-82, February.
    14. Mercedes Castro-Nuño & José I. Castillo-Manzano & Xavier Fageda, 2015. "Do more trucks lead to more motor vehicle fatalities in European roads? Evaluating the impact of specific safety strategies," ERSA conference papers ersa15p306, European Regional Science Association.
    15. Tsionas, Mike, 2022. "Efficiency estimation using probabilistic regression trees with an application to Chilean manufacturing industries," International Journal of Production Economics, Elsevier, vol. 249(C).
    16. Steffen Nestler & Sarah Humberg, 2022. "A Lasso and a Regression Tree Mixed-Effect Model with Random Effects for the Level, the Residual Variance, and the Autocorrelation," Psychometrika, Springer;The Psychometric Society, vol. 87(2), pages 506-532, June.
    17. Anderson, D. Mark & Rees, Daniel I., 2015. "Per se drugged driving laws and traffic fatalities," International Review of Law and Economics, Elsevier, vol. 42(C), pages 122-134.
    18. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.
    19. Jiang, Cuiqing & Wang, Zhao & Zhao, Huimin, 2019. "A prediction-driven mixture cure model and its application in credit scoring," European Journal of Operational Research, Elsevier, vol. 277(1), pages 20-31.
    20. Tsubasa Ito & Shonosuke Sugasawa, 2023. "Grouped generalized estimating equations for longitudinal data analysis," Biometrics, The International Biometric Society, vol. 79(3), pages 1868-1879, September.
