Mixed-effect models with trees

Authors

  • Anna Gottard (Florence Center for Data Science, University of Florence)
  • Giulia Vannucci (Florence Center for Data Science, University of Florence)
  • Leonardo Grilli (Florence Center for Data Science, University of Florence)
  • Carla Rampichini (Florence Center for Data Science, University of Florence)

Abstract

Tree-based regression models are a class of statistical models for predicting continuous response variables when the shape of the regression function is unknown. They naturally take into account both non-linearities and interactions. However, they struggle with linear and quasi-linear effects and assume independent and identically distributed (i.i.d.) data. This article proposes two new algorithms for jointly estimating an interpretable predictive mixed-effect model with two components: a linear part, capturing the main effects, and a non-parametric component consisting of three trees that capture non-linearities and interactions among individual-level predictors, among cluster-level predictors, or across the two levels. The first proposed algorithm focuses on prediction. The second extends the first with a post-selection inference strategy to provide valid inference. The performance of the two algorithms is validated via Monte Carlo studies. An application to INVALSI data illustrates the potential of the proposed approach.
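
One way to read the model structure described in the abstract is sketched below. The notation is assumed here for illustration and is not taken from the paper: $y_{ij}$ is the response of individual $i$ in cluster $j$, $\mathbf{x}_{ij}$ and $\mathbf{w}_j$ are the individual-level and cluster-level predictors, and $T_1$, $T_2$, $T_3$ stand for the three trees. The random-intercept form and the normality of $u_j$ and $\varepsilon_{ij}$ are also assumptions, since the abstract only states that the model is a mixed-effect model.

```latex
% Hypothetical notation: a linear part plus three trees plus a cluster random effect.
\[
  y_{ij}
  = \underbrace{\beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
      + \mathbf{w}_{j}^{\top}\boldsymbol{\gamma}}_{\text{linear part (main effects)}}
  + \underbrace{T_1(\mathbf{x}_{ij})}_{\text{individual-level tree}}
  + \underbrace{T_2(\mathbf{w}_{j})}_{\text{cluster-level tree}}
  + \underbrace{T_3(\mathbf{x}_{ij}, \mathbf{w}_{j})}_{\text{cross-level tree}}
  + u_j + \varepsilon_{ij},
\]
\[
  u_j \sim N(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2).
\]
```

Under this reading, the linear coefficients carry the interpretable main effects, while each tree is a piecewise-constant function that absorbs what the linear part cannot represent at its own level.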

Suggested Citation

  • Anna Gottard & Giulia Vannucci & Leonardo Grilli & Carla Rampichini, 2023. "Mixed-effect models with trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 431-461, June.
  • Handle: RePEc:spr:advdac:v:17:y:2023:i:2:d:10.1007_s11634-022-00509-3
    DOI: 10.1007/s11634-022-00509-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-022-00509-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Neimanns, Erik & Blossey, Nils, 2022. "From media-party linkages to ownership concentration causes of cross-national variation in media outlets' economic positioning," MPIfG Discussion Paper 22/8, Max Planck Institute for the Study of Societies.
    2. Fernando Bruna, 2022. "Happy Cultures? A Multilevel Model of Well-Being with Individual and Contextual Human Values," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 164(1), pages 55-77, November.
    3. Benítez-Peña, Sandra & Carrizosa, Emilio & Guerrero, Vanesa & Jiménez-Gamero, M. Dolores & Martín-Barragán, Belén & Molero-Río, Cristina & Ramírez-Cobo, Pepa & Romero Morales, Dolores & Sillero-Denami, 2021. "On sparse ensemble methods: An application to short-term predictions of the evolution of COVID-19," European Journal of Operational Research, Elsevier, vol. 295(2), pages 648-663.
    4. Manski, Charles F., 2023. "Probabilistic prediction for binary treatment choice: With focus on personalized medicine," Journal of Econometrics, Elsevier, vol. 234(2), pages 647-663.
    5. Heisig, Jan Paul & Matthewes, Sönke Hendrik, 2022. "No Evidence that Strict Educational Tracking Improves Student Performance through Classroom Homogeneity: A Critical Reanalysis of Esser and Seuring (2020) [Keine Belege für leistungsfördernde Effek," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 51(1), pages 99-111.
    6. Weishampel, Anthony & Staicu, Ana-Maria & Rand, William, 2023. "Classification of social media users with generalized functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    7. Nelson P. Rayl & Nitish R. Sinha, 2022. "Integrating Prediction and Attribution to Classify News," Finance and Economics Discussion Series 2022-042, Board of Governors of the Federal Reserve System (U.S.).
    8. Steffen Nestler & Sarah Humberg, 2022. "A Lasso and a Regression Tree Mixed-Effect Model with Random Effects for the Level, the Residual Variance, and the Autocorrelation," Psychometrika, Springer;The Psychometric Society, vol. 87(2), pages 506-532, June.
    9. Bas Bosma & Arjen Witteloostuijn, 2024. "Machine learning in international business," Journal of International Business Studies, Palgrave Macmillan;Academy of International Business, vol. 55(6), pages 676-702, August.
    10. Denis A Shah & Erick D De Wolf & Pierce A Paul & Laurence V Madden, 2021. "Accuracy in the prediction of disease epidemics when ensembling simple but highly correlated models," PLOS Computational Biology, Public Library of Science, vol. 17(3), pages 1-23, March.
    11. Paolo Libenzio Brignoli & Alessandro Varacca & Cornelis Gardebroek & Paolo Sckokai, 2024. "Machine learning to predict grains futures prices," Agricultural Economics, International Association of Agricultural Economists, vol. 55(3), pages 479-497, May.
    12. Fernando Bruna & Juan Fernández‐Sastre, 2021. "Regional characteristics and the decision to innovate in a developing country: A multilevel analysis of Ecuadorian firms," Papers in Regional Science, Wiley Blackwell, vol. 100(6), pages 1337-1354, December.
    13. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.
    14. M. Merz & R. Richman & T. Tsanakas & M. V. Wuthrich, 2021. "Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles," Papers 2103.11706, arXiv.org.
    15. Rich, Jeppe & Myhrmann, Marcus Skyum & Mabit, Stefan Eriksen, 2023. "Our children cycle less - A Danish pseudo-panel analysis," Journal of Transport Geography, Elsevier, vol. 106(C).
    16. Chun Chieh Fan & Robert Loughnan & Carolina Makowski & Diliana Pecheva & Chi-Hua Chen & Donald J. Hagler & Wesley K. Thompson & Nadine Parker & Dennis van der Meer & Oleksandr Frei & Ole A. Andreassen, 2022. "Multivariate genome-wide association study on tissue-sensitive diffusion metrics highlights pathways that shape the human brain," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    17. Jack Jewson & David Rossell, 2022. "General Bayesian loss function selection and the use of improper models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1640-1665, November.
    18. COJOCARIU Irina-Cristina, 2023. "Analysis Of Sports Performances Using Machine Learning And Statistical Models - A General Analysis Of The Literature," Revista Economica, Lucian Blaga University of Sibiu, Faculty of Economic Sciences, vol. 75(2), pages 34-39, June.
    19. Ord, J. Keith, 2022. "The uncertainty track: Machine learning, statistical modeling, synthesis," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1526-1530.
    20. Nicholas Charron & Victor Lapuente & Andres Rodriguez-Pose, 2022. "Uncooperative Society, Uncooperative Politics or Both? Trust, Polarisation, Populism and COVID-19 Deaths across European regions," Papers in Evolutionary Economic Geography (PEEG) 2204, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Jan 2022.
