
Consistent Estimation of Finite Mixtures: An Application to Latent Group Panel Structures

Author

Listed:
  • Langevin, R.

Abstract

Finite mixtures are often used in econometric analyses to account for unobserved heterogeneity. This paper shows that maximizing the likelihood of a finite mixture of parametric densities leads to inconsistent estimates under weak regularity conditions. The size of the asymptotic bias is positively correlated with the degree of overlap between the densities within the mixture. In contrast, I show that maximizing the max-component likelihood function equipped with a consistent classifier leads to consistency in both estimation and classification as the number of covariates goes to infinity while leaving group membership completely unrestricted. Extending the proposed estimator to a fully nonparametric estimation setting is straightforward. The inconsistency of standard maximum likelihood estimation (MLE) procedures is confirmed via simulations. Simulation results show that the proposed algorithm generally outperforms standard MLE procedures in finite samples when all observations are correctly classified. In an application using latent group panel structures and health administrative data, estimation results show that the proposed strategy leads to a reduction in out-of-sample prediction error of around 17.6% compared to the best results obtained from standard MLE procedures.
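
To make the contrast in the abstract concrete, the sketch below compares the two objectives on a toy problem. It is not the paper's estimator: it assumes univariate Gaussian components with no covariates and no panel dimension, and it takes the max-component objective to be the sum over observations of the largest component log-density, which is one plausible reading of the abstract. The alternation in `cem` is a generic hard-classification loop in the spirit of the CEM algorithm named in the keywords. All function names and the `min_var` floor are hypothetical choices made for illustration.

```python
import math
import random


def normal_logpdf(x, mean, var):
    """Log-density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mixture_loglik(xs, weights, means, variances):
    """Standard mixture objective: sum_i log sum_k w_k f_k(x_i)."""
    total = 0.0
    for x in xs:
        terms = [math.log(w) + normal_logpdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        top = max(terms)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def max_component_loglik(xs, means, variances):
    """Assumed max-component objective: sum_i max_k log f_k(x_i)."""
    return sum(max(normal_logpdf(x, m, v) for m, v in zip(means, variances))
               for x in xs)


def cem(xs, means, variances, n_iter=50, min_var=1e-3):
    """Alternate hard classification and per-group refitting."""
    means, variances = list(means), list(variances)
    labels = []
    for _ in range(n_iter):
        # Classification step: assign each point to its best-fitting component.
        new_labels = [
            max(range(len(means)),
                key=lambda k: normal_logpdf(x, means[k], variances[k]))
            for x in xs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the parameters are too
        labels = new_labels
        # Maximization step: refit each component on its own members only.
        for k in range(len(means)):
            members = [x for x, g in zip(xs, labels) if g == k]
            if len(members) < 2:
                continue  # keep previous parameters for a (near-)empty group
            means[k] = sum(members) / len(members)
            spread = sum((x - means[k]) ** 2 for x in members) / len(members)
            variances[k] = max(spread, min_var)  # floor avoids degenerate fits
    return means, variances, labels


if __name__ == "__main__":
    random.seed(0)
    # Two well-separated groups, so hard classification is nearly error-free.
    data = ([random.gauss(-3.0, 1.0) for _ in range(200)]
            + [random.gauss(3.0, 1.0) for _ in range(200)])
    m, v, g = cem(data, means=[-1.0, 1.0], variances=[1.0, 1.0])
    print("estimated means:", m)
    print("max-component log-likelihood:", max_component_loglik(data, m, v))
    print("mixture log-likelihood:", mixture_loglik(data, [0.5, 0.5], m, v))
```

Because a weighted average of component densities can never exceed the largest one, the max-component value is always at least as large as the mixture value at the same parameters. The toy data here are deliberately well separated; the abstract's claims about bias under overlapping densities are not reproduced by this sketch.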

Suggested Citation

  • Langevin, R., 2024. "Consistent Estimation of Finite Mixtures: An Application to Latent Group Panel Structures," Health, Econometrics and Data Group (HEDG) Working Papers 24/16, HEDG, c/o Department of Economics, University of York.
  • Handle: RePEc:yor:hectdg:24/16

    Download full text from publisher

    File URL: https://www.york.ac.uk/media/economics/documents/hedg/workingpapers/2024/2416.pdf
    File Function: Main text
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Yiren & Phillips, Peter C.B. & Su, Liangjun, 2024. "Panel data models with time-varying latent group structures," Journal of Econometrics, Elsevier, vol. 240(1).
    2. Yanbo Liu & Peter C. B. Phillips & Jun Yu, 2023. "A Panel Clustering Approach To Analyzing Bubble Behavior," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 64(4), pages 1347-1395, November.
    3. Lumsdaine, Robin L. & Okui, Ryo & Wang, Wendun, 2023. "Estimation of panel group structure models with structural breaks in group memberships and coefficients," Journal of Econometrics, Elsevier, vol. 233(1), pages 45-65.
    4. Yu, Lu & Gu, Jiaying & Volgushev, Stanislav, 2024. "Spectral clustering with variance information for group structure estimation in panel data," Journal of Econometrics, Elsevier, vol. 241(1).
    5. Leng, Xuan & Chen, Heng & Wang, Wendun, 2023. "Multi-dimensional latent group structures with heterogeneous distributions," Journal of Econometrics, Elsevier, vol. 233(1), pages 1-21.
    6. Mehrabani, Ali, 2023. "Estimation and identification of latent group structures in panel data," Journal of Econometrics, Elsevier, vol. 235(2), pages 1464-1482.
    7. Boyuan Zhang, 2022. "Incorporating Prior Knowledge of Latent Group Structure in Panel Data Models," Papers 2211.16714, arXiv.org, revised Oct 2023.
    8. Vasilis Sarafidis & Tom Wansbeek, 2020. "Celebrating 40 Years of Panel Data Analysis: Past, Present and Future," Monash Econometrics and Business Statistics Working Papers 6/20, Monash University, Department of Econometrics and Business Statistics.
    9. Wang, Wuyi & Su, Liangjun, 2021. "Identifying latent group structures in nonlinear panels," Journal of Econometrics, Elsevier, vol. 220(2), pages 272-295.
    10. Thibaut Lamadon & Elena Manresa & Stephane Bonhomme, 2016. "Discretizing Unobserved Heterogeneity," 2016 Meeting Papers 1536, Society for Economic Dynamics.
    11. Claudia Pigini & Alessandro Pionati & Francesco Valentini, 2023. "Specification testing with grouped fixed effects," Papers 2310.01950, arXiv.org.
    12. Yiren Wang & Liangjun Su & Yichong Zhang, 2022. "Low-rank Panel Quantile Regression: Estimation and Inference," Papers 2210.11062, arXiv.org.
    13. Su, Liangjun & Wang, Wuyi & Xu, Xingbai, 2023. "Identifying latent group structures in spatial dynamic panels," Journal of Econometrics, Elsevier, vol. 235(2), pages 1955-1980.
    14. Xiaorong Yang & Jia Chen & Degui Li & Runze Li, 2024. "Functional-Coefficient Quantile Regression for Panel Data with Latent Group Structure," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 42(3), pages 1026-1040, July.
    15. Miao, Ke & Su, Liangjun & Wang, Wendun, 2020. "Panel threshold regressions with latent group structures," Journal of Econometrics, Elsevier, vol. 214(2), pages 451-481.
    16. Okui, Ryo & Wang, Wendun, 2021. "Heterogeneous structural breaks in panel data models," Journal of Econometrics, Elsevier, vol. 220(2), pages 447-473.
    17. Stéphane Bonhomme & Thibaut Lamadon & Elena Manresa, 2022. "Discretizing Unobserved Heterogeneity," Econometrica, Econometric Society, vol. 90(2), pages 625-643, March.
    18. Ando, Tomohiro & Bai, Jushan, 2021. "Large-scale generalized linear longitudinal data models with grouped patterns of unobserved heterogeneity," MPRA Paper 111431, University Library of Munich, Germany.
    19. Huang, Wenxin & Jin, Sainan & Phillips, Peter C.B. & Su, Liangjun, 2021. "Nonstationary panel models with latent group structures and cross-section dependence," Journal of Econometrics, Elsevier, vol. 221(1), pages 198-222.
    20. Bartolucci, Francesco & Belotti, Federico & Peracchi, Franco, 2015. "Testing for time-invariant unobserved heterogeneity in generalized linear models for panel data," Journal of Econometrics, Elsevier, vol. 184(1), pages 111-123.

    More about this item

    Keywords

    panel data; Finite mixtures; EM algorithm; CEM algorithm; K-means; healthcare expenditures; unobserved heterogeneity;

    JEL classification:

    • C14 - Mathematical and Quantitative Methods; Econometric and Statistical Methods and Methodology: General; Semiparametric and Nonparametric Methods: General
    • C23 - Mathematical and Quantitative Methods; Single Equation Models, Single Variables; Models with Panel Data, Spatio-temporal Models
    • C51 - Mathematical and Quantitative Methods; Econometric Modeling; Model Construction and Estimation
    • I10 - Health, Education, and Welfare; Health; General
