Finite mixture modeling of censored data using the multivariate Student-t distribution

Author

Listed:
  • Lachos, Víctor H.
  • Moreno, Edgar J. López
  • Chen, Kun
  • Cabral, Celso Rômulo Barbosa

Abstract

Finite mixture models have been widely used for the modeling and analysis of data from a heterogeneous population. Moreover, data of this kind can be subject to some upper and/or lower detection limits because of the restriction of experimental apparatus. Another complication arises when measures of each population depart significantly from normality, for instance, in the presence of heavy tails or atypical observations. For such data structures, we propose a robust model for censored data based on finite mixtures of multivariate Student-t distributions. This approach allows us to model data with great flexibility, accommodating multimodality, heavy tails and also skewness depending on the structure of the mixture components. We develop an analytically simple, yet efficient, EM-type algorithm for conducting maximum likelihood estimation of the parameters. The algorithm has closed-form expressions at the E-step that rely on formulas for the mean and variance of the multivariate truncated Student-t distributions. Further, a general information-based method for approximating the asymptotic covariance matrix of the estimators is also presented. Results obtained from the analysis of both simulated and real datasets are reported to demonstrate the effectiveness of the proposed methodology. The proposed algorithm and methods are implemented in the new R package CensMixReg.
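
To make the censoring mechanism concrete, the following is a minimal sketch of a censored mixture log-likelihood. It is a univariate simplification with left-censoring only and is not the authors' multivariate EM algorithm or the CensMixReg interface. The function name, argument names and the example values are all hypothetical. Each observed value contributes the mixture density, while each censored value contributes the mixture probability of falling below its detection limit.

```python
import numpy as np
from scipy.stats import t as student_t


def censored_t_mixture_loglik(y, censored, weights, locs, scales, dfs):
    """Log-likelihood of a univariate Student-t mixture under left-censoring.

    y        : values; for censored entries this holds the detection limit.
    censored : booleans, True where the true value lies below the limit.
    weights, locs, scales, dfs : per-component mixing weight, location,
        scale and degrees of freedom (illustrative parameterization).
    """
    y = np.asarray(y, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    contrib = np.zeros_like(y)
    for w, mu, s, nu in zip(weights, locs, scales, dfs):
        comp = student_t(df=nu, loc=mu, scale=s)
        # Observed points use the density; censored points use the
        # probability mass below the detection limit (the cdf).
        contrib += w * np.where(censored, comp.cdf(y), comp.pdf(y))
    return float(np.sum(np.log(contrib)))


# Hypothetical data: two values fell below a detection limit of -1.0.
y = [0.3, 2.1, -1.0, -1.0, 3.4]
censored = [False, False, True, True, False]
ll = censored_t_mixture_loglik(
    y, censored,
    weights=[0.5, 0.5], locs=[0.0, 3.0], scales=[1.0, 1.0], dfs=[4.0, 4.0],
)
print(ll)
```

An EM-type scheme such as the one described in the abstract would maximize a likelihood of this kind iteratively; the sketch only evaluates it for fixed parameters.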

Suggested Citation

  • Lachos, Víctor H. & Moreno, Edgar J. López & Chen, Kun & Cabral, Celso Rômulo Barbosa, 2017. "Finite mixture modeling of censored data using the multivariate Student-t distribution," Journal of Multivariate Analysis, Elsevier, vol. 159(C), pages 151-167.
  • Handle: RePEc:eee:jmvana:v:159:y:2017:i:c:p:151-167
    DOI: 10.1016/j.jmva.2017.05.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X1730310X
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations

    Cited by:

    1. Camila Borelli Zeller & Celso Rômulo Barbosa Cabral & Víctor Hugo Lachos & Luis Benites, 2019. "Finite mixture of regression models for censored data based on scale mixtures of normal distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 89-116, March.
    2. Azzalini, Adelchi, 2022. "An overview on the progeny of the skew-normal family— A personal perspective," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    3. Christian E. Galarza & Tsung-I Lin & Wan-Lun Wang & Víctor H. Lachos, 2021. "On moments of folded and truncated multivariate Student-t distributions based on recurrence relations," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 84(6), pages 825-850, August.
    4. Wan-Lun Wang & Luis M. Castro & Yen-Ting Chang & Tsung-I Lin, 2019. "Mixtures of restricted skew-t factor analyzers with common factor loadings," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 445-480, June.
    5. Wan-Lun Wang & Ahad Jamalizadeh & Tsung-I Lin, 2020. "Finite mixtures of multivariate scale-shape mixtures of skew-normal distributions," Statistical Papers, Springer, vol. 61(6), pages 2643-2670, December.
    6. Galarza, Christian E. & Matos, Larissa A. & Castro, Luis M. & Lachos, Victor H., 2022. "Moments of the doubly truncated selection elliptical distributions with emphasis on the unified multivariate skew-t distribution," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    7. Naderi, Mehrdad & Hung, Wen-Liang & Lin, Tsung-I & Jamalizadeh, Ahad, 2019. "A novel mixture model using the multivariate normal mean–variance mixture of Birnbaum–Saunders distributions and its application to extrasolar planets," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 126-138.
    8. Wan-Lun Wang & Tsung-I Lin, 2022. "Robust clustering of multiply censored data via mixtures of t factor analyzers," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 22-53, March.
    9. Francisco H. C. Alencar & Christian E. Galarza & Larissa A. Matos & Victor H. Lachos, 2022. "Finite mixture modeling of censored and missing data using the multivariate skew-normal distribution," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(3), pages 521-557, September.
    10. Mirfarah, Elham & Naderi, Mehrdad & Chen, Ding-Geng, 2021. "Mixture of linear experts model for censored data: A novel approach with scale-mixture of normal distributions," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Víctor H. Lachos & Celso R. B. Cabral & Marcos O. Prates & Dipak K. Dey, 2019. "Flexible regression modeling for censored data based on mixtures of student-t distributions," Computational Statistics, Springer, vol. 34(1), pages 123-152, March.
    2. Camila Borelli Zeller & Celso Rômulo Barbosa Cabral & Víctor Hugo Lachos & Luis Benites, 2019. "Finite mixture of regression models for censored data based on scale mixtures of normal distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 89-116, March.
    3. Francisco H. C. Alencar & Larissa A Matos & Víctor H. Lachos, 2022. "Finite Mixture of Censored Linear Mixed Models for Irregularly Observed Longitudinal Data," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 463-486, November.
    4. Francisco H. C. Alencar & Christian E. Galarza & Larissa A. Matos & Victor H. Lachos, 2022. "Finite mixture modeling of censored and missing data using the multivariate skew-normal distribution," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(3), pages 521-557, September.
    5. Mirfarah, Elham & Naderi, Mehrdad & Chen, Ding-Geng, 2021. "Mixture of linear experts model for censored data: A novel approach with scale-mixture of normal distributions," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
    6. Jason Cook & James McDonald, 2013. "Partially Adaptive Estimation of Interval Censored Regression Models," Computational Economics, Springer;Society for Computational Economics, vol. 42(1), pages 119-131, June.
    7. Maria Karlsson & Thomas Laitila, 2014. "Finite mixture modeling of censored regression models," Statistical Papers, Springer, vol. 55(3), pages 627-642, August.
    8. Azzalini, Adelchi, 2022. "An overview on the progeny of the skew-normal family— A personal perspective," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    9. Wan-Lun Wang & Luis M. Castro & Yen-Ting Chang & Tsung-I Lin, 2019. "Mixtures of restricted skew-t factor analyzers with common factor loadings," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 445-480, June.
    10. Wraith, Darren & Forbes, Florence, 2015. "Location and scale mixtures of Gaussians with flexible tail behaviour: Properties, inference and application to multivariate clustering," Computational Statistics & Data Analysis, Elsevier, vol. 90(C), pages 61-73.
    11. Randall A. Lewis & James B. McDonald, 2014. "Partially Adaptive Estimation of the Censored Regression Model," Econometric Reviews, Taylor & Francis Journals, vol. 33(7), pages 732-750, October.
    12. Morris, Katherine & Punzo, Antonio & McNicholas, Paul D. & Browne, Ryan P., 2019. "Asymmetric clusters and outliers: Mixtures of multivariate contaminated shifted asymmetric Laplace distributions," Computational Statistics & Data Analysis, Elsevier, vol. 132(C), pages 145-166.
    13. James B. McDonald & Hieu Nguyen, 2012. "Heteroskedasticity and Distributional Assumptions in the Censored Regression Model," BYU Macroeconomics and Computational Laboratory Working Paper Series 2012-09, Brigham Young University, Department of Economics, BYU Macroeconomics and Computational Laboratory.
    14. Eric Nazindigouba Kere, 2016. "Do political economy factors matter in explaining the increase in the production of bioenergy?," WIDER Working Paper Series 025, World Institute for Development Economic Research (UNU-WIDER).
    15. Lachos, Victor H. & Prates, Marcos O. & Dey, Dipak K., 2021. "Heckman selection-t model: Parameter estimation via the EM-algorithm," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    16. John Geweke & Joel Horowitz & M. Hashem Pesaran, 2006. "Econometrics: A Bird’s Eye View," CESifo Working Paper Series 1870, CESifo.
    17. Alejandro Cid & Daniel Ferres & Máximo Rossi, 2008. "Subjective Well-Being in the Southern Cone: Health, Income and Family," Documentos de Trabajo (working papers) 1308, Department of Economics - dECON.
    18. Hung Tong & Cristina Tortora, 2022. "Model-based clustering and outlier detection with missing data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(1), pages 5-30, March.
    19. Mark Ottoni Wilhelm, 2008. "Practical Considerations for Choosing Between Tobit and SCLS or CLAD Estimators for Censored Regression Models with an Application to Charitable Giving," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 70(4), pages 559-582, August.
    20. P. Čížek & S. Sadikoglu, 2018. "Bias-corrected quantile regression estimation of censored regression models," Statistical Papers, Springer, vol. 59(1), pages 215-247, March.
