IDEAS home Printed from https://ideas.repec.org/a/gam/jdataj/v7y2022i2p16-d733000.html

Regression-Based Approach to Test Missing Data Mechanisms

Author

Listed:
  • Serguei Rouzinov

    (Statistique Vaud, 1003 Lausanne, Switzerland)

  • André Berchtold

    (Institute of Social Sciences & NCCR LIVES, University of Lausanne, 1015 Lausanne, Switzerland)

Abstract

Missing data occur in almost all surveys, and handling them correctly requires knowing their type. Missing data are generally divided into three types (or generating mechanisms): missing completely at random, missing at random, and missing not at random. The first step in identifying the type generally consists of testing whether the data are missing completely at random. Several tests have been developed for that purpose, but they have difficulties with non-continuous variables and with datasets containing few missing values. Our approach checks whether the missing data are missing completely at random or missing at random using a regression model and a distribution test, and it can be applied to both continuous and categorical data. The simulation results show that our regression-based approach tends to be more sensitive to the quantity and the type of missing data than the commonly used methods.
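
The difference between missing completely at random (MCAR) and missing at random (MAR) can be illustrated with a small simulation. The sketch below is not the authors' regression-based procedure; it is a generic permutation check, written under stated assumptions, that asks whether an observed covariate is distributed differently in rows where the outcome is missing. The function names `make_data` and `permutation_p_value`, the missingness probabilities, and the linear relation between `x` and `y` are all hypothetical choices made for illustration.

```python
import random
import statistics


def make_data(n, mechanism, rng):
    """Simulate (x, y) rows where y is None when missing.

    Assumed mechanisms (illustrative only):
      "MCAR": y is missing with probability 0.3, independent of x.
      "MAR":  y is missing with probability 0.6 if x > 0, else 0.05,
              so missingness depends only on the observed x.
    """
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 2 * x + rng.gauss(0, 1)
        if mechanism == "MCAR":
            p_missing = 0.3
        else:
            p_missing = 0.6 if x > 0 else 0.05
        rows.append((x, None if rng.random() < p_missing else y))
    return rows


def permutation_p_value(rows, n_perm, rng):
    """Permutation test of H0: x has the same mean whether or not y is missing."""
    xs = [x for x, _ in rows]
    missing = [y is None for _, y in rows]

    def mean_gap(flags):
        with_missing = [x for x, f in zip(xs, flags) if f]
        with_observed = [x for x, f in zip(xs, flags) if not f]
        return abs(statistics.fmean(with_missing) - statistics.fmean(with_observed))

    observed = mean_gap(missing)
    flags = missing[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)  # break any link between x and the missingness flag
        if mean_gap(flags) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    for mechanism in ("MCAR", "MAR"):
        data = make_data(500, mechanism, rng)
        print(mechanism, permutation_p_value(data, 500, rng))
```

A small p-value suggests that missingness is related to the observed covariate, which is evidence against MCAR. A large p-value does not prove MCAR, and a check of this kind cannot distinguish MAR from missing not at random, because that distinction depends on the unobserved values.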

Suggested Citation

  • Serguei Rouzinov & André Berchtold, 2022. "Regression-Based Approach to Test Missing Data Mechanisms," Data, MDPI, vol. 7(2), pages 1-28, January.
  • Handle: RePEc:gam:jdataj:v:7:y:2022:i:2:p:16-:d:733000

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/7/2/16/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/7/2/16/
    Download Restriction: no

    References listed on IDEAS

    1. P. Diggle & M. G. Kenward, 1994. "Informative Drop‐Out in Longitudinal Data Analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 43(1), pages 49-73, March.
    2. Hisashi Tanizaki, 1997. "Power comparison of non-parametric tests: Small-sample properties from Monte Carlo experiments," Journal of Applied Statistics, Taylor & Francis Journals, vol. 24(5), pages 603-632.
    3. Marco Marozzi, 2014. "The multisample Cucconi test," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 23(2), pages 209-227, June.
    4. Lan Huang & Ming-Hui Chen & Joseph G. Ibrahim, 2005. "Bayesian Analysis for Generalized Linear Models with Nonignorably Missing Covariates," Biometrics, The International Biometric Society, vol. 61(3), pages 767-780, September.
    5. Mortaza Jamshidian & Siavash Jalal, 2010. "Tests of Homoscedasticity, Normality, and Missing Completely at Random for Incomplete Multivariate Data," Psychometrika, Springer;The Psychometric Society, vol. 75(4), pages 649-674, December.
    6. Marco Marozzi, 2009. "Some notes on the location–scale Cucconi test," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 21(5), pages 629-647.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mukherjee, Amitava & Sen, Rudra, 2018. "Optimal design of Shewhart–Lepage type schemes and its application in monitoring service quality," European Journal of Operational Research, Elsevier, vol. 266(1), pages 147-167.
    2. David M. Murray & Jonathan L. Blitstein, 2003. "Methods To Reduce The Impact Of Intraclass Correlation In Group-Randomized Trials," Evaluation Review, vol. 27(1), pages 79-103, February.
    3. Patrick E. B. FitzGerald, 2002. "Extended Generalized Estimating Equations for Binary Familial Data with Incomplete Families," Biometrics, The International Biometric Society, vol. 58(4), pages 718-726, December.
    4. Song, Zhi & Mukherjee, Amitava & Zhang, Jiujun, 2021. "Some robust approaches based on copula for monitoring bivariate processes and component-wise assessment," European Journal of Operational Research, Elsevier, vol. 289(1), pages 177-196.
    5. Nadine Chlass & Jens J. Krueger, 2007. "Small Sample Properties of the Wilcoxon Signed Rank Test with Discontinuous and Dependent Observations," Jena Economics Research Papers 2007-032, Friedrich-Schiller-University Jena.
    6. Pourahmadi, Mohsen & Daniels, Michael J. & Park, Trevor, 2007. "Simultaneous modelling of the Cholesky decomposition of several covariance matrices," Journal of Multivariate Analysis, Elsevier, vol. 98(3), pages 568-587, March.
    7. Chassan, Malika & Concordet, Didier, 2023. "How to test the missing data mechanism in a hidden Markov model," Computational Statistics & Data Analysis, Elsevier, vol. 182(C).
    8. Sinha, Sanjoy K. & Kaushal, Amit & Xiao, Wenzhong, 2014. "Inference for longitudinal data with nonignorable nonmonotone missing responses," Computational Statistics & Data Analysis, Elsevier, vol. 72(C), pages 77-91.
    9. E. Michael Foster & Grace Y. Fang, 2004. "Alternative Methods for Handling Attrition," Evaluation Review, vol. 28(5), pages 434-464, October.
    10. Kateřina Macháčová & Hana Vaňková & Iva Holmerová & Inna Čábelková & Ladislav Volicer, 2018. "Ratings of activities of daily living in nursing home residents: comparison of self- and proxy ratings with actual performance and the impact of cognitive status," European Journal of Ageing, Springer, vol. 15(4), pages 349-358, December.
    11. Mette Ejrnæs & Anders Holm, 2006. "Comparing Fixed Effects and Covariance Structure Estimators for Panel Data," Sociological Methods & Research, vol. 35(1), pages 61-83, August.
    12. Geert Verbeke & Geert Molenberghs & Herbert Thijs & Emmanuel Lesaffre & Michael G. Kenward, 2001. "Sensitivity Analysis for Nonrandom Dropout: A Local Influence Approach," Biometrics, The International Biometric Society, vol. 57(1), pages 7-14, March.
    13. Rebecca E. Anthony & Amy L. Paine & Katherine H. Shelton, 2019. "Depression and Anxiety Symptoms of British Adoptive Parents: A Prospective Four-Wave Longitudinal Study," IJERPH, MDPI, vol. 16(24), pages 1-14, December.
    14. Miran A. Jaffa & Ayad A. Jaffa, 2019. "A Likelihood-Based Approach with Shared Latent Random Parameters for the Longitudinal Binary and Informative Censoring Processes," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(3), pages 597-613, December.
    15. Molenberghs, Geert & Verbeke, Geert & Thijs, Herbert & Lesaffre, Emmanuel & Kenward, Michael G., 2001. "Influence analysis to assess sensitivity of the dropout process," Computational Statistics & Data Analysis, Elsevier, vol. 37(1), pages 93-113, July.
    16. Shu Xu & Shelley A. Blozis, 2011. "Sensitivity Analysis of Mixed Models for Incomplete Longitudinal Data," Journal of Educational and Behavioral Statistics, vol. 36(2), pages 237-256, April.
    17. Sebastian Domhof & Edgar Brunner & D. Wayne Osgood, 2002. "Rank Procedures for Repeated Measures with Missing Values," Sociological Methods & Research, vol. 30(3), pages 367-393, February.
    18. Kai Wang & Manikandan Narayanan & Hua Zhong & Martin Tompa & Eric E Schadt & Jun Zhu, 2009. "Meta-analysis of Inter-species Liver Co-expression Networks Elucidates Traits Associated with Common Human Diseases," PLOS Computational Biology, Public Library of Science, vol. 5(12), pages 1-16, December.
    19. Jun Li & Yao Yu, 2015. "A Nonparametric Test of Missing Completely at Random for Incomplete Multivariate Data," Psychometrika, Springer;The Psychometric Society, vol. 80(3), pages 707-726, September.
    20. Bian, Yuan & Yi, Grace Y. & He, Wenqing, 2024. "A unified framework of analyzing missing data and variable selection using regularized likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
