Printed from https://ideas.repec.org/a/gam/jdataj/v7y2022i2p16-d733000.html

Regression-Based Approach to Test Missing Data Mechanisms

Authors

  • Serguei Rouzinov

    (Statistique Vaud, 1003 Lausanne, Switzerland)

  • André Berchtold

    (Institute of Social Sciences & NCCR LIVES, University of Lausanne, 1015 Lausanne, Switzerland)

Abstract

Missing data occur in almost all surveys, and handling them correctly requires knowing their type. Missing data are generally divided into three types (or generating mechanisms): missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). The first step in identifying the type generally consists of testing whether the data are missing completely at random. Several tests have been developed for that purpose, but they struggle with non-continuous variables and with data sets that contain few missing values. Our approach uses a regression model and a distribution test to check whether the missing data are missing completely at random or missing at random, and it can be applied to both continuous and categorical data. Simulation results show that our regression-based approach tends to be more sensitive to the quantity and type of missing data than commonly used methods.
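
To make the distinction between "missing completely at random" and "missing at random" concrete, here is a minimal Python sketch. It is not the authors' procedure, whose details are not given in this abstract. As a simple stand-in, it simulates a fully observed covariate x and a missingness flag for a second variable, then compares the distribution of x between the "missing" and "observed" groups using a two-sample Kolmogorov-Smirnov statistic with a permutation p-value. All function names, sample sizes, and the logistic missingness model are assumptions chosen for illustration.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare the ECDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def permutation_pvalue(x, missing, n_perm=300, rng=None):
    """P-value for 'x has the same distribution whether or not y is missing'."""
    rng = rng or random.Random(0)

    def stat(flags):
        miss = [xi for xi, m in zip(x, flags) if m]
        obs = [xi for xi, m in zip(x, flags) if not m]
        return ks_statistic(miss, obs)

    observed = stat(missing)
    flags = list(missing)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)  # break any link between x and the missingness flags
        if stat(flags) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def simulate(n, mechanism, rng):
    """Hypothetical generator: x is always observed, 'missing' flags a second variable y."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if mechanism == "MCAR":
        # Missingness ignores the data entirely.
        missing = [rng.random() < 0.3 for _ in x]
    elif mechanism == "MAR":
        # Missingness depends only on the observed covariate x (assumed logistic link).
        missing = [rng.random() < 1.0 / (1.0 + math.exp(-(2.0 * xi - 1.0))) for xi in x]
    else:
        raise ValueError("mechanism must be 'MCAR' or 'MAR'")
    return x, missing


if __name__ == "__main__":
    rng = random.Random(42)
    for mechanism in ("MCAR", "MAR"):
        x, missing = simulate(200, mechanism, rng)
        p = permutation_pvalue(x, missing, rng=random.Random(1))
        print(mechanism, "permutation p-value:", round(p, 3))
```

A small p-value suggests that missingness is related to the observed covariate, which is evidence against MCAR. A large p-value does not establish MCAR, and a check of this kind cannot detect missingness that depends on the unobserved values themselves (MNAR).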

Suggested Citation

  • Serguei Rouzinov & André Berchtold, 2022. "Regression-Based Approach to Test Missing Data Mechanisms," Data, MDPI, vol. 7(2), pages 1-28, January.
  • Handle: RePEc:gam:jdataj:v:7:y:2022:i:2:p:16-:d:733000

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/7/2/16/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/7/2/16/
    Download Restriction: no


