IDEAS home Printed from https://ideas.repec.org/a/bla/istatr/v86y2018i2p189-204.html

Missing Data: A Unified Taxonomy Guided by Conditional Independence

Author

Listed:
  • Marco Doretti
  • Sara Geneletti
  • Elena Stanghellini

Abstract

Recent work (Seaman et al.; Mealli & Rubin) attempts to clarify the difference, not always well understood, between the realised and everywhere definitions of missing at random (MAR) and missing completely at random. Another branch of the literature (Mohan et al.; Pearl & Mohan) exploits always‐observed covariates to give variable‐based definitions of MAR and missing completely at random. In this paper, we develop a unified taxonomy encompassing all approaches. In this taxonomy, the new concept of ‘complementary MAR’ is introduced, and its relationship with the concept of data observed at random is discussed. All relationships among these definitions are analysed and represented graphically. Conditional independence, both at the random variable and at the event level, is the formal language we adopt to connect all these definitions. Our paper covers both the univariate and the multivariate case, where attention is paid to monotone missingness and to the concept of sequential MAR. Specifically, for monotone missingness, we propose a sequential MAR definition that might be more appropriate than both everywhere and variable‐based MAR to model dropout in certain contexts.
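
The variable‐based notion of MAR mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the binary covariate `x`, the outcome `y`, the response indicator `r` and all numeric settings are hypothetical choices made for illustration. It assumes that missingness of `y` depends only on the always‐observed `x`, so that `r` is conditionally independent of `y` given `x`, and compares the complete‐case mean with the full‐data mean, overall and within each level of `x`.

```python
import random


def simulate(n=20000, seed=0):
    """Draw (x, y, r) triples; all settings here are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.5                  # always-observed binary covariate
        y = rng.gauss(2.0 if x else 0.0, 1.0)   # outcome whose mean depends on x
        p_obs = 0.9 if x else 0.3               # response probability depends on x only
        r = rng.random() < p_obs                # r is True when y is observed
        rows.append((x, y, r))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rows = simulate()

# Marginal comparison: complete cases over-represent x = True.
full_mean = mean(y for _, y, _ in rows)
cc_mean = mean(y for _, y, r in rows if r)

# Within each level of x, observed and full-data means should nearly agree,
# because r carries no information about y once x is fixed.
gaps = {}
for level in (False, True):
    all_y = mean(y for x, y, _ in rows if x == level)
    obs_y = mean(y for x, y, r in rows if x == level and r)
    gaps[level] = obs_y - all_y

print("full-data mean:", round(full_mean, 3))
print("complete-case mean:", round(cc_mean, 3))
print("within-stratum gaps:", {k: round(v, 3) for k, v in gaps.items()})
```

Under these assumed settings the population mean of `y` is 1.0, while the complete‐case mean targets 1.5, so ignoring `x` is misleading even though the within‐stratum gaps are close to zero. The example only covers the univariate, variable‐based case and says nothing about the realised or everywhere definitions.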

Suggested Citation

  • Marco Doretti & Sara Geneletti & Elena Stanghellini, 2018. "Missing Data: A Unified Taxonomy Guided by Conditional Independence," International Statistical Review, International Statistical Institute, vol. 86(2), pages 189-204, August.
  • Handle: RePEc:bla:istatr:v:86:y:2018:i:2:p:189-204
    DOI: 10.1111/insr.12242

    Download full text from publisher

    File URL: https://doi.org/10.1111/insr.12242
    Download Restriction: no


    References listed on IDEAS

    1. M. G. Kenward, 2003. "Pattern-mixture models with proper time dependence," Biometrika, Biometrika Trust, vol. 90(1), pages 53-71, March.
    2. Geert Molenberghs & Caroline Beunckens & Cristina Sotto & Michael G. Kenward, 2008. "Every missingness not at random model has a missingness at random counterpart with equal fit," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(2), pages 371-388, April.
    3. Fabrizia Mealli & Donald B. Rubin, 2015. "Clarifying missing at random and related definitions, and implications when coupled with exchangeability," Biometrika, Biometrika Trust, vol. 102(4), pages 995-1000.
    4. G. Molenberghs & B. Michiels & M. G. Kenward & P. J. Diggle, 1998. "Monotone missing data and pattern‐mixture models," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 52(2), pages 153-161, June.

    Cited by:

    1. Thakur Narendra Singh & Shukla Diwakar, 2022. "Missing data estimation based on the chaining technique in survey sampling," Statistics in Transition New Series, Polish Statistical Association, vol. 23(4), pages 91-111, December.
    2. Mehboob Ali & Göran Kauermann, 2021. "A split questionnaire survey design in the context of statistical matching," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(4), pages 1219-1236, October.
    3. Nitzan Cohen & Yakir Berchenko, 2021. "Normalized Information Criteria and Model Selection in the Presence of Missing Data," Mathematics, MDPI, vol. 9(19), pages 1-23, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. D. M. Farewell & C. Huang & V. Didelez, 2017. "Ignorability for general longitudinal data," Biometrika, Biometrika Trust, vol. 104(2), pages 317-326.
    2. Bunouf, Pierre & Molenberghs, Geert & Grouin, Jean-Marie & Thijs, Herbert, 2015. "A SAS Program Combining R Functionalities to Implement Pattern-Mixture Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 68(i08).
    3. Antonio R. Linero, 2022. "Simulation‐based estimators of analytically intractable causal effects," Biometrics, The International Biometric Society, vol. 78(3), pages 1001-1017, September.
    4. Brenden Bishop & Minjeong Jeon, 2016. "Book Review," Psychometrika, Springer;The Psychometric Society, vol. 81(4), pages 1164-1167, December.
    5. Maria Josefsson & Michael J. Daniels, 2021. "Bayesian semi‐parametric G‐computation for causal inference in a cohort study with MNAR dropout and death," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 398-414, March.
    6. Morten Overgaard & Stefan Nygaard Hansen, 2021. "On the assumption of independent right censoring," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(4), pages 1234-1255, December.
    7. Fei Wang & Yuhao Deng, 2023. "Non-Asymptotic Bounds of AIPW Estimators for Means with Missingness at Random," Mathematics, MDPI, vol. 11(4), pages 1-14, February.
    8. Shu Xu & Shelley A. Blozis, 2011. "Sensitivity Analysis of Mixed Models for Incomplete Longitudinal Data," Journal of Educational and Behavioral Statistics, vol. 36(2), pages 237-256, April.
    9. Bian, Yuan & Yi, Grace Y. & He, Wenqing, 2024. "A unified framework of analyzing missing data and variable selection using regularized likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    10. Kott Phillip S. & Liao Dan, 2018. "Calibration Weighting for Nonresponse with Proxy Frame Variables (So that Unit Nonresponse Can Be Not Missing at Random)," Journal of Official Statistics, Sciendo, vol. 34(1), pages 107-120, March.
    11. Chenguang Wang & Michael J. Daniels, 2011. "A Note on MAR, Identifying Restrictions, Model Comparison, and Sensitivity Analysis in Pattern Mixture Models with and without Covariates for Incomplete Data," Biometrics, The International Biometric Society, vol. 67(3), pages 810-818, September.
    12. A. R. Linero, 2017. "Bayesian nonparametric analysis of longitudinal studies in the presence of informative missingness," Biometrika, Biometrika Trust, vol. 104(2), pages 327-341.
    13. Caroline Beunckens & Cristina Sotto & Geert Molenberghs & Geert Verbeke, 2009. "A multifaceted sensitivity analysis of the Slovenian public opinion survey data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 58(2), pages 171-196, May.
    14. Hairu Wang & Zhiping Lu & Yukun Liu, 2023. "Score test for missing at random or not under logistic missingness models," Biometrics, The International Biometric Society, vol. 79(2), pages 1268-1279, June.
    15. Andrew T. Karl & Yan Yang & Sharon L. Lohr, 2013. "A Correlated Random Effects Model for Nonignorable Missing Data in Value-Added Assessment of Teacher Effects," Journal of Educational and Behavioral Statistics, vol. 38(6), pages 577-603, December.
    16. Xiaojun Mao & Zhonglei Wang & Shu Yang, 2023. "Matrix completion under complex survey sampling," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(3), pages 463-492, June.
    17. Rianne Margaretha Schouten & Gerko Vink, 2021. "The Dance of the Mechanisms: How Observed Information Influences the Validity of Missingness Assumptions," Sociological Methods & Research, vol. 50(3), pages 1243-1258, August.
    18. Daniel, Rhian M. & Kenward, Michael G., 2012. "A method for increasing the robustness of multiple imputation," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1624-1643.
    19. Geert Molenberghs, 2012. "Discussion Contribution to 091037PR4 (Ghosh, Taylor, and Sargent)," Biometrics, The International Biometric Society, vol. 68(1), pages 233-235, March.
    20. Janicki, Ryan & Malec, Donald, 2013. "A Bayesian model averaging approach to analyzing categorical data with nonignorable nonresponse," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 600-614.

    More about this item

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General
