IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2003.11181.html

Missing at Random or Not: A Semiparametric Testing Approach

Author

Listed:
  • Rui Duan
  • C. Jason Liang
  • Pamela Shaw
  • Cheng Yong Tang
  • Yong Chen

Abstract

Missing data are common in practice, and statistical methods have been developed to address the validity and/or efficiency of statistical procedures in their presence. A central and longstanding interest is the mechanism governing data missingness, and correctly identifying that mechanism is crucial for conducting proper practical investigations. The conventional classification comprises three classes: missing completely at random, missing at random, and missing not at random. In this paper, we present a new hypothesis testing approach for deciding between missing at random and missing not at random. Since the alternatives to missing at random are broad, we focus our investigation on a general class of models with instrumental variables for data missing not at random. Our setting is broadly applicable because the model for the missing data is nonparametric, requiring no explicit specification of the missingness mechanism. The foundational idea is to develop appropriate discrepancy measures between estimators whose properties differ significantly only when missing at random does not hold. We show that our new hypothesis testing approach achieves an objective, data-oriented choice between missing at random and missing not at random. We demonstrate the feasibility, validity, and efficacy of the new test through theoretical analysis, simulation studies, and a real data analysis.
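
The abstract does not give the form of the test statistic, so the following is only a generic illustration of the discrepancy idea, not the authors' procedure. It is a minimal sketch of a Wald-type (Hausman-style) check for a scalar parameter: two estimates that should agree under missing at random are compared, and their squared difference is scaled by the variance of that difference. The function name `discrepancy_test` and its arguments are hypothetical, and the chi-square reference distribution with one degree of freedom is an assumption of this sketch.

```python
import math


def discrepancy_test(est_a, est_b, var_diff):
    """Wald-type test for a scalar discrepancy between two estimators.

    est_a, est_b : two estimates of the same quantity (hypothetical inputs).
    var_diff     : estimated variance of (est_a - est_b); must be positive.

    Returns the statistic and a p-value, assuming the statistic is
    approximately chi-square with 1 degree of freedom under the null.
    """
    if var_diff <= 0:
        raise ValueError("var_diff must be positive")
    diff = est_a - est_b
    stat = diff ** 2 / var_diff
    # Survival function of chi-square(1): P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A small p-value from such a check would suggest that the two estimators disagree by more than sampling noise explains, which under the stated assumptions points away from missing at random.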

Suggested Citation

  • Rui Duan & C. Jason Liang & Pamela Shaw & Cheng Yong Tang & Yong Chen, 2020. "Missing at Random or Not: A Semiparametric Testing Approach," Papers 2003.11181, arXiv.org.
  • Handle: RePEc:arx:papers:2003.11181

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2003.11181
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Kott, Phillip S. & Chang, Ted, 2010. "Using Calibration Weighting to Adjust for Nonignorable Unit Nonresponse," Journal of the American Statistical Association, American Statistical Association, vol. 105(491), pages 1265-1275.
    2. Wang Miao & Eric J. Tchetgen Tchetgen, 2016. "On varieties of doubly robust estimators under missingness not at random with a shadow variable," Biometrika, Biometrika Trust, vol. 103(2), pages 475-482.
    3. Christoph Breunig, 2019. "Testing Missing at Random Using Instrumental Variables," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 37(2), pages 223-234, April.
    4. Hidehiko Ichimura & Whitney K. Newey, 2022. "The influence function of semiparametric estimators," Quantitative Economics, Econometric Society, vol. 13(1), pages 29-61, January.
    5. Newey, Whitney K, 1994. "The Asymptotic Variance of Semiparametric Estimators," Econometrica, Econometric Society, vol. 62(6), pages 1349-1382, November.
    6. repec:mpr:mprres:8160 is not listed on IDEAS
    7. Jiwei Zhao & Yanyuan Ma, 2018. "Optimal pseudolikelihood estimation in the analysis of multivariate missing data with nonignorable nonresponse," Biometrika, Biometrika Trust, vol. 105(2), pages 479-486.
    8. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    9. Jun Shao & Lei Wang, 2016. "Semiparametric inverse propensity weighting for nonignorable missing data," Biometrika, Biometrika Trust, vol. 103(1), pages 175-187.
    10. Hausman, Jerry, 2015. "Specification tests in econometrics," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 38(2), pages 112-134.
    11. Yong Chen & Kung-Yee Liang, 2010. "On the asymptotic behaviour of the pseudolikelihood ratio test statistic with boundary problems," Biometrika, Biometrika Trust, vol. 97(3), pages 603-620.
    12. Jiwei Zhao & Jun Shao, 2015. "Semiparametric Pseudo-Likelihoods in Generalized Linear Models With Nonignorable Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1577-1590, December.
    13. James Heckman, 1997. "Instrumental Variables: A Study of Implicit Behavioral Assumptions Used in Making Program Evaluations," Journal of Human Resources, University of Wisconsin Press, vol. 32(3), pages 441-462.
    14. Kim, Jae Kwang & Yu, Cindy Long, 2011. "A Semiparametric Estimation of Mean Functionals With Nonignorable Missing Data," Journal of the American Statistical Association, American Statistical Association, vol. 106(493), pages 157-165.

    Citations



    Cited by:

    1. Hairu Wang & Zhiping Lu & Yukun Liu, 2023. "Score test for missing at random or not under logistic missingness models," Biometrics, The International Biometric Society, vol. 79(2), pages 1268-1279, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tang, Cheng Yong, 2024. "A model specification test for semiparametric nonignorable missing data modeling," Econometrics and Statistics, Elsevier, vol. 30(C), pages 124-132.
    2. Pengfei Li & Jing Qin & Yukun Liu, 2023. "Instability of inverse probability weighting methods and a remedy for nonignorable missing data," Biometrics, The International Biometric Society, vol. 79(4), pages 3215-3226, December.
    3. Wang, Lei & Zhao, Puying & Shao, Jun, 2021. "Dimension-reduced semiparametric estimation of distribution functions and quantiles with nonignorable nonresponse," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    4. Yujing Shao & Lei Wang, 2022. "Generalized partial linear models with nonignorable dropouts," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(2), pages 223-252, February.
    5. Li, Mengyan & Ma, Yanyuan & Zhao, Jiwei, 2022. "Efficient estimation in a partially specified nonignorable propensity score model," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    6. Shonosuke Sugasawa & Kosuke Morikawa & Keisuke Takahata, 2022. "Bayesian semiparametric modeling of response mechanism for nonignorable missing data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 101-117, March.
    7. Tianqing Liu & Xiaohui Yuan, 2020. "Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data," Statistical Papers, Springer, vol. 61(6), pages 2241-2270, December.
    8. Xiaohong Chen & Andres Santos, 2018. "Overidentification in Regular Models," Econometrica, Econometric Society, vol. 86(5), pages 1771-1817, September.
    9. Guilhem Bascle, 2008. "Controlling for endogeneity with instrumental variables in strategic management research," Post-Print hal-00576795, HAL.
    10. Zhang, Jing & Wang, Qihua & Kang, Jian, 2020. "Feature screening under missing indicator imputation with non-ignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 149(C).
    11. Aiai Yu & Yujie Zhong & Xingdong Feng & Ying Wei, 2023. "Quantile regression for nonignorable missing data with its application of analyzing electronic medical records," Biometrics, The International Biometric Society, vol. 79(3), pages 2036-2049, September.
    12. Chernozhukov, Victor & Fernández-Val, Iván & Kowalski, Amanda E., 2015. "Quantile regression with censoring and endogeneity," Journal of Econometrics, Elsevier, vol. 186(1), pages 201-221.
    13. Lei Wang & Wei Ma, 2021. "Improved empirical likelihood inference and variable selection for generalized linear models with longitudinal nonignorable dropouts," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(3), pages 623-647, June.
    14. repec:wyi:journl:002082 is not listed on IDEAS
    15. Guo, Xu & Song, Lianlian & Fang, Yun & Zhu, Lixing, 2019. "Model checking for general linear regression with nonignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 138(C), pages 1-12.
    16. Liu, Tianqing & Yuan, Xiaohui & Sun, Jianguo, 2021. "Weighted rank estimation for nonparametric transformation models with nonignorable missing data," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    17. Majid Mojirsheibani, 2022. "On the maximal deviation of kernel regression estimators with NMAR response variables," Statistical Papers, Springer, vol. 63(5), pages 1677-1705, October.
    18. Alain Trognon, 2003. "L'économétrie des panels en perspective," Revue d'économie politique, Dalloz, vol. 113(6), pages 727-748.
    19. Bindele, Huybrechts F. & Nguelifack, Brice M., 2019. "Generalized signed-rank estimation for regression models with non-ignorable missing responses," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 14-33.
    20. Liu, Echu & Hsiao, Cheng & Matsumoto, Tomoya & Chou, Shinyi, 2009. "Maternal full-time employment and overweight children: Parametric, semi-parametric, and non-parametric assessment," Journal of Econometrics, Elsevier, vol. 152(1), pages 61-69, September.
    21. Carolyn Heinrich & Jeffrey Wenger, 2002. "The Economic Contributions of James J. Heckman and Daniel L. McFadden," Review of Political Economy, Taylor & Francis Journals, vol. 14(1), pages 69-89.
