
Instability of inverse probability weighting methods and a remedy for nonignorable missing data

Authors
  • Pengfei Li
  • Jing Qin
  • Yukun Liu

Abstract

Inverse probability weighting (IPW) methods are commonly used to analyze nonignorable missing data (NIMD) under the assumption of a logistic model for the missingness probability. However, solving IPW equations numerically may involve nonconvergence problems when the sample size is moderate and the missingness probability is high. Moreover, those equations often have multiple roots, and identifying the best root is challenging. Therefore, IPW methods may have low efficiency or even produce biased results. We identify the pitfall in these methods pathologically: they involve the estimation of a moment‐generating function (MGF), and such functions are notoriously unstable in general. As a remedy, we model the outcome distribution given the covariates of the completely observed individuals semiparametrically. After forming an induced logistic regression (LR) model for the missingness status of the outcome and covariate, we develop a maximum conditional likelihood method to estimate the underlying parameters. The proposed method circumvents the estimation of an MGF and hence overcomes the instability of IPW methods. Our theoretical and simulation results show that the proposed method outperforms existing competitors greatly. Two real data examples are analyzed to illustrate the advantages of our method. We conclude that if only a parametric LR is assumed but the outcome regression model is left arbitrary, then one has to be cautious in using any of the existing statistical methods in problems involving NIMD.
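One way to see why a logistic missingness model brings a moment-generating function into play is sketched below. This is an illustration only, not the authors' derivation or method. It assumes a hypothetical sign convention in which the probability of observing the outcome is 1 / (1 + exp(-(alpha + gamma * y))), so that the inverse weight equals 1 + exp(-(alpha + gamma * y)) and its average over outcomes depends on E[exp(-gamma * Y)], an MGF of Y. The function names, the standard normal outcome, and the sample sizes are all assumptions chosen for the demonstration.

```python
import math
import random
import statistics


def inverse_logistic_weight(alpha, gamma, y):
    """Return 1 / P(observed | y) under the assumed logistic model
    P(observed | y) = 1 / (1 + exp(-(alpha + gamma * y)))."""
    return 1.0 + math.exp(-(alpha + gamma * y))


def empirical_mgf(sample, t):
    """Sample-average estimate of the MGF E[exp(t * Y)]."""
    return sum(math.exp(t * y) for y in sample) / len(sample)


def relative_sd_of_mgf(t, n=200, reps=500, seed=0):
    """Monte Carlo spread of the empirical MGF of a standard normal
    outcome, expressed relative to the true value exp(t**2 / 2)."""
    rng = random.Random(seed)
    true_mgf = math.exp(0.5 * t * t)
    estimates = [
        empirical_mgf([rng.gauss(0.0, 1.0) for _ in range(n)], t)
        for _ in range(reps)
    ]
    return statistics.stdev(estimates) / true_mgf


if __name__ == "__main__":
    # The relative spread grows quickly as |t| increases, which is the
    # kind of instability the abstract attributes to MGF estimation.
    for t in (0.5, 1.0, 2.0):
        print(t, relative_sd_of_mgf(t))
```

In this toy setting the argument t plays the role of the coefficient on the outcome in the missingness model: the more strongly missingness depends on the outcome, the noisier the empirical MGF becomes at a fixed sample size.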

Suggested Citation

  • Pengfei Li & Jing Qin & Yukun Liu, 2023. "Instability of inverse probability weighting methods and a remedy for nonignorable missing data," Biometrics, The International Biometric Society, vol. 79(4), pages 3215-3226, December.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:4:p:3215-3226
    DOI: 10.1111/biom.13881

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13881
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Li, Mengyan & Ma, Yanyuan & Zhao, Jiwei, 2022. "Efficient estimation in a partially specified nonignorable propensity score model," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    2. Wang, Lei & Zhao, Puying & Shao, Jun, 2021. "Dimension-reduced semiparametric estimation of distribution functions and quantiles with nonignorable nonresponse," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    3. Shonosuke Sugasawa & Kosuke Morikawa & Keisuke Takahata, 2022. "Bayesian semiparametric modeling of response mechanism for nonignorable missing data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 101-117, March.
    4. Rui Duan & C. Jason Liang & Pamela Shaw & Cheng Yong Tang & Yong Chen, 2020. "Missing at Random or Not: A Semiparametric Testing Approach," Papers 2003.11181, arXiv.org.
    5. Yujing Shao & Lei Wang, 2022. "Generalized partial linear models with nonignorable dropouts," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(2), pages 223-252, February.
    6. Tianqing Liu & Xiaohui Yuan, 2020. "Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data," Statistical Papers, Springer, vol. 61(6), pages 2241-2270, December.
    7. Zhang, Jing & Wang, Qihua & Kang, Jian, 2020. "Feature screening under missing indicator imputation with non-ignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 149(C).
    8. Lei Wang & Wei Ma, 2021. "Improved empirical likelihood inference and variable selection for generalized linear models with longitudinal nonignorable dropouts," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(3), pages 623-647, June.
    9. Cui, Xia & Guo, Jianhua & Yang, Guangren, 2017. "On the identifiability and estimation of generalized linear models with parametric nonignorable missing data mechanism," Computational Statistics & Data Analysis, Elsevier, vol. 107(C), pages 64-80.
    10. Liu, Tianqing & Yuan, Xiaohui & Sun, Jianguo, 2021. "Weighted rank estimation for nonparametric transformation models with nonignorable missing data," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    11. Puying Zhao & Hui Zhao & Niansheng Tang & Zhaohai Li, 2017. "Weighted composite quantile regression analysis for nonignorable missing data using nonresponse instrument," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(2), pages 189-212, April.
    12. Tang, Cheng Yong, 2024. "A model specification test for semiparametric nonignorable missing data modeling," Econometrics and Statistics, Elsevier, vol. 30(C), pages 124-132.
    13. Bindele, Huybrechts F. & Nguelifack, Brice M., 2019. "Generalized signed-rank estimation for regression models with non-ignorable missing responses," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 14-33.
    14. Jierui Du & Xia Cui, 2024. "Semiparametric estimation in generalized additive partial linear models with nonignorable nonresponse data," Statistical Papers, Springer, vol. 65(5), pages 3235-3259, July.
    15. Xuerong Chen & Guoqing Diao & Jing Qin, 2020. "Pseudo likelihood‐based estimation and testing of missingness mechanism function in nonignorable missing data problems," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1377-1400, December.
    16. Xianwen Ding & Jiandong Chen & Xueping Chen, 2020. "Regularized quantile regression for ultrahigh-dimensional data with nonignorable missing responses," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(5), pages 545-568, July.
    17. Breunig, Christoph & Mammen, Enno & Simoni, Anna, 2018. "Nonparametric estimation in case of endogenous selection," Journal of Econometrics, Elsevier, vol. 202(2), pages 268-285.
    18. Jiwei Zhao, 2017. "Reducing bias for maximum approximate conditional likelihood estimator with general missing data mechanism," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(3), pages 577-593, July.
    19. Aiai Yu & Yujie Zhong & Xingdong Feng & Ying Wei, 2023. "Quantile regression for nonignorable missing data with its application of analyzing electronic medical records," Biometrics, The International Biometric Society, vol. 79(3), pages 2036-2049, September.
    20. Yilin Li & Wang Miao & Ilya Shpitser & Eric J. Tchetgen Tchetgen, 2023. "A self‐censoring model for multivariate nonignorable nonmonotone missing data," Biometrics, The International Biometric Society, vol. 79(4), pages 3203-3214, December.
