
Efficient estimation in a partially specified nonignorable propensity score model

Author

Listed:
  • Li, Mengyan
  • Ma, Yanyuan
  • Zhao, Jiwei

Abstract

Consider the regression setting where the response variable is subject to missing data and the covariates are fully observed. A nonignorable propensity score model, i.e., one in which the probability that the response is observed, conditional on all variables, depends on the possibly missing response itself, is assumed throughout the paper. In such problems, model misspecification and model identifiability are two critical issues. A fully parametric approach can produce results that are sensitive to the model assumptions, while a fully nonparametric approach may not be sufficient for model identification. A new flexible semiparametric propensity score model is proposed in which the relationship between the missingness indicator and the partially observed response is left totally unspecified and estimated nonparametrically, while the relationship between the missingness indicator and the fully observed covariates is modeled parametrically. The proposed estimator is constructed via a semiparametric treatment and is proved to be semiparametrically efficient. Comprehensive simulation studies are conducted to examine the finite-sample performance of the estimators. While the naive parametric method leads to a heavily biased estimator and poor coverage, the proposed method produces an estimator with negligible finite-sample bias and correct inference. The proposed method is further illustrated via an electronic health records (EHR) data application for the albumin level in blood samples. The empirical analyses demonstrate that the proposed semiparametric propensity score model is more sensible than a purely parametric model. The proposed method could be very useful for uncovering the unknown and possibly nonlinear dependence of the propensity score on the albumin level, and is recommended for practical use.
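
To make the setting in the abstract concrete, the following is a minimal simulation sketch of nonignorable missingness. It is not the authors' estimator. The data-generating model, the logit form of the propensity score, the nonlinear function g(y), and all coefficient values are assumptions chosen purely for illustration. The sketch shows that the complete-case mean of the response is biased when missingness depends on the response itself, and that weighting by the true propensity, which is unknown in practice, removes that bias.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Simulate a response Y that is missing not at random.

    Everything here is a hypothetical illustration, not the paper's model:
    the regression Y = 1 + X + noise, the logit link, g(y), and beta.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                  # fully observed covariate
    y = 1.0 + x + rng.normal(size=n)        # response, subject to missingness

    # Assumed propensity: an arbitrary nonlinear function of y plus a
    # linear (parametric) term in x, passed through a logistic link.
    g_y = 1.0 - 0.8 * y + 0.3 * np.sin(y)
    beta = 0.3
    prob_observed = 1.0 / (1.0 + np.exp(-(g_y + beta * x)))

    r = rng.random(n) < prob_observed       # r is True when y is observed

    full_mean = y.mean()                    # unavailable with real data
    complete_case_mean = y[r].mean()        # naive estimate, ignores missingness

    # Oracle inverse-probability weighting (Hajek form) using the TRUE
    # propensity; shown only to confirm the bias comes from the mechanism.
    weights = 1.0 / prob_observed[r]
    oracle_ipw_mean = np.sum(weights * y[r]) / np.sum(weights)

    return {
        "full_mean": full_mean,
        "complete_case_mean": complete_case_mean,
        "oracle_ipw_mean": oracle_ipw_mean,
        "observed_fraction": r.mean(),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this sketch larger responses are less likely to be observed, so the complete-case mean falls below the full-data mean. The oracle weights stand in for the quantity that a propensity score model has to estimate from incomplete data, which is where the misspecification and identifiability issues raised in the abstract arise.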

Suggested Citation

  • Li, Mengyan & Ma, Yanyuan & Zhao, Jiwei, 2022. "Efficient estimation in a partially specified nonignorable propensity score model," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
  • Handle: RePEc:eee:csdana:v:174:y:2022:i:c:s0167947321001560
    DOI: 10.1016/j.csda.2021.107322

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947321001560
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2021.107322?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Ted Chang & Phillip S. Kott, 2008. "Using calibration weighting to adjust for nonresponse under a plausible model," Biometrika, Biometrika Trust, vol. 95(3), pages 555-571.
    2. repec:mpr:mprres:8160 is not listed on IDEAS
    3. Jiwei Zhao & Yanyuan Ma, 2018. "Optimal pseudolikelihood estimation in the analysis of multivariate missing data with nonignorable nonresponse," Biometrika, Biometrika Trust, vol. 105(2), pages 479-486.
    4. Eric J. Tchetgen Tchetgen & Kathleen E. Wirth, 2017. "A general instrumental variable framework for regression analysis with outcome missing not at random," Biometrics, The International Biometric Society, vol. 73(4), pages 1123-1131, December.
    5. Jun Shao & Lei Wang, 2016. "Semiparametric inverse propensity weighting for nonignorable missing data," Biometrika, Biometrika Trust, vol. 103(1), pages 175-187.
    6. Gong Tang, 2003. "Analysis of multivariate missing data with nonignorable nonresponse," Biometrika, Biometrika Trust, vol. 90(4), pages 747-764, December.
    7. Jiwei Zhao & Jun Shao, 2015. "Semiparametric Pseudo-Likelihoods in Generalized Linear Models With Nonignorable Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1577-1590, December.
    8. Qin J. & Leung D. & Shao J., 2002. "Estimation With Survey Data Under Nonignorable Nonresponse or Informative Sampling," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 193-200, March.
    9. Kim, Jae Kwang & Yu, Cindy Long, 2011. "A Semiparametric Estimation of Mean Functionals With Nonignorable Missing Data," Journal of the American Statistical Association, American Statistical Association, vol. 106(493), pages 157-165.
    10. Wang Miao & Peng Ding & Zhi Geng, 2016. "Identifiability of Normal and Normal Mixture Models with Nonignorable Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1673-1683, October.
    11. W. R. Gilks & P. Wild, 1992. "Adaptive Rejection Sampling for Gibbs Sampling," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 41(2), pages 337-348, June.

    Citations



    Cited by:

    1. Jierui Du & Xia Cui, 2024. "Semiparametric estimation in generalized additive partial linear models with nonignorable nonresponse data," Statistical Papers, Springer, vol. 65(5), pages 3235-3259, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shonosuke Sugasawa & Kosuke Morikawa & Keisuke Takahata, 2022. "Bayesian semiparametric modeling of response mechanism for nonignorable missing data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 101-117, March.
    2. Pengfei Li & Jing Qin & Yukun Liu, 2023. "Instability of inverse probability weighting methods and a remedy for nonignorable missing data," Biometrics, The International Biometric Society, vol. 79(4), pages 3215-3226, December.
    3. Zhang, Jing & Wang, Qihua & Kang, Jian, 2020. "Feature screening under missing indicator imputation with non-ignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 149(C).
    4. Cui, Xia & Guo, Jianhua & Yang, Guangren, 2017. "On the identifiability and estimation of generalized linear models with parametric nonignorable missing data mechanism," Computational Statistics & Data Analysis, Elsevier, vol. 107(C), pages 64-80.
    5. Wang, Lei & Zhao, Puying & Shao, Jun, 2021. "Dimension-reduced semiparametric estimation of distribution functions and quantiles with nonignorable nonresponse," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    6. Puying Zhao & Hui Zhao & Niansheng Tang & Zhaohai Li, 2017. "Weighted composite quantile regression analysis for nonignorable missing data using nonresponse instrument," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(2), pages 189-212, April.
    7. Xianwen Ding & Jiandong Chen & Xueping Chen, 2020. "Regularized quantile regression for ultrahigh-dimensional data with nonignorable missing responses," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(5), pages 545-568, July.
    8. Aiai Yu & Yujie Zhong & Xingdong Feng & Ying Wei, 2023. "Quantile regression for nonignorable missing data with its application of analyzing electronic medical records," Biometrics, The International Biometric Society, vol. 79(3), pages 2036-2049, September.
    9. Yujing Shao & Lei Wang, 2022. "Generalized partial linear models with nonignorable dropouts," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(2), pages 223-252, February.
    10. Bindele, Huybrechts F. & Nguelifack, Brice M., 2019. "Generalized signed-rank estimation for regression models with non-ignorable missing responses," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 14-33.
    11. Jierui Du & Xia Cui, 2024. "Semiparametric estimation in generalized additive partial linear models with nonignorable nonresponse data," Statistical Papers, Springer, vol. 65(5), pages 3235-3259, July.
    12. Tianqing Liu & Xiaohui Yuan, 2020. "Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data," Statistical Papers, Springer, vol. 61(6), pages 2241-2270, December.
    13. Ji Chen & Jun Shao & Fang Fang, 2021. "Instrument search in pseudo-likelihood approach for nonignorable nonresponse," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(3), pages 519-533, June.
    14. Xuerong Chen & Guoqing Diao & Jing Qin, 2020. "Pseudo likelihood‐based estimation and testing of missingness mechanism function in nonignorable missing data problems," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1377-1400, December.
    15. Rui Duan & C. Jason Liang & Pamela Shaw & Cheng Yong Tang & Yong Chen, 2020. "Missing at Random or Not: A Semiparametric Testing Approach," Papers 2003.11181, arXiv.org.
    16. Jiwei Zhao, 2017. "Reducing bias for maximum approximate conditional likelihood estimator with general missing data mechanism," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(3), pages 577-593, July.
    17. Zhang, Ting & Wang, Lei, 2020. "Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    18. Xiaojun Mao & Zhonglei Wang & Shu Yang, 2023. "Matrix completion under complex survey sampling," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(3), pages 463-492, June.
    19. Lei Wang & Wei Ma, 2021. "Improved empirical likelihood inference and variable selection for generalized linear models with longitudinal nonignorable dropouts," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(3), pages 623-647, June.
    20. Jiwei Zhao & Jun Shao, 2015. "Semiparametric Pseudo-Likelihoods in Generalized Linear Models With Nonignorable Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1577-1590, December.
