
Theory and inference for regression models with missing responses and covariates

Author

Listed:
  • Chen, Qingxia
  • Ibrahim, Joseph G.
  • Chen, Ming-Hui
  • Senchaudhuri, Pralay

Abstract

In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and real dataset are given to illustrate the methodology.
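The three case sets named in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a single covariate, a normal distribution for that covariate, MCAR missingness, and made-up parameter values, and the helper names `simulate`, `ols` and `em_normal` are hypothetical. Under those assumptions, (x, y) is bivariate normal, so a standard EM algorithm for a normal model with missing values gives likelihood-based estimates for the CR and AC settings, while CC reduces to least squares on the fully observed pairs.

```python
import numpy as np

def simulate(n=2000, beta0=1.0, beta1=2.0, p_miss=0.3, seed=0):
    """Draw (x, y) from a normal linear model, then delete values MCAR."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    y = beta0 + beta1 * x + rng.normal(0.0, 1.0, n)
    x[rng.random(n) < p_miss] = np.nan  # covariate missing completely at random
    y[rng.random(n) < p_miss] = np.nan  # response missing completely at random
    return x, y

def ols(x, y):
    """Least-squares (intercept, slope) on fully observed pairs."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def em_normal(x, y, n_iter=200):
    """EM for a bivariate normal (x, y) with missing entries.

    Returns the implied regression (intercept, slope) of y on x.
    Assumes x is normal, which is an illustrative choice only.
    """
    keep = ~(np.isnan(x) & np.isnan(y))  # rows with nothing observed carry no information
    x, y = x[keep], y[keep]
    mx, my = np.isnan(x), np.isnan(y)
    n = len(x)
    mu_x, mu_y = np.nanmean(x), np.nanmean(y)
    s_xx, s_yy, s_xy = np.nanvar(x), np.nanvar(y), 0.0
    for _ in range(n_iter):
        # E-step: conditional means of the missing entries, plus the
        # conditional variances that enter the expected second moments.
        xh = np.where(mx, mu_x + s_xy / s_yy * (y - mu_y), x)
        yh = np.where(my, mu_y + s_xy / s_xx * (x - mu_x), y)
        cx = mx.sum() * (s_xx - s_xy**2 / s_yy)
        cy = my.sum() * (s_yy - s_xy**2 / s_xx)
        # M-step: update the mean vector and covariance matrix.
        mu_x, mu_y = xh.mean(), yh.mean()
        s_xx = (np.sum((xh - mu_x) ** 2) + cx) / n
        s_yy = (np.sum((yh - mu_y) ** 2) + cy) / n
        s_xy = np.sum((xh - mu_x) * (yh - mu_y)) / n
    slope = s_xy / s_xx
    return np.array([mu_y - slope * mu_x, slope])

x, y = simulate()
cc = ~np.isnan(x) & ~np.isnan(y)  # complete cases: both x and y observed
cr = ~np.isnan(y)                 # complete responses: y observed, x possibly missing
print("cases used: CC", cc.sum(), "CR", cr.sum(), "AC", len(x))
print("CC estimate:      ", ols(x[cc], y[cc]))
print("CR estimate (EM): ", em_normal(x[cr], y[cr]))
print("AC estimate (EM): ", em_normal(x, y))
```

Repeating the simulation over many seeds and comparing the spread of the three slope estimates gives an empirical counterpart to the asymptotic variance comparison the abstract describes.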

Suggested Citation

  • Chen, Qingxia & Ibrahim, Joseph G. & Chen, Ming-Hui & Senchaudhuri, Pralay, 2008. "Theory and inference for regression models with missing responses and covariates," Journal of Multivariate Analysis, Elsevier, vol. 99(6), pages 1302-1331, July.
  • Handle: RePEc:eee:jmvana:v:99:y:2008:i:6:p:1302-1331

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047-259X(07)00115-7
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Joseph G. Ibrahim & Ming-Hui Chen & Stuart R. Lipsitz & Amy H. Herring, 2005. "Missing-Data Methods for Generalized Linear Models: A Comparative Review," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 332-346, March.
    2. Amy L. Stubbendick & Joseph G. Ibrahim, 2003. "Maximum Likelihood Methods for Nonignorable Missing Responses and Covariates in Random Effects Models," Biometrics, The International Biometric Society, vol. 59(4), pages 1140-1150, December.
    3. Chen M-H. & Ibrahim J.G. & Shao Q-M., 2004. "Propriety of the Posterior Distribution and Existence of the MLE for Regression Models With Covariates Missing at Random," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 421-438, January.
    4. Joseph G. Ibrahim & Ming-Hui Chen & Stuart R. Lipsitz, 1999. "Monte Carlo EM for Missing Covariates in Parametric Regression Models," Biometrics, The International Biometric Society, vol. 55(2), pages 591-596, June.
    5. J. G. Ibrahim & S. R. Lipsitz & M.‐H. Chen, 1999. "Missing covariates in generalized linear models when the missing data mechanism is non‐ignorable," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(1), pages 173-190.
    6. Gong Tang, 2003. "Analysis of multivariate missing data with nonignorable nonresponse," Biometrika, Biometrika Trust, vol. 90(4), pages 747-764, December.
    7. Qingxia Chen & Joseph G. Ibrahim, 2006. "Semiparametric Models for Missing Covariate and Response Data in Regression Models," Biometrics, The International Biometric Society, vol. 62(1), pages 177-184, March.
    8. Herring A. H & Ibrahim J. G, 2001. "Likelihood-Based Methods for Missing Covariates in the Cox Proportional Hazards Model," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 292-302, March.

    Citations



    Cited by:

    1. Ana M. Bianco & Paula M. Spano, 2019. "Robust inference for nonlinear regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(2), pages 369-398, June.
    2. Nian-Sheng Tang & Pu-Ying Zhao, 2013. "Empirical likelihood semiparametric nonlinear regression analysis for longitudinal data with responses missing at random," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 65(4), pages 639-665, August.
    3. Ana M. Bianco & Graciela Boente & Wenceslao González-Manteiga & Ana Pérez-González, 2019. "Plug-in marginal estimation under a general regression model with missing responses and covariates," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 106-146, March.
    4. Bian, Yuan & Yi, Grace Y. & He, Wenqing, 2024. "A unified framework of analyzing missing data and variable selection using regularized likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    5. Bindele, Huybrechts F. & Nguelifack, Brice M., 2019. "Generalized signed-rank estimation for regression models with non-ignorable missing responses," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 14-33.
    6. Nanhua Zhang & Roderick J. Little, 2012. "A Pseudo-Bayesian Shrinkage Approach to Regression with Missing Covariates," Biometrics, The International Biometric Society, vol. 68(3), pages 933-942, September.
    7. Fang, Fang & Shao, Jun, 2016. "Iterated imputation estimation for generalized linear models with missing response and covariate values," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 111-123.
    8. Joseph Ibrahim & Geert Molenberghs, 2009. "Missing data methods in longitudinal studies: a review," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 18(1), pages 1-43, May.
    9. Baojiang Chen & Xiao-Hua Zhou, 2011. "Doubly Robust Estimates for Binary Longitudinal Data Analysis with Missing Response and Missing Covariates," Biometrics, The International Biometric Society, vol. 67(3), pages 830-842, September.
    10. Yang, Miao & Das, Kalyan & Majumdar, Anandamayee, 2016. "Analysis of bivariate zero inflated count data with missing responses," Journal of Multivariate Analysis, Elsevier, vol. 148(C), pages 73-82.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bindele, Huybrechts F. & Nguelifack, Brice M., 2019. "Generalized signed-rank estimation for regression models with non-ignorable missing responses," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 14-33.
    2. Fang, Fang & Shao, Jun, 2016. "Iterated imputation estimation for generalized linear models with missing response and covariate values," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 111-123.
    3. Lan Huang & Ming-Hui Chen & Joseph G. Ibrahim, 2005. "Bayesian Analysis for Generalized Linear Models with Nonignorably Missing Covariates," Biometrics, The International Biometric Society, vol. 61(3), pages 767-780, September.
    4. Nanhua Zhang & Roderick J. Little, 2012. "A Pseudo-Bayesian Shrinkage Approach to Regression with Missing Covariates," Biometrics, The International Biometric Society, vol. 68(3), pages 933-942, September.
    5. Chen, Xue-Dong & Fu, Ying-Zi, 2011. "Model selection for zero-inflated regression with missing covariates," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 765-773, January.
    6. Hongtu Zhu & Joseph G. Ibrahim & Xiaoyan Shi, 2009. "Diagnostic Measures for Generalized Linear Models with Missing Covariates," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 36(4), pages 686-712, December.
    7. Chen, Ming-Hui & Ibrahim, Joseph G. & Shao, Qi-Man, 2009. "Maximum likelihood inference for the Cox regression model with applications to missing covariates," Journal of Multivariate Analysis, Elsevier, vol. 100(9), pages 2018-2030, October.
    8. Zhang, Jing & Wang, Qihua & Kang, Jian, 2020. "Feature screening under missing indicator imputation with non-ignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 149(C).
    9. Gerda Claeskens & Fabrizio Consentino, 2008. "Variable Selection with Incomplete Covariate Data," Biometrics, The International Biometric Society, vol. 64(4), pages 1062-1069, December.
    10. Jiang, Depeng & Zhao, Puying & Tang, Niansheng, 2016. "A propensity score adjustment method for regression models with nonignorable missing covariates," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 98-119.
    11. Jiang, Wei & Josse, Julie & Lavielle, Marc, 2020. "Logistic regression with missing covariates—Parameter estimation, model selection and prediction within a joint-modeling framework," Computational Statistics & Data Analysis, Elsevier, vol. 145(C).
    12. Kano, Yutaka & Takai, Keiji, 2011. "Analysis of NMAR missing data without specifying missing-data mechanisms in a linear latent variate model," Journal of Multivariate Analysis, Elsevier, vol. 102(9), pages 1241-1255, October.
    13. Samiran Sinha & Krishna K. Saha & Suojin Wang, 2014. "Semiparametric approach for non-monotone missing covariates in a parametric regression model," Biometrics, The International Biometric Society, vol. 70(2), pages 299-311, June.
    14. Breunig, Christoph, 2017. "Testing missing at random using instrumental variables," SFB 649 Discussion Papers 2017-007, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
    15. Jiwei Zhao & Jun Shao, 2015. "Semiparametric Pseudo-Likelihoods in Generalized Linear Models With Nonignorable Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1577-1590, December.
    16. Lee, Min Cherng & Mitra, Robin, 2016. "Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models," Computational Statistics & Data Analysis, Elsevier, vol. 95(C), pages 24-38.
    17. Wang, Lei & Zhao, Puying & Shao, Jun, 2021. "Dimension-reduced semiparametric estimation of distribution functions and quantiles with nonignorable nonresponse," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    18. Puying Zhao & Hui Zhao & Niansheng Tang & Zhaohai Li, 2017. "Weighted composite quantile regression analysis for nonignorable missing data using nonresponse instrument," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(2), pages 189-212, April.
    19. Richard M. Golden & Steven S. Henley & Halbert White & T. Michael Kashner, 2019. "Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data," Econometrics, MDPI, vol. 7(3), pages 1-27, September.
    20. Regier Michael D. & Moodie Erica E. M., 2016. "The Orthogonally Partitioned EM Algorithm: Extending the EM Algorithm for Algorithmic Stability and Bias Correction Due to Imperfect Data," The International Journal of Biostatistics, De Gruyter, vol. 12(1), pages 65-77, May.
