IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i22p4576-d1276270.html

Variable Selection for Length-Biased and Interval-Censored Failure Time Data

Authors

  • Fan Feng

    (School of Mathematics, Jilin University, Changchun 130012, China)

  • Guanghui Cheng

    (Guangzhou Institute of International Finance, Guangzhou University, Guangzhou 510006, China)

  • Jianguo Sun

    (Department of Statistics, University of Missouri, Columbia, MO 65211, USA)

Abstract

Length-biased failure time data occur often in biomedical fields, including clinical trials, epidemiological cohort studies and genome-wide association studies, and their analysis has attracted a surge of interest. In practice, because one may collect a large number of candidate covariates for the failure event of interest, variable selection is a useful tool for identifying the important risk factors and improving estimation accuracy. In this paper, we consider Cox’s proportional hazards model and develop a penalized variable selection technique, with various popular penalty functions, for length-biased data in which the failure event of interest is subject to interval censoring. Specifically, we develop a computationally stable and reliable penalized expectation-maximization algorithm, based on two-stage data augmentation, to overcome the difficulty of maximizing the intractable penalized likelihood. We establish the oracle property of the proposed method and present simulation results suggesting that it outperforms the traditional variable selection method based on the conditional likelihood. The proposed method is then applied to a set of real data arising from the Prostate, Lung, Colorectal and Ovarian cancer screening trial. The analysis results show that being African American and having immediate family members with prostate cancer are each associated with a significantly increased risk of developing prostate cancer, while having diabetes is associated with a significantly lower risk.
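To make the idea of a penalized likelihood concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the penalties it uses, so LASSO and SCAD (the latter from Fan and Li, 2001, in the reference list below) are chosen as common examples. The function names, the default a = 3.7, and the scaling of the penalty by the sample size n are all assumptions, and the log-likelihood value is simply passed in as a number rather than computed from length-biased, interval-censored data.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: lam * |beta|."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001); a = 3.7 is their suggested default."""
    b = abs(beta)
    if b <= lam:
        # Behaves like LASSO near zero, which is what produces sparsity.
        return lam * b
    if b <= a * lam:
        # Quadratic transition region between the linear and constant parts.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def penalized_objective(loglik, betas, lam, penalty, n):
    """Hypothetical helper: log-likelihood minus n times the summed penalty.

    `loglik` is a placeholder for the log-likelihood value at `betas`;
    the n-scaling is an assumed convention, not taken from the paper.
    """
    return loglik - n * sum(penalty(b, lam) for b in betas)


if __name__ == "__main__":
    betas = [0.0, 0.4, 2.5]
    for name, pen in [("lasso", lasso_penalty), ("scad", scad_penalty)]:
        print(name, penalized_objective(-120.0, betas, 0.5, pen, n=100))
```

The contrast to notice is that the LASSO penalty keeps growing with the size of a coefficient, whereas SCAD levels off, so large effects are penalized by a fixed amount. That property is one of the usual ingredients in oracle-type results such as the one the abstract mentions.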

Suggested Citation

  • Fan Feng & Guanghui Cheng & Jianguo Sun, 2023. "Variable Selection for Length-Biased and Interval-Censored Failure Time Data," Mathematics, MDPI, vol. 11(22), pages 1-20, November.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:22:p:4576-:d:1276270

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/22/4576/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/22/4576/
    Download Restriction: no

    References listed on IDEAS

    1. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    2. Liuquan Sun & Shuwei Li & Lianming Wang & Xinyuan Song & Xuemei Sui, 2022. "Simultaneous variable selection in regression analysis of multivariate interval‐censored data," Biometrics, The International Biometric Society, vol. 78(4), pages 1402-1413, December.
    3. Prabhashi W. Withana Gamage & Christopher S. McMahan & Lianming Wang, 2023. "A flexible parametric approach for analyzing arbitrarily censored data that are potentially subject to left truncation under the proportional hazards model," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(1), pages 188-212, January.
    4. Tianyi Lu & Shuwei Li & Liuquan Sun, 2023. "Combined estimating equation approaches for the additive hazards model with left-truncated and interval-censored data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(3), pages 672-697, July.
    5. Donglin Zeng & Lu Mao & D. Y. Lin, 2016. "Maximum likelihood estimation for semiparametric transformation models with interval-censored data," Biometrika, Biometrika Trust, vol. 103(2), pages 253-271.
    6. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    7. Shuwei Li & Limin Peng, 2023. "Instrumental variable estimation of complier causal treatment effect with interval‐censored data," Biometrics, The International Biometric Society, vol. 79(1), pages 253-263, March.
    8. Pao-sheng Shen & Yingwei Peng & Hsin-Jen Chen & Chyong-Mei Chen, 2022. "Maximum likelihood estimation for length-biased and interval-censored data with a nonsusceptible fraction," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 28(1), pages 68-88, January.
    9. Hao Helen Zhang & Wenbin Lu, 2007. "Adaptive Lasso for Cox's proportional hazards model," Biometrika, Biometrika Trust, vol. 94(3), pages 691-703.
    10. Shen, Yu & Ning, Jing & Qin, Jing, 2009. "Analyzing Length-Biased Data With Semiparametric Transformation and Accelerated Failure Time Models," Journal of the American Statistical Association, American Statistical Association, vol. 104(487), pages 1192-1202.
    11. Lianming Wang & Christopher S. McMahan & Michael G. Hudgens & Zaina P. Qureshi, 2016. "A flexible, computationally efficient method for fitting the proportional hazards model to interval-censored data," Biometrics, The International Biometric Society, vol. 72(1), pages 222-231, March.
    12. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    13. Fei Gao & Kwun Chuen Gary Chan, 2019. "Semiparametric regression analysis of length‐biased interval‐censored data," Biometrics, The International Biometric Society, vol. 75(1), pages 121-132, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Du, Mingyue & Zhao, Xingqiu & Sun, Jianguo, 2022. "Variable selection for case-cohort studies with informatively interval-censored outcomes," Computational Statistics & Data Analysis, Elsevier, vol. 172(C).
    2. Liuquan Sun & Shuwei Li & Lianming Wang & Xinyuan Song & Xuemei Sui, 2022. "Simultaneous variable selection in regression analysis of multivariate interval‐censored data," Biometrics, The International Biometric Society, vol. 78(4), pages 1402-1413, December.
    3. Xu, Yang & Zhao, Shishun & Hu, Tao & Sun, Jianguo, 2021. "Variable selection for generalized odds rate mixture cure models with interval-censored failure time data," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    4. Tang, Linjun & Zhou, Zhangong & Wu, Changchun, 2012. "Weighted composite quantile estimation and variable selection method for censored regression model," Statistics & Probability Letters, Elsevier, vol. 82(3), pages 653-663.
    5. Zhang, Tao & Zhang, Qingzhao & Wang, Qihua, 2014. "Model detection for functional polynomial regression," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 183-197.
    6. Joseph G. Ibrahim & Hongtu Zhu & Ramon I. Garcia & Ruixin Guo, 2011. "Fixed and Random Effects Selection in Mixed Effects Models," Biometrics, The International Biometric Society, vol. 67(2), pages 495-503, June.
    7. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    8. Zhixuan Fu & Shuangge Ma & Haiqun Lin & Chirag R. Parikh & Bingqing Zhou, 2017. "Penalized Variable Selection for Multi-center Competing Risks Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(2), pages 379-405, December.
    9. Guang Cheng & Hao Zhang & Zuofeng Shang, 2015. "Sparse and efficient estimation for partial spline models with increasing dimension," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(1), pages 93-127, February.
    10. Lian, Heng & Li, Jianbo & Hu, Yuao, 2013. "Shrinkage variable selection and estimation in proportional hazards models with additive structure and high dimensionality," Computational Statistics & Data Analysis, Elsevier, vol. 63(C), pages 99-112.
    11. Qu, Lianqiang & Song, Xinyuan & Sun, Liuquan, 2018. "Identification of local sparsity and variable selection for varying coefficient additive hazards models," Computational Statistics & Data Analysis, Elsevier, vol. 125(C), pages 119-135.
    12. Zhao, Sihai Dave & Li, Yi, 2012. "Principled sure independence screening for Cox models with ultra-high-dimensional covariates," Journal of Multivariate Analysis, Elsevier, vol. 105(1), pages 397-411.
    13. Na You & Shun He & Xueqin Wang & Junxian Zhu & Heping Zhang, 2018. "Subtype classification and heterogeneous prognosis model construction in precision medicine," Biometrics, The International Biometric Society, vol. 74(3), pages 814-822, September.
    14. T. Cai & J. Huang & L. Tian, 2009. "Regularized Estimation for the Accelerated Failure Time Model," Biometrics, The International Biometric Society, vol. 65(2), pages 394-404, June.
    15. Heng Lian & Xin Chen & Jian-Yi Yang, 2012. "Identification of Partially Linear Structure in Additive Models with an Application to Gene Expression Prediction from Sequences," Biometrics, The International Biometric Society, vol. 68(2), pages 437-445, June.
    16. Michael R. Wierzbicki & Li-Bing Guo & Qing-Tao Du & Wensheng Guo, 2014. "Sparse Semiparametric Nonlinear Model With Application to Chromatographic Fingerprints," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(508), pages 1339-1349, December.
    17. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    18. Young Joo Yoon & Cheolwoo Park & Erik Hofmeister & Sangwook Kang, 2012. "Group variable selection in cardiopulmonary cerebral resuscitation data for veterinary patients," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(7), pages 1605-1621, January.
    19. Hansheng Wang & Bo Li & Chenlei Leng, 2009. "Shrinkage tuning parameter selection with a diverging number of parameters," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 671-683, June.
    20. Engler David & Li Yi, 2009. "Survival Analysis with High-Dimensional Covariates: An Application in Microarray Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-24, February.
