IDEAS home Printed from https://ideas.repec.org/a/spr/lifeda/v30y2024i3d10.1007_s10985-024-09621-2.html

Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data

Author

Listed:
  • Lola Etievant

    (National Cancer Institute)

  • Mitchell H. Gail

    (National Cancer Institute)

Abstract

The case-cohort design obtains complete covariate data only on cases and on a random sample (the subcohort) of the entire cohort. Subsequent publications described the use of stratification and weight calibration to increase efficiency of estimates of Cox model log-relative hazards, and there has been some work estimating pure risk. Yet there are few examples of these options in the medical literature, and we could not find programs currently online to analyze these various options. We therefore present a unified approach and R software to facilitate such analyses. We used influence functions adapted to the various design and analysis options together with variance calculations that take the two-phase sampling into account. This work clarifies when the widely used “robust” variance estimate of Barlow (Biometrics 50:1064–1072, 1994) is appropriate. The corresponding R software, CaseCohortCoxSurvival, facilitates analysis with and without stratification and/or weight calibration, for subcohort sampling with or without replacement. We also allow for phase-two data to be missing at random for stratified designs. We provide inference not only for log-relative hazards in the Cox model, but also for cumulative baseline hazards and covariate-specific pure risks. We hope these calculations and software will promote wider use of more efficient and principled design and analysis options for case-cohort studies.
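The sampling scheme described in the first sentence of the abstract can be illustrated with a small sketch. It is not taken from the article or from the CaseCohortCoxSurvival package: the function name, the toy cohort, and the subcohort sizes are all hypothetical. It assumes one common weighting convention, in which every case gets weight 1 and each non-case in the subcohort gets the inverse of its stratum's sampling fraction, with the subcohort drawn without replacement within strata.

```python
import random

def stratified_case_cohort(cohort, m_per_stratum, seed=0):
    """Illustrative only (hypothetical helper, not the package API).

    cohort: list of dicts with keys 'id', 'stratum', 'case' (bool).
    m_per_stratum: number of subcohort members to draw in each stratum.
    Returns the subcohort ids and a dict id -> design weight for every
    subject in the phase-two sample (all cases plus the subcohort).
    """
    rng = random.Random(seed)

    # Group cohort members by stratum.
    by_stratum = {}
    for person in cohort:
        by_stratum.setdefault(person["stratum"], []).append(person)

    subcohort_ids = set()
    inverse_fraction = {}
    for stratum, members in by_stratum.items():
        m = m_per_stratum[stratum]
        # Sample the subcohort without replacement, ignoring case status.
        for person in rng.sample(members, m):
            subcohort_ids.add(person["id"])
        inverse_fraction[stratum] = len(members) / m

    # Assumed convention: cases get weight 1, subcohort non-cases get
    # the inverse sampling fraction of their stratum.
    weights = {}
    for person in cohort:
        if person["case"]:
            weights[person["id"]] = 1.0
        elif person["id"] in subcohort_ids:
            weights[person["id"]] = inverse_fraction[person["stratum"]]
    return subcohort_ids, weights

# Toy cohort: 120 subjects in stratum "A", 80 in stratum "B",
# with every tenth subject a case.
cohort = [
    {"id": i, "stratum": "A" if i < 120 else "B", "case": i % 10 == 0}
    for i in range(200)
]
subcohort_ids, weights = stratified_case_cohort(cohort, {"A": 30, "B": 16})
```

Under these assumptions, covariates would need to be measured only on the subjects that appear in `weights`, which is the source of the design's cost savings. Weight calibration, as discussed in the abstract, would then adjust these design weights using information available on the whole cohort.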

Suggested Citation

  • Lola Etievant & Mitchell H. Gail, 2024. "Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 30(3), pages 572-599, July.
  • Handle: RePEc:spr:lifeda:v:30:y:2024:i:3:d:10.1007_s10985-024-09621-2
    DOI: 10.1007/s10985-024-09621-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10985-024-09621-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10985-024-09621-2?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Ruth H. Keogh & Shaun R. Seaman & Jonathan W. Bartlett & Angela M. Wood, 2018. "Multiple imputation of missing data in nested case‐control and case‐cohort studies," Biometrics, The International Biometric Society, vol. 74(4), pages 1438-1449, December.
    2. Yei Eun Shin & Ruth M. Pfeiffer & Barry I. Graubard & Mitchell H. Gail, 2020. "Weight calibration to improve the efficiency of pure risk estimates from case‐control samples nested in a cohort," Biometrics, The International Biometric Society, vol. 76(4), pages 1087-1097, December.
    3. Yayun Xu & Soyoung Kim & Mei-Jie Zhang & David Couper & Kwang Woo Ahn, 2022. "Competing risks regression models with covariates-adjusted censoring weight under the generalized case-cohort design," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 28(2), pages 241-262, April.
    4. Mark, Steven D. & Katki, Hormuzd A., 2006. "Specifying and Implementing Nonparametric and Semiparametric Survival Estimators in Two-Stage (Nested) Cohort Studies With Missing Case Data," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 460-471, June.
    5. Thomas Lumley & Pamela A. Shaw & James Y. Dai, 2011. "Connections between Survey Calibration Estimators and Semiparametric Models for Incomplete Data," International Statistical Review, International Statistical Institute, vol. 79(2), pages 200-220, August.
    6. Stephen J Sharp & Manon Poulaliou & Simon G Thompson & Ian R White & Angela M Wood, 2014. "A Review of Published Analyses of Case-Cohort Studies and Recommendations for Future Reporting," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-6, June.
    7. Soyoung Kim & Donglin Zeng & Jianwen Cai, 2018. "Analysis of multiple survival events in generalized case‐cohort designs," Biometrics, The International Biometric Society, vol. 74(4), pages 1250-1260, December.
    8. Langholz, Bryan & Jiao, Jenny, 2007. "Computational methods for case-cohort studies," Computational Statistics & Data Analysis, Elsevier, vol. 51(8), pages 3737-3748, May.
    9. Jieli Ding & Tsui-Shan Lu & Jianwen Cai & Haibo Zhou, 2017. "Recent progresses in outcome-dependent sampling with failure time data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(1), pages 57-82, January.
    10. Barry I. Graubard & Thomas R. Fears, 2005. "Standard Errors for Attributable Risk for Simple and Complex Sample Designs," Biometrics, The International Biometric Society, vol. 61(3), pages 847-855, September.
    11. Kani Chen, 2001. "Generalized case–cohort sampling," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 63(4), pages 791-809.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Erik T. Parner & Per K. Andersen & Morten Overgaard, 2020. "Cumulative risk regression in case–cohort studies using pseudo-observations," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 26(4), pages 639-658, October.
    2. Hui Zhang & Douglas E. Schaubel & John D. Kalbfleisch, 2011. "Proportional Hazards Regression for the Analysis of Clustered Survival Data from Case–Cohort Studies," Biometrics, The International Biometric Society, vol. 67(1), pages 18-28, March.
    3. Qingning Zhou & Jianwen Cai & Haibo Zhou, 2018. "Outcome‐dependent sampling with interval‐censored failure time data," Biometrics, The International Biometric Society, vol. 74(1), pages 58-67, March.
    4. Mingzhe Wu & Ming Zheng & Wen Yu & Ruofan Wu, 2018. "Estimation and variable selection for semiparametric transformation models under a more efficient cohort sampling design," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(3), pages 570-596, September.
    5. Yanqing Sun & Xiyuan Qian & Qiong Shou & Peter B. Gilbert, 2017. "Analysis of two-phase sampling data with semiparametric additive hazards models," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(3), pages 377-399, July.
    6. Soyoung Kim & Yayun Xu & Mei‐Jie Zhang & Kwang‐Woo Ahn, 2020. "Stratified proportional subdistribution hazards model with covariate‐adjusted censoring weight for case‐cohort studies," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1222-1242, December.
    7. Brady Ryan & Ananthika Nirmalkanna & Candemir Cigsar & Yildiz E. Yilmaz, 2023. "Evaluation of Designs and Estimation Methods Under Response-Dependent Two-Phase Sampling for Genetic Association Studies," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(2), pages 510-539, July.
    8. Chixiang Chen & Ming Wang & Shuo Chen, 2023. "An efficient data integration scheme for synthesizing information from multiple secondary datasets for the parameter inference of the main analysis," Biometrics, The International Biometric Society, vol. 79(4), pages 2947-2960, December.
    9. Yei Eun Shin & Ruth M. Pfeiffer & Barry I. Graubard & Mitchell H. Gail, 2022. "Weight calibration to improve efficiency for estimating pure risks from the additive hazards model with the nested case‐control design," Biometrics, The International Biometric Society, vol. 78(1), pages 179-191, March.
    10. Bryan E. Shepherd & Kyunghee Han & Tong Chen & Aihua Bian & Shannon Pugh & Stephany N. Duda & Thomas Lumley & William J. Heerman & Pamela A. Shaw, 2023. "Multiwave validation sampling for error‐prone electronic health records," Biometrics, The International Biometric Society, vol. 79(3), pages 2649-2663, September.
    11. Peisong Han, 2016. "Combining Inverse Probability Weighting and Multiple Imputation to Improve Robustness of Estimation," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(1), pages 246-260, March.
    12. Zheng, Ming & Zhao, Ziqiang & Yu, Wen, 2013. "Quantile regression analysis of case-cohort data," Journal of Multivariate Analysis, Elsevier, vol. 122(C), pages 20-34.
    13. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    14. Wei Wang & Shou‐En Lu & Jerry Q. Cheng & Minge Xie & John B. Kostis, 2022. "Multivariate survival analysis in big data: A divide‐and‐combine approach," Biometrics, The International Biometric Society, vol. 78(3), pages 852-866, September.
    15. Yei Eun Shin & Takumi Saegusa, 2024. "Nested case–control sampling without replacement," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 30(4), pages 776-799, October.
    16. Yei Eun Shin & Ruth M. Pfeiffer & Barry I. Graubard & Mitchell H. Gail, 2020. "Weight calibration to improve the efficiency of pure risk estimates from case‐control samples nested in a cohort," Biometrics, The International Biometric Society, vol. 76(4), pages 1087-1097, December.
    17. Xiaofei Bai & Anastasios A. Tsiatis & Sean M. O'Brien, 2013. "Doubly-Robust Estimators of Treatment-Specific Survival Distributions in Observational Studies with Stratified Sampling," Biometrics, The International Biometric Society, vol. 69(4), pages 830-839, December.
    18. Jie-Huei Wang & Chun-Hao Pan & I-Shou Chang & Chao Agnes Hsiung, 2020. "Penalized full likelihood approach to variable selection for Cox’s regression model under nested case–control sampling," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 26(2), pages 292-314, April.
    19. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Conditional screening for ultrahigh-dimensional survival data in case-cohort studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(4), pages 632-661, October.
    20. Sangwook Kang & Jianwen Cai, 2009. "Marginal Hazards Regression for Retrospective Studies within Cohort with Possibly Correlated Failure Time Data," Biometrics, The International Biometric Society, vol. 65(2), pages 405-414, June.
