Survival prediction based on compound covariate under Cox proportional hazard models

Author

Listed:
  • Emura, Takeshi
  • Chen, Yi-Hau
  • Chen, Hsuan-Yu

Abstract

Survival prediction from a large number of covariates is a current focus of statistical and medical research. In this paper, we study a methodology known as compound covariate prediction, performed under univariate Cox proportional hazard models. We demonstrate via simulations and real data analysis that the compound covariate method generally competes well with ridge regression and Lasso, both well-studied methods for predicting survival outcomes with a large number of covariates. Furthermore, we develop a refinement of the compound covariate method by incorporating likelihood information from multivariate Cox models. The new proposal is an adaptive method that borrows information contained in both the univariate and multivariate Cox regression estimators. We show that the new proposal is theoretically justified by statistical large-sample theory and is naturally interpreted as a shrinkage-type estimator, a popular class of estimators in the statistical literature. Two datasets, the primary biliary cirrhosis of the liver data and the non-small-cell lung cancer data, are used for illustration. The proposed method is implemented in the R package “compound.Cox”, available on CRAN at http://cran.r-project.org/.
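
The basic idea of compound covariate prediction described in the abstract can be sketched in a few lines: fit one univariate Cox model per covariate, then score each subject by the sum of its covariates weighted by those univariate estimates. The Python sketch below is illustrative only and is not the compound.Cox implementation. All function names are hypothetical, ties are handled with the Breslow approximation as an assumption, the optimizer is a plain Newton-Raphson iteration without step halving, and the data are simulated for demonstration. The refined shrinkage-type estimator from the paper is not covered here.

```python
import math
import random


def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-8):
    """Estimate one Cox coefficient by Newton-Raphson on the partial likelihood.

    Ties are handled with the Breslow approximation (an assumption of this sketch).
    """
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0  # first derivative of the log partial likelihood
        info = 0.0   # negative second derivative
        for i in range(len(times)):
            if not events[i]:
                continue  # censored subjects contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for k in range(len(times)):
                if times[k] >= times[i]:  # subject k is still at risk
                    w = math.exp(beta * x[k])
                    s0 += w
                    s1 += w * x[k]
                    s2 += w * x[k] * x[k]
            mean = s1 / s0
            score += x[i] - mean
            info += s2 / s0 - mean * mean
        if info <= 0.0:
            break  # no information (for example, a constant covariate)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


def compound_covariate_fit(times, events, rows):
    """Fit one univariate Cox model per covariate column."""
    p = len(rows[0])
    return [fit_univariate_cox(times, events, [r[j] for r in rows])
            for j in range(p)]


def compound_covariate(rows, betas):
    """Score each subject by the sum of beta_j * x_ij."""
    return [sum(b * v for b, v in zip(betas, row)) for row in rows]


def simulate(n, true_betas, seed=1):
    """Toy data: exponential survival times with independent censoring."""
    rng = random.Random(seed)
    times, events, rows = [], [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in true_betas]
        rate = math.exp(sum(b * v for b, v in zip(true_betas, row)))
        t = rng.expovariate(rate)   # latent survival time
        c = rng.expovariate(0.3)    # latent censoring time
        times.append(min(t, c))
        events.append(t <= c)
        rows.append(row)
    return times, events, rows


if __name__ == "__main__":
    times, events, rows = simulate(200, [1.0, -1.0, 0.0])
    betas = compound_covariate_fit(times, events, rows)
    scores = compound_covariate(rows, betas)
    print("univariate estimates:", [round(b, 3) for b in betas])
    print("first three compound covariates:", [round(s, 3) for s in scores[:3]])
```

A larger compound covariate indicates a higher predicted hazard, so subjects can be ranked or split into risk groups by this score. Because each coefficient comes from a separate one-covariate fit, the procedure remains feasible when the number of covariates exceeds the sample size, where a single multivariate Cox fit would not be.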

Suggested Citation

  • Emura, Takeshi & Chen, Yi-Hau & Chen, Hsuan-Yu, 2012. "Survival prediction based on compound covariate under cox proportional hazard models," MPRA Paper 41149, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:41149

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/41149/1/MPRA_paper_41149.pdf
    File Function: original version
    Download Restriction: no

    References listed on IDEAS

    1. van Wieringen, Wessel N. & Kun, David & Hampel, Regina & Boulesteix, Anne-Laure, 2009. "Survival prediction using gene expression data: A review and comparison," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1590-1603, March.
    2. Xi Zhao & Einar Andreas Rødland & Therese Sørlie & Bjørn Naume & Anita Langerød & Arnoldo Frigessi & Vessela N Kristensen & Anne-Lise Børresen-Dale & Ole Christian Lingjærde, 2011. "Combining Gene Signatures Improves Prediction of Breast Cancer Survival," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-15, March.
    3. Tibshirani Robert J., 2009. "Univariate Shrinkage in the Cox Model for High Dimensional Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-18, April.

    Citations


    Cited by:

    1. Emura, Takeshi & Kao, Fan-Hsuan & Michimae, Hirofumi, 2014. "An improved nonparametric estimator of sub-distribution function for bivariate competing risk models," Journal of Multivariate Analysis, Elsevier, vol. 132(C), pages 229-241.
    2. Ahmed A. Ewees & Mohammed A. A. Al-qaness & Laith Abualigah & Diego Oliva & Zakariya Yahya Algamal & Ahmed M. Anter & Rehab Ali Ibrahim & Rania M. Ghoniem & Mohamed Abd Elaziz, 2021. "Boosting Arithmetic Optimization Algorithm with Genetic Algorithm Operators for Feature Selection: Case Study on Cox Proportional Hazards Model," Mathematics, MDPI, vol. 9(18), pages 1-22, September.
    3. Emura, Takeshi & Chen, Yi-Hau, 2014. "Gene selection for survival data under dependent censoring: a copula-based approach," MPRA Paper 58043, University Library of Munich, Germany.
    4. Jialiang Li & Tonghui Yu & Jing Lv & Mei‐Ling Ting Lee, 2021. "Semiparametric model averaging prediction for lifetime data via hazards regression," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(5), pages 1187-1209, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lee Kyu Ha & Chakraborty Sounak & Sun Jianguo, 2011. "Bayesian Variable Selection in Semiparametric Proportional Hazards Model for High Dimensional Survival Data," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-32, April.
    2. Yanfeng Wang & Haohao Wang & Sanyi Li & Lidong Wang, 2022. "Survival Risk Prediction of Esophageal Cancer Based on the Kohonen Network Clustering Algorithm and Kernel Extreme Learning Machine," Mathematics, MDPI, vol. 10(9), pages 1-20, April.
    3. Luke Kumar & Russell Greiner, 2019. "Gene expression based survival prediction for cancer patients—A topic modeling approach," PLOS ONE, Public Library of Science, vol. 14(11), pages 1-30, November.
    4. Zhao, Sihai Dave & Li, Yi, 2012. "Principled sure independence screening for Cox models with ultra-high-dimensional covariates," Journal of Multivariate Analysis, Elsevier, vol. 105(1), pages 397-411.
    5. Hyokyoung G. Hong & Xuerong Chen & David C. Christiani & Yi Li, 2018. "Integrated powered density: Screening ultrahigh dimensional covariates with survival outcomes," Biometrics, The International Biometric Society, vol. 74(2), pages 421-429, June.
    6. Stefanie Hieke & Axel Benner & Richard F Schlenk & Martin Schumacher & Lars Bullinger & Harald Binder, 2016. "Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-18, May.
    7. Yu Takagi & Hirokazu Matsuda & Yukio Taniguchi & Hiroaki Iwaisaki, 2014. "Predicting the Phenotypic Values of Physiological Traits Using SNP Genotype and Gene Expression Data in Mice," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-17, December.
    8. Hapfelmeier, A. & Ulm, K., 2013. "A new variable selection approach using Random Forests," Computational Statistics & Data Analysis, Elsevier, vol. 60(C), pages 50-69.
    9. Ming Yi & Ruoqing Zhu & Robert M Stephens, 2018. "GradientScanSurv—An exhaustive association test method for gene expression data with censored survival outcome," PLOS ONE, Public Library of Science, vol. 13(12), pages 1-28, December.
    10. Armin Rauschenberger & Iuliana Ciocănea-Teodorescu & Marianne A. Jonker & Renée X. Menezes & Mark A. Wiel, 2020. "Sparse classification with paired covariates," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(3), pages 571-588, September.
    11. Farcomeni, Alessio & Nardi, Alessandra, 2010. "A two-component Weibull mixture to model early and late mortality in a Bayesian framework," Computational Statistics & Data Analysis, Elsevier, vol. 54(2), pages 416-428, February.
    12. Jing Zhang & Guosheng Yin & Yanyan Liu & Yuanshan Wu, 2018. "Censored cumulative residual independent screening for ultrahigh-dimensional survival data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 24(2), pages 273-292, April.
    13. Antoniadis, Anestis & Fryzlewicz, Piotr & Letué, Frédérique, 2010. "The Dantzig selector in Cox's proportional hazards model," LSE Research Online Documents on Economics 30992, London School of Economics and Political Science, LSE Library.
    14. Takeshi Emura & Yi-Hau Chen & Hsuan-Yu Chen, 2012. "Survival Prediction Based on Compound Covariate under Cox Proportional Hazard Models," PLOS ONE, Public Library of Science, vol. 7(10), pages 1-12, October.
    15. Isabella Zwiener & Barbara Frisch & Harald Binder, 2014. "Transforming RNA-Seq Data to Improve the Performance of Prognostic Gene Signatures," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-13, January.
    16. Chakraborty, Sounak & Guo, Ruixin, 2011. "A Bayesian hybrid Huberized support vector machine and its applications in high-dimensional medical data," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1342-1356, March.
    17. Jing Zhang & Yanyan Liu & Hengjian Cui, 2021. "Model-free feature screening via distance correlation for ultrahigh dimensional survival data," Statistical Papers, Springer, vol. 62(6), pages 2711-2738, December.
    18. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Conditional screening for ultrahigh-dimensional survival data in case-cohort studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(4), pages 632-661, October.
    19. Yang Qu & Yu Cheng, 2023. "Volume under the ROC surface for high-dimensional independent screening with ordinal competing risk outcomes," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(4), pages 735-751, October.
    20. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Feature screening for case‐cohort studies with failure time outcome," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(1), pages 349-370, March.

    More about this item

    Keywords

    Cox proportional hazard model; Prediction; Survival analysis;

    JEL classification:

    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C34 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Truncated and Censored Models; Switching Regression Models
    • C24 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Truncated and Censored Models; Switching Regression Models; Threshold Regression Models
    • C4 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics
