
Pseudo-value regression trees

Authors

  • Alina Schenk (Medical Faculty, University of Bonn)
  • Moritz Berger (Medical Faculty, University of Bonn)
  • Matthias Schmid (Medical Faculty, University of Bonn)

Abstract

This paper presents a semi-parametric modeling technique for estimating the survival function from a set of right-censored time-to-event data. Our method, named pseudo-value regression trees (PRT), is based on the pseudo-value regression framework, modeling individual-specific survival probabilities by computing pseudo-values and relating them to a set of covariates. The standard approach to pseudo-value regression is to fit a main-effects model using generalized estimating equations (GEE). PRT extend this approach by building a multivariate regression tree with pseudo-value outcome and by successively fitting a set of regularized additive models to the data in the nodes of the tree. Due to the combination of tree learning and additive modeling, PRT are able to perform variable selection and to identify relevant interactions between the covariates, thereby addressing several limitations of the standard GEE approach. In addition, PRT include time-dependent effects in the node-wise models. Interpretability of the PRT fits is ensured by controlling the tree depth. Based on the results of two simulation studies, we investigate the properties of the PRT method and compare it to several alternative modeling techniques. Furthermore, we illustrate PRT by analyzing survival in 3,652 patients enrolled for a randomized study on primary invasive breast cancer.
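
The pseudo-value step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual jackknife (leave-one-out) construction based on the Kaplan-Meier estimator at a single time point t, namely n·S(t) minus (n-1) times the estimate computed without observation i. The function names `kaplan_meier` and `pseudo_values` are hypothetical and chosen for illustration only; this is not the authors' implementation, and it does not cover the tree-building or regularized additive modeling parts of PRT.

```python
import numpy as np


def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    surv = 1.0
    # Product over the distinct observed event times up to and including t.
    for u in np.unique(times[events & (times <= t)]):
        at_risk = np.sum(times >= u)             # still under observation at u
        n_events = np.sum((times == u) & events)  # events occurring exactly at u
        surv *= 1.0 - n_events / at_risk
    return surv


def pseudo_values(times, events, t):
    """Jackknife pseudo-values for S(t): n * S(t) - (n - 1) * S^(-i)(t)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = len(times)
    full = kaplan_meier(times, events, t)
    keep = ~np.eye(n, dtype=bool)  # row i masks out observation i
    leave_one_out = np.array(
        [kaplan_meier(times[keep[i]], events[keep[i]], t) for i in range(n)]
    )
    return n * full - (n - 1) * leave_one_out


if __name__ == "__main__":
    # Toy data (made up for illustration): five subjects, two censored.
    obs_times = [2, 3, 5, 7, 11]
    obs_events = [1, 0, 1, 0, 1]
    print(pseudo_values(obs_times, obs_events, t=4.0))
```

Without censoring, these pseudo-values reduce (up to floating-point rounding) to the indicator that a subject's event time exceeds t, which is why they can serve as individual-level outcomes in a subsequent regression on the covariates.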

Suggested Citation

  • Alina Schenk & Moritz Berger & Matthias Schmid, 2024. "Pseudo-value regression trees," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 30(2), pages 439-471, April.
  • Handle: RePEc:spr:lifeda:v:30:y:2024:i:2:d:10.1007_s10985-024-09618-x
    DOI: 10.1007/s10985-024-09618-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10985-024-09618-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

