Printed from https://ideas.repec.org/a/bla/jorssa/v185y2022i4p2156-2178.html

Optimising precision and power by machine learning in randomised trials with ordinal and time‐to‐event outcomes with an application to COVID‐19

Author

Listed:
  • Nicholas Williams
  • Michael Rosenblum
  • Iván Díaz

Abstract

The rapid finding of effective therapeutics requires efficient use of available resources in clinical trials. Covariate adjustment can yield statistical estimates with improved precision, resulting in a reduction in the number of participants required to draw futility or efficacy conclusions. We focus on time‐to‐event and ordinal outcomes. When more than a few baseline covariates are available, a key question for covariate adjustment in randomised studies is how to fit a model relating the outcome and the baseline covariates to maximise precision. We present a novel theoretical result establishing conditions for asymptotic normality of a variety of covariate‐adjusted estimators that rely on machine learning (e.g., ℓ1‐regularisation, Random Forests, XGBoost, and Multivariate Adaptive Regression Splines [MARS]), under the assumption that outcome data are missing completely at random. We further present a consistent estimator of the asymptotic variance. Importantly, the conditions do not require the machine learning methods to converge to the true outcome distribution conditional on baseline variables, as long as they converge to some (possibly incorrect) limit. We conducted a simulation study to evaluate the performance of the aforementioned prediction methods in COVID‐19 trials. Our simulation is based on resampling longitudinal data from over 1500 patients hospitalised with COVID‐19 at Weill Cornell Medicine New York Presbyterian Hospital. We found that using ℓ1‐regularisation led to estimators and corresponding hypothesis tests that control type 1 error and are more precise than an unadjusted estimator across all sample sizes tested. We also show that when covariates are not prognostic of the outcome, ℓ1‐regularisation remains as precise as the unadjusted estimator, even at small sample sizes (n = 100).
We give an R package adjrct that performs model‐robust covariate adjustment for ordinal and time‐to‐event outcomes.
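
The precision gain that the abstract attributes to covariate adjustment can be illustrated with a small simulation. The sketch below is not the authors' estimator and does not use the adjrct package: it assumes a continuous outcome rather than an ordinal or time‐to‐event one, a single prognostic baseline covariate, 1:1 randomisation, and a simple per‐arm least‐squares fit standing in for the machine learning methods. All function names (fit_line, unadjusted, adjusted, simulate) and all simulation settings are hypothetical choices made for illustration. It compares the plain difference in arm means with an augmented estimator that uses the known randomisation probability, so it stays consistent even if the working outcome model is wrong.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares fit of y ~ intercept + slope * x for a single covariate."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def unadjusted(a, y):
    """Difference in arm means, ignoring the baseline covariate."""
    treated = [yi for ai, yi in zip(a, y) if ai == 1]
    control = [yi for ai, yi in zip(a, y) if ai == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def adjusted(a, x, y, p=0.5):
    """Augmented estimator using the known randomisation probability p.

    The per-arm linear fits are only working models (an assumption of this
    sketch); the residual correction terms keep the estimator consistent
    even when those fits are misspecified.
    """
    fits = {}
    for arm in (0, 1):
        xs = [xi for ai, xi in zip(a, x) if ai == arm]
        ys = [yi for ai, yi in zip(a, y) if ai == arm]
        fits[arm] = fit_line(xs, ys)
    total = 0.0
    for ai, xi, yi in zip(a, x, y):
        m1 = fits[1][0] + fits[1][1] * xi  # predicted outcome under treatment
        m0 = fits[0][0] + fits[0][1] * xi  # predicted outcome under control
        total += m1 - m0 + ai * (yi - m1) / p - (1 - ai) * (yi - m0) / (1 - p)
    return total / len(y)


def simulate(reps=300, n=200, effect=1.0, seed=1):
    """Repeat a simulated 1:1 trial and summarise both estimators."""
    rng = random.Random(seed)
    unadj, adj = [], []
    for _ in range(reps):
        a = [1] * (n // 2) + [0] * (n - n // 2)
        rng.shuffle(a)  # randomise treatment assignment
        x = [rng.gauss(0, 1) for _ in range(n)]  # prognostic covariate
        y = [effect * ai + 2.0 * xi + rng.gauss(0, 1) for ai, xi in zip(a, x)]
        unadj.append(unadjusted(a, y))
        adj.append(adjusted(a, x, y))
    # Empirical standard deviations of each estimator, and mean of the adjusted one
    return statistics.stdev(unadj), statistics.stdev(adj), statistics.fmean(adj)


if __name__ == "__main__":
    sd_unadj, sd_adj, mean_adj = simulate()
    print(f"SD unadjusted: {sd_unadj:.3f}")
    print(f"SD adjusted:   {sd_adj:.3f}")
    print(f"Mean adjusted estimate: {mean_adj:.3f}")
```

Under these assumed settings the covariate explains most of the outcome variance, so the adjusted estimator should vary noticeably less across simulated trials than the unadjusted one while remaining centred on the simulated effect. With a non‐prognostic covariate the two would be expected to perform similarly.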

Suggested Citation

  • Nicholas Williams & Michael Rosenblum & Iván Díaz, 2022. "Optimising precision and power by machine learning in randomised trials with ordinal and time‐to‐event outcomes with an application to COVID‐19," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 2156-2178, October.
  • Handle: RePEc:bla:jorssa:v:185:y:2022:i:4:p:2156-2178
    DOI: 10.1111/rssa.12915

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12915
    Download Restriction: no


    References listed on IDEAS

    1. D Benkeser & M Carone & M J Van Der Laan & P B Gilbert, 2017. "Doubly robust nonparametric inference on the average treatment effect," Biometrika, Biometrika Trust, vol. 104(4), pages 863-880.
    2. Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    4. Min Zhang & Anastasios A. Tsiatis & Marie Davidian, 2008. "Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates," Biometrics, The International Biometric Society, vol. 64(3), pages 707-715, September.
    5. Gruber Susan & van der Laan Mark J., 2012. "Targeted Minimum Loss Based Estimator that Outperforms a given Estimator," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-22, May.
    6. Yang L. & Tsiatis A. A., 2001. "Efficiency Study of Estimators for a Treatment Effect in a Pretest-Posttest Trial," The American Statistician, American Statistical Association, vol. 55, pages 314-321, November.
    7. Iván Díaz & Elizabeth Colantuoni & Daniel F. Hanley & Michael Rosenblum, 2019. "Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 439-468, July.
    8. Pei-Yun Chen & Anastasios A. Tsiatis, 2001. "Causal Inference on the Difference of the Restricted Mean Lifetime Between Two Groups," Biometrics, The International Biometric Society, vol. 57(4), pages 1030-1038, December.
    9. Oliver Dukes & Stijn Vansteelandt, 2021. "Inference for treatment effect parameters in potentially misspecified high-dimensional models [Approximate residual balancing: Debiased inference of average treatment effects in high dimensions]," Biometrika, Biometrika Trust, vol. 108(2), pages 321-334.
    10. Iván Díaz & Elizabeth Colantuoni & Michael Rosenblum, 2016. "Enhanced precision in the analysis of randomized trials with ordinal outcomes," Biometrics, The International Biometric Society, vol. 72(2), pages 422-431, June.
    11. Rubin Daniel B & van der Laan Mark J., 2008. "Empirical Efficiency Maximization: Improved Locally Efficient Covariate Adjustment in Randomized Experiments and Survival Analysis," The International Journal of Biostatistics, De Gruyter, vol. 4(1), pages 1-42, May.
    12. Layla Parast & Lu Tian & Tianxi Cai, 2014. "Landmark Estimation of Survival and Treatment Effect in a Randomized Clinical Trial," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(505), pages 384-394, March.
    13. Andrea Rotnitzky & Quanhong Lei & Mariela Sued & James M. Robins, 2012. "Improved double-robust estimation in missing data and causal inference models," Biometrika, Biometrika Trust, vol. 99(2), pages 439-456.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. David Benkeser & Iván Díaz & Alex Luedtke & Jodi Segal & Daniel Scharfstein & Michael Rosenblum, 2021. "Improving precision and power in randomized trials for COVID‐19 treatments using covariate adjustment, for binary, ordinal, and time‐to‐event outcomes," Biometrics, The International Biometric Society, vol. 77(4), pages 1467-1481, December.
    2. Iván Díaz & Elizabeth Colantuoni & Daniel F. Hanley & Michael Rosenblum, 2019. "Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 439-468, July.
    3. Layla Parast & Beth Ann Griffin, 2017. "Landmark estimation of survival and treatment effects in observational studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(2), pages 161-182, April.
    4. Wang, Qihua & Su, Miaomiao & Wang, Ruoyu, 2021. "A beyond multiple robust approach for missing response problem," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
    5. Rosenblum Michael & van der Laan Mark J., 2010. "Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-44, April.
    6. Wei Zhang & Zhiwei Zhang & Aiyi Liu, 2023. "Optimizing treatment allocation in randomized clinical trials by leveraging baseline covariates," Biometrics, The International Biometric Society, vol. 79(4), pages 2815-2829, December.
    7. Y Cui & E J Tchetgen Tchetgen, 2024. "Selective machine learning of doubly robust functionals," Biometrika, Biometrika Trust, vol. 111(2), pages 517-535.
    8. Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
    9. Chakravorty, Bhaskar & Arulampalam, Wiji & Bhatiya, Apurav Yash & Imbert, Clément & Rathelot, Roland, 2024. "Can information about jobs improve the effectiveness of vocational training? Experimental evidence from India," Journal of Development Economics, Elsevier, vol. 169(C).
    10. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Jun 2024.
    11. Heejun Shin & Joseph Antonelli, 2023. "Improved inference for doubly robust estimators of heterogeneous treatment effects," Biometrics, The International Biometric Society, vol. 79(4), pages 3140-3152, December.
    12. Huber, Martin & Meier, Jonas & Wallimann, Hannes, 2022. "Business analytics meets artificial intelligence: Assessing the demand effects of discounts on Swiss train tickets," Transportation Research Part B: Methodological, Elsevier, vol. 163(C), pages 22-39.
    13. Su, Miaomiao & Wang, Ruoyu & Wang, Qihua, 2022. "A two-stage optimal subsampling estimation for missing data problems with large-scale data," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
    14. Han, Peisong, 2012. "A note on improving the efficiency of inverse probability weighted estimator using the augmentation term," Statistics & Probability Letters, Elsevier, vol. 82(12), pages 2221-2228.
    15. Helmut Wasserbacher & Martin Spindler, 2024. "Credit Ratings: Heterogeneous Effect on Capital Structure," Papers 2406.18936, arXiv.org.
    16. David Cheng & Ashwin N. Ananthakrishnan & Tianxi Cai, 2021. "Robust and efficient semi‐supervised estimation of average treatment effects with application to electronic health records data," Biometrics, The International Biometric Society, vol. 77(2), pages 413-423, June.
    17. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    18. Peisong Han, 2016. "Combining Inverse Probability Weighting and Multiple Imputation to Improve Robustness of Estimation," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(1), pages 246-260, March.
    19. Ashkan Ertefaie & Nima S. Hejazi & Mark J. van der Laan, 2023. "Nonparametric inverse‐probability‐weighted estimators based on the highly adaptive lasso," Biometrics, The International Biometric Society, vol. 79(2), pages 1029-1041, June.
    20. AmirEmad Ghassami & Andrew Ying & Ilya Shpitser & Eric Tchetgen Tchetgen, 2021. "Minimax Kernel Machine Learning for a Class of Doubly Robust Functionals with Application to Proximal Causal Inference," Papers 2104.02929, arXiv.org, revised Mar 2022.
