
Mimicking Clinical Trials Using Real-World Data: A Novel Method and Applications

Author

Listed:
  • Wei-Jhih Wang

    (The Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA)

  • Aasthaa Bansal

    (The Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA)

  • Caroline Savage Bennette

    (The Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA)

  • Anirban Basu

    (The Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA
    Department of Health Services, University of Washington, Seattle, WA, USA
    Department of Economics, University of Washington, Seattle, WA, USA)

Abstract

Introduction. Simulating individual-level trial data when only summary data are available is often useful for meta-analysis, for forming external control arms, and for calibrating trial results to real-world data (RWD). The joint distribution of baseline characteristics in a trial is usually simulated by combining the trial’s summary data with correlations estimated from RWD. However, RWD correlations may not be a perfect proxy for the trial’s correlations. A misspecified correlation structure could bias any analysis in which the outcome-generating models are nonlinear or include effect modifiers.

Methods. We developed an iterative algorithm using copulas and resampling, based on the estimated propensity score for the likelihood of enrollment in a trial given participants’ characteristics. Validation was performed using Monte Carlo simulations under different scenarios in which the marginal and joint distributions of covariates differ between trial samples and RWD. Two applications were illustrated using an actual trial and the Surveillance, Epidemiology, and End Results–Medicare data. We calculated the standardized mean difference (SMD) to assess the generalizability of the trial, and we explored the feasibility of generating an external control by applying a parametric Weibull model trained in RWD to predict survival in the simulated trial cohort.

Results. Across all scenarios, the approximated correlations derived from the algorithm were closer to the true correlations than the RWD correlations were. The algorithm also successfully reproduced the joint distribution of characteristics for the actual trial. The SMD obtained with simulated data was similar to that obtained with individual-level trial data. The 95% confidence intervals of the adjusted survival estimates from the simulated trial overlapped with those of the Kaplan-Meier estimates from the actual trial.

Conclusions. The algorithm could be a feasible way to simulate individual-level data when only summary data are available. Further research is needed to validate our approach with larger sample sizes.

Highlights

  • The correlation structure is crucial to building the joint distribution of patient characteristics, and a misspecified correlation structure could potentially influence predicted outcomes.

  • An iterative algorithm was developed to approximate a trial’s correlation structure using published summary trial data and real-world data.

  • The algorithm could be a feasible way to simulate individual-level trial data when only trial summary data are available.
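To make the underlying idea concrete, the following is a minimal sketch of the conventional starting point the abstract describes: drawing a cohort from published marginal summaries plus an assumed correlation, using a two-variable Gaussian copula, and then comparing cohorts with a standardized mean difference. It is not the authors' iterative propensity-score algorithm, which the abstract does not specify in detail. The function names (`simulate_cohort`, `smd`), the choice of covariates (age and sex), and all numeric values are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_cohort(n, age_mean, age_sd, p_female, rho, seed=0):
    """Draw n (age, female) pairs from a two-variable Gaussian copula.

    age_mean, age_sd and p_female stand in for published summary data.
    rho is the correlation of the latent normals; here it is an assumed
    input (for example, one borrowed from real-world data).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    age_dist = NormalDist(age_mean, age_sd)
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Mix in an independent draw so that corr(z1, z2) equals rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map the latent normals to uniforms (the copula step).
        u1 = std_normal.cdf(z1)
        u2 = std_normal.cdf(z2)
        # inv_cdf needs a probability strictly between 0 and 1.
        u1 = min(max(u1, 1e-12), 1.0 - 1e-12)
        # Push the uniforms through each marginal's inverse CDF.
        age = age_dist.inv_cdf(u1)
        female = 1 if u2 > 1.0 - p_female else 0
        rows.append((age, female))
    return rows


def smd(x_a, x_b):
    """Standardized mean difference with the pooled SD of both groups."""
    pooled_sd = math.sqrt((variance(x_a) + variance(x_b)) / 2.0)
    return (mean(x_a) - mean(x_b)) / pooled_sd


# Hypothetical summaries for a "trial" and a "real-world" cohort.
trial = simulate_cohort(2000, age_mean=62.0, age_sd=8.0,
                        p_female=0.40, rho=0.3, seed=1)
rwd = simulate_cohort(2000, age_mean=70.0, age_sd=10.0,
                      p_female=0.50, rho=0.3, seed=2)
age_smd = smd([age for age, _ in trial], [age for age, _ in rwd])
print(f"SMD for age, trial vs. RWD: {age_smd:.2f}")
```

In this sketch the correlation is fixed in advance, which is exactly the assumption the abstract identifies as a potential source of bias; the published method instead approximates the trial's correlation structure iteratively.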

Suggested Citation

  • Wei-Jhih Wang & Aasthaa Bansal & Caroline Savage Bennette & Anirban Basu, 2023. "Mimicking Clinical Trials Using Real-World Data: A Novel Method and Applications," Medical Decision Making, vol. 43(3), pages 275-287, April.
  • Handle: RePEc:sae:medema:v:43:y:2023:i:3:p:275-287
    DOI: 10.1177/0272989X221141381

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/0272989X221141381
    Download Restriction: no



