Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations

Author

Listed:

Jenna Wong
(Harvard Medical School & Harvard Pilgrim Health Care Institute)
Daniel Prieto-Alhambra
(NDORMS, University of Oxford & Erasmus University Medical Center)
Peter R. Rijnbeek
(Erasmus University Medical Center)
Rishi J. Desai
(Harvard Medical School)
Jenna M. Reps
(Janssen Research & Development, LLC)
Sengwee Toh
(Harvard Medical School & Harvard Pilgrim Health Care Institute)

Abstract

Increasing availability of electronic health databases capturing real-world experiences with medical products has garnered much interest in their use for pharmacoepidemiologic and pharmacovigilance studies. The traditional practice of having numerous groups use single databases to accomplish similar tasks and address common questions about medical products can be made more efficient through well-coordinated multi-database studies, greatly facilitated through distributed data network (DDN) architectures. Access to larger amounts of electronic health data within DDNs has created a growing interest in using data-adaptive machine learning (ML) techniques that can automatically model complex associations in high-dimensional data with minimal human guidance. However, the siloed storage and diverse nature of the databases in DDNs create unique challenges for using ML. In this paper, we discuss opportunities, challenges, and considerations for applying ML in DDNs for pharmacoepidemiologic and pharmacovigilance studies. We first discuss major types of activities performed by DDNs and how ML may be used. Next, we discuss practical data-related factors influencing how DDNs work in practice. We then combine these discussions and jointly consider how opportunities for ML are affected by practical data-related factors for DDNs, leading to several challenges. We present different approaches for addressing these challenges and highlight efforts that real-world DDNs have taken or are currently taking to help mitigate them. Despite these challenges, the time is ripe for the emerging interest to use ML in DDNs, and the utility of these data-adaptive modeling techniques in pharmacoepidemiologic and pharmacovigilance studies will likely continue to increase in the coming years.

Suggested Citation

Jenna Wong & Daniel Prieto-Alhambra & Peter R. Rijnbeek & Rishi J. Desai & Jenna M. Reps & Sengwee Toh, 2022. "Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations," Drug Safety, Springer, vol. 45(5), pages 493-510, May.

Handle: RePEc:spr:drugsa:v:45:y:2022:i:5:d:10.1007_s40264-022-01158-3
DOI: 10.1007/s40264-022-01158-3

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

S Ariane Christie & Amanda S Conroy & Rachael A Callcut & Alan E Hubbard & Mitchell J Cohen, 2019. "Dynamic multi-outcome prediction after injury: Applying adaptive machine learning for precision medicine in trauma," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-13, April.
Victor Chernozhukov & Whitney K. Newey & Victor Quintas-Martinez & Vasilis Syrgkanis, 2021. "Automatic Debiased Machine Learning via Riesz Regression," Papers 2104.14737, arXiv.org, revised Mar 2024.
Paul Frédéric Blanche & Anders Holt & Thomas Scheike, 2023. "On logistic regression with right censored data, with or without competing risks, and its use for estimating treatment effects," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(2), pages 441-482, April.
Martin Huber & Michael Lechner & Giovanni Mellace, 2016. "The Finite Sample Performance of Estimators for Mediation Analysis Under Sequential Conditional Independence," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(1), pages 139-160, January.
- Huber, Martin & Mellace, Giovanni & Lechner, Michael, 2014. "The finite sample performance of estimators for mediation analysis under sequential conditional independence," Economics Working Paper Series 1415, University of St. Gallen, School of Economics and Political Science, revised Nov 2014.
Gruber Susan & van der Laan Mark J., 2010. "A Targeted Maximum Likelihood Estimator of a Causal Effect on a Bounded Continuous Outcome," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-18, August.
Gruber Susan & van der Laan Mark J., 2010. "An Application of Collaborative Targeted Maximum Likelihood Estimation in Causal Inference and Genomics," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-31, May.
Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
- Knaus, Michael C., 2020. "Double Machine Learning based Program Evaluation under Unconfoundedness," Economics Working Paper Series 2004, University of St. Gallen, School of Economics and Political Science.
- Knaus, Michael C., 2020. "Double Machine Learning Based Program Evaluation under Unconfoundedness," IZA Discussion Papers 13051, Institute of Labor Economics (IZA).
- Michael C. Knaus, 2020. "Double Machine Learning based Program Evaluation under Unconfoundedness," Papers 2003.03191, arXiv.org, revised Jun 2022.
Antonelli Joseph & Cefalu Matthew, 2020. "Averaging causal estimators in high dimensions," Journal of Causal Inference, De Gruyter, vol. 8(1), pages 92-107, January.
Yuya Sasaki & Takuya Ura & Yichong Zhang, 2022. "Unconditional quantile regression with high‐dimensional data," Quantitative Economics, Econometric Society, vol. 13(3), pages 955-978, July.
- Yuya Sasaki & Takuya Ura & Yichong Zhang, 2020. "Unconditional Quantile Regression with High Dimensional Data," Papers 2007.13659, arXiv.org, revised Feb 2022.
Iván Díaz & Elizabeth Colantuoni & Daniel F. Hanley & Michael Rosenblum, 2019. "Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 439-468, July.
Rose Sherri & van der Laan Mark J., 2008. "Simple Optimal Weighting of Cases and Controls in Case-Control Studies," The International Journal of Biostatistics, De Gruyter, vol. 4(1), pages 1-26, September.
Rose Sherri & van der Laan Mark J., 2011. "A Targeted Maximum Likelihood Estimator for Two-Stage Designs," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-21, March.
Mireille E. Schnitzer & Erica E.M. Moodie & Mark J. van der Laan & Robert W. Platt & Marina B. Klein, 2014. "Modeling the impact of hepatitis C viral clearance on end-stage liver disease in an HIV co-infected cohort with targeted maximum likelihood estimation," Biometrics, The International Biometric Society, vol. 70(1), pages 144-152, March.
Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
- Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2015. "The Finite Sample Performance of Semi- and Nonparametric Estimators for Treatment Effects and Policy Evaluation," IZA Discussion Papers 8756, Institute of Labor Economics (IZA).
- Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2015. "The finite sample performance of semi- and nonparametric estimators for treatment effects and policy evaluation," FSES Working Papers 454, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
Hugo Bodory & Martin Huber & Lukáš Lafférs, 2022. "Evaluating (weighted) dynamic treatment effects by double machine learning [Identification of causal effects using instrumental variables]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 628-648.
- Hugo Bodory & Martin Huber & Lukáš Lafférs, 2020. "Evaluating (weighted) dynamic treatment effects by double machine learning," Papers 2012.00370, arXiv.org, revised Jun 2021.
Jelena Bradic & Victor Chernozhukov & Whitney K. Newey & Yinchu Zhu, 2019. "Minimax Semiparametric Learning With Approximate Sparsity," Papers 1912.12213, arXiv.org, revised Aug 2022.
Susan Gruber & Mark J. van der Laan, 2013. "An Application of Targeted Maximum Likelihood Estimation to the Meta-Analysis of Safety Data," Biometrics, The International Biometric Society, vol. 69(1), pages 254-262, March.
Wei Luo & Yeying Zhu & Debashis Ghosh, 2017. "On estimating regression-based causal effects using sufficient dimension reduction," Biometrika, Biometrika Trust, vol. 104(1), pages 51-65.
Veronica Sciannameo & Gian Paolo Fadini & Daniele Bottigliengo & Angelo Avogaro & Ileana Baldi & Dario Gregori & Paola Berchialla, 2022. "Assessment of Glucose Lowering Medications’ Effectiveness for Cardiovascular Clinical Risk Management of Real-World Patients with Type 2 Diabetes: Targeted Maximum Likelihood Estimation under Model Mi," IJERPH, MDPI, vol. 19(22), pages 1-13, November.
Haight, Thaddeus J. & Wang, Yue & van der Laan, Mark J. & Tager, Ira B., 2010. "A cross-validation deletion-substitution-addition model selection algorithm: Application to marginal structural models," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3080-3094, December.
