
Approval policies for modifications to machine learning‐based software as a medical device: A study of bio‐creep

Author

Listed:
  • Jean Feng
  • Scott Emerson
  • Noah Simon

Abstract

Successful deployment of machine learning algorithms in healthcare requires careful assessments of their performance and safety. To date, the FDA approves locked algorithms prior to marketing and requires future updates to undergo separate premarket reviews. However, this negates a key feature of machine learning—the ability to learn from a growing dataset and improve over time. This paper frames the design of an approval policy, which we refer to as an automatic algorithmic change protocol (aACP), as an online hypothesis testing problem. As this process has obvious analogy with noninferiority testing of new drugs, we investigate how repeated testing and adoption of modifications might lead to gradual deterioration in prediction accuracy, also known as “biocreep” in the drug development literature. We consider simple policies that one might consider but do not necessarily offer any error‐rate guarantees, as well as policies that do provide error‐rate control. For the latter, we define two online error‐rates appropriate for this context: bad approval count (BAC) and bad approval and benchmark ratios (BABR). We control these rates in the simple setting of a constant population and data source using policies aACP‐BAC and aACP‐BABR, which combine alpha‐investing, group‐sequential, and gate‐keeping methods. In simulation studies, bio‐creep regularly occurred when using policies with no error‐rate guarantees, whereas aACP‐BAC and aACP‐BABR controlled the rate of bio‐creep without substantially impacting our ability to approve beneficial modifications.
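
The mechanism behind bio-creep described in the abstract can be illustrated with a small, idealized sketch. It is not the paper's aACP-BAC or aACP-BABR procedure: it assumes true accuracies are known exactly (no sampling noise, no alpha-investing), and the function name `run_policy`, the margin of 0.015, and the "anchored" rule are hypothetical choices made only for illustration. The naive rule compares each proposed modification with the most recently approved model, so a chain of small, individually "noninferior" losses accumulates. The contrasting rule additionally requires noninferiority to the originally approved model.

```python
def run_policy(initial_acc, proposals, margin, anchor_to_initial=False):
    """Return the true accuracy of the approved model after each proposal.

    Idealized assumption: accuracies are known exactly, so "noninferior"
    simply means "no more than `margin` worse than the reference".
    """
    current = initial_acc
    history = []
    for cand in proposals:
        # Naive rule: compare only with the currently approved model.
        ok = cand >= current - margin
        if anchor_to_initial:
            # Hypothetical safeguard: also compare with the original model.
            ok = ok and cand >= initial_acc - margin
        if ok:
            current = cand
        history.append(current)
    return history


# Each proposal is 0.01 worse than the previous one (hypothetical values).
proposals = [0.89, 0.88, 0.87, 0.86, 0.85]

naive = run_policy(0.90, proposals, margin=0.015)
anchored = run_policy(0.90, proposals, margin=0.015, anchor_to_initial=True)

print("naive:   ", naive)     # every step is approved, accuracy keeps falling
print("anchored:", anchored)  # approvals stop once the original margin is exceeded
```

Under these assumptions the naive rule ends at an accuracy of 0.85, well outside the 0.015 margin around the original 0.90, while the anchored rule stops at 0.89. With noisy test data the same drift can arise by chance, which is the situation the paper's error-rate definitions are meant to address.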

Suggested Citation

  • Jean Feng & Scott Emerson & Noah Simon, 2021. "Approval policies for modifications to machine learning‐based software as a medical device: A study of bio‐creep," Biometrics, The International Biometric Society, vol. 77(1), pages 31-44, March.
  • Handle: RePEc:bla:biomet:v:77:y:2021:i:1:p:31-44
    DOI: 10.1111/biom.13379

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13379
    Download Restriction: no


    References listed on IDEAS

    1. Ajit C. Tamhane & Jiangtao Gou & Christopher Jennison & Cyrus R. Mehta & Teresa Curto, 2018. "A gatekeeping procedure to test a primary and a secondary endpoint in a group sequential design with multiple interim looks," Biometrics, The International Biometric Society, vol. 74(1), pages 40-48, March.
    2. Dean P. Foster & Robert A. Stine, 2008. "α‐investing: a procedure for sequential control of expected false discoveries," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(2), pages 429-444, April.
    3. Patrick J. Heagerty & Yingye Zheng, 2005. "Survival Model Predictive Accuracy and ROC Curves," Biometrics, The International Biometric Society, vol. 61(1), pages 92-105, March.

    Citations


    Cited by:

    1. Sherri Rose, 2021. "Discussion on “Approval policies for modifications to machine learning‐based software as a medical device: A study of biocreep” by Jean Feng, Scott Emerson, and Noah Simon," Biometrics, The International Biometric Society, vol. 77(1), pages 49-51, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Schmid, Matthias & Tutz, Gerhard & Welchowski, Thomas, 2018. "Discrimination measures for discrete time-to-event predictions," Econometrics and Statistics, Elsevier, vol. 7(C), pages 153-164.
    2. Foster, Dean P. & Stine, Robert & Young, H. Peyton, 2011. "A Markov Test for Alpha," Working Papers 11-49, University of Pennsylvania, Wharton School, Weiss Center.
    3. Yanyuan Ma & Yuanjia Wang, 2014. "Estimating disease onset distribution functions in mutation carriers with censored mixture data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 63(1), pages 1-23, January.
    4. Olivier Lopez & Xavier Milhaud & Pierre-Emmanuel Thérond, 2016. "Tree-based censored regression with applications in insurance," Post-Print hal-01364437, HAL.
    5. Weining Shen & Jing Ning & Ying Yuan & Anna S. Lok & Ziding Feng, 2018. "Model‐free scoring system for risk prediction with application to hepatocellular carcinoma study," Biometrics, The International Biometric Society, vol. 74(1), pages 239-248, March.
    6. Pablo Martínez-Camblor & Jacobo de Uña-Álvarez & Carmen Díaz Corte, 2015. "Expanded renal transplantation: a competing risk model approach," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(12), pages 2539-2553, December.
    7. Dimitris Rizopoulos, 2011. "Dynamic Predictions and Prospective Accuracy in Joint Models for Longitudinal and Time-to-Event Data," Biometrics, The International Biometric Society, vol. 67(3), pages 819-829, September.
    8. Aasthaa Bansal & Patrick J. Heagerty, 2018. "A Tutorial on Evaluating the Time-Varying Discrimination Accuracy of Survival Models Used in Dynamic Decision Making," Medical Decision Making, vol. 38(8), pages 904-916, November.
    9. Ruosha Li & Jing Ning & Ziding Feng, 2022. "Estimation and inference of predictive discrimination for survival outcome risk prediction models," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 28(2), pages 219-240, April.
    10. Robin Van Oirbeek & Emmanuel Lesaffre, 2018. "An Investigation of the Discriminatory Ability of the Clustering Effect of the Frailty Survival Model," Biostatistics and Biometrics Open Access Journal, Juniper Publishers Inc., vol. 6(3), pages 87-98, April.
    11. Yuanjia Wang & Huaihou Chen & Runze Li & Naihua Duan & Roberto Lewis-Fernández, 2011. "Prediction-Based Structured Variable Selection through the Receiver Operating Characteristic Curves," Biometrics, The International Biometric Society, vol. 67(3), pages 896-905, September.
    12. Susana Díaz-Coto & Pablo Martínez-Camblor & Sonia Pérez-Fernández, 2020. "smoothROCtime: an R package for time-dependent ROC curve estimation," Computational Statistics, Springer, vol. 35(3), pages 1231-1251, September.
    13. Heath, Davidson & Ringgenberg, Matthew C. & Samadi, Mehrdad & Werner, Ingrid M., 2019. "Reusing Natural Experiments," Working Paper Series 2019-21, Ohio State University, Charles A. Dice Center for Research in Financial Economics.
    14. Kevin He & Yue Wang & Xiang Zhou & Han Xu & Can Huang, 2019. "An improved variable selection procedure for adaptive Lasso in high-dimensional survival analysis," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 569-585, July.
    15. Weining Shen & Jing Ning & Ying Yuan, 2015. "A direct method to evaluate the time-dependent predictive accuracy for biomarkers," Biometrics, The International Biometric Society, vol. 71(2), pages 439-449, June.
    16. Matthias Schmid & Thomas Hielscher & Thomas Augustin & Olaf Gefeller, 2011. "A Robust Alternative to the Schemper–Henderson Estimator of Prediction Error," Biometrics, The International Biometric Society, vol. 67(2), pages 524-535, June.
    17. Ao Yuan & Mihai Giurcanu & George Luta & Ming T. Tan, 2017. "U-statistics with conditional kernels for incomplete data models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(2), pages 271-302, April.
    18. Fengqing Zhang & Jiangtao Gou, 2021. "Refined critical boundary with enhanced statistical power for non-directional two-sided tests in group sequential designs with multiple endpoints," Statistical Papers, Springer, vol. 62(3), pages 1265-1290, June.
    19. Dehan Kong & Joseph G. Ibrahim & Eunjee Lee & Hongtu Zhu, 2018. "FLCRM: Functional linear cox regression model," Biometrics, The International Biometric Society, vol. 74(1), pages 109-117, March.
    20. Shanshan Li & Yang Ning, 2015. "Estimation of covariate‐specific time‐dependent ROC curves in the presence of missing biomarkers," Biometrics, The International Biometric Society, vol. 71(3), pages 666-676, September.


