
Approval policies for modifications to machine learning‐based software as a medical device: A study of bio‐creep

Authors

  • Jean Feng
  • Scott Emerson
  • Noah Simon

Abstract

Successful deployment of machine learning algorithms in healthcare requires careful assessments of their performance and safety. To date, the FDA approves locked algorithms prior to marketing and requires future updates to undergo separate premarket reviews. However, this negates a key feature of machine learning—the ability to learn from a growing dataset and improve over time. This paper frames the design of an approval policy, which we refer to as an automatic algorithmic change protocol (aACP), as an online hypothesis testing problem. As this process has obvious analogy with noninferiority testing of new drugs, we investigate how repeated testing and adoption of modifications might lead to gradual deterioration in prediction accuracy, also known as “biocreep” in the drug development literature. We consider simple policies that one might consider but do not necessarily offer any error‐rate guarantees, as well as policies that do provide error‐rate control. For the latter, we define two online error‐rates appropriate for this context: bad approval count (BAC) and bad approval and benchmark ratios (BABR). We control these rates in the simple setting of a constant population and data source using policies aACP‐BAC and aACP‐BABR, which combine alpha‐investing, group‐sequential, and gate‐keeping methods. In simulation studies, bio‐creep regularly occurred when using policies with no error‐rate guarantees, whereas aACP‐BAC and aACP‐BABR controlled the rate of bio‐creep without substantially impacting our ability to approve beneficial modifications.
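The bio-creep mechanism described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation study and does not implement aACP-BAC or aACP-BABR. It only contrasts two simple policies: a noninferiority test against the most recently approved model versus the same test against the originally approved model. The function name, the margin, the per-update drop in accuracy, the test-set size, and the one-sided z-test at the 5% level are all assumptions chosen for illustration.

```python
import math
import random


def simulate_biocreep(n_updates=40, n_test=2000, margin=0.02, drop=0.01,
                      start_acc=0.90, fixed_reference=False, seed=0):
    """Toy model of repeated noninferiority testing (all values hypothetical).

    Each proposed update is truly `drop` less accurate than the currently
    approved model. It is approved when a one-sided z-test rejects the null
    hypothesis "candidate accuracy <= reference accuracy - margin".
    Returns the true accuracy of the model approved after all updates.
    """
    rng = random.Random(seed)
    approved_acc = start_acc  # true accuracy of the currently approved model
    for _ in range(n_updates):
        cand_acc = approved_acc - drop
        # Naive policy: compare with the latest approved model.
        # Fixed policy: always compare with the original model.
        ref_acc = start_acc if fixed_reference else approved_acc
        # Observed accuracies on independent test sets of size n_test.
        cand_hat = sum(rng.random() < cand_acc for _ in range(n_test)) / n_test
        ref_hat = sum(rng.random() < ref_acc for _ in range(n_test)) / n_test
        se = math.sqrt(cand_hat * (1 - cand_hat) / n_test
                       + ref_hat * (1 - ref_hat) / n_test)
        z = (cand_hat - ref_hat + margin) / se
        if z > 1.645:  # one-sided 5% critical value
            approved_acc = cand_acc
    return approved_acc


naive = simulate_biocreep(fixed_reference=False)
fixed = simulate_biocreep(fixed_reference=True)
print(f"final true accuracy, latest-model reference:   {naive:.3f}")
print(f"final true accuracy, original-model reference: {fixed:.3f}")
```

Under these assumptions, each individual approval under the latest-model reference looks acceptable because the candidate is within the margin of its immediate predecessor, yet the approved accuracy can drift far below the starting value over many updates. Anchoring the comparison to the original model limits that drift, although it offers none of the online error-rate guarantees the abstract attributes to aACP-BAC and aACP-BABR.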

Suggested Citation

  • Jean Feng & Scott Emerson & Noah Simon, 2021. "Approval policies for modifications to machine learning‐based software as a medical device: A study of bio‐creep," Biometrics, The International Biometric Society, vol. 77(1), pages 31-44, March.
  • Handle: RePEc:bla:biomet:v:77:y:2021:i:1:p:31-44
    DOI: 10.1111/biom.13379

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13379
    Download Restriction: no


    References listed on IDEAS

    1. Patrick J. Heagerty & Yingye Zheng, 2005. "Survival Model Predictive Accuracy and ROC Curves," Biometrics, The International Biometric Society, vol. 61(1), pages 92-105, March.
    2. Ajit C. Tamhane & Jiangtao Gou & Christopher Jennison & Cyrus R. Mehta & Teresa Curto, 2018. "A gatekeeping procedure to test a primary and a secondary endpoint in a group sequential design with multiple interim looks," Biometrics, The International Biometric Society, vol. 74(1), pages 40-48, March.
    3. Dean P. Foster & Robert A. Stine, 2008. "α‐investing: a procedure for sequential control of expected false discoveries," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(2), pages 429-444, April.

    Citations

    Cited by:

    1. Sherri Rose, 2021. "Discussion on “Approval policies for modifications to machine learning‐based software as a medical device: A study of biocreep” by Jean Feng, Scott Emerson, and Noah Simon," Biometrics, The International Biometric Society, vol. 77(1), pages 49-51, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Foster, Dean P. & Stine, Robert & Young, H. Peyton, 2011. "A Markov Test for Alpha," Working Papers 11-49, University of Pennsylvania, Wharton School, Weiss Center.
    2. Yanyuan Ma & Yuanjia Wang, 2014. "Estimating disease onset distribution functions in mutation carriers with censored mixture data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 63(1), pages 1-23, January.
    3. Pablo Martínez-Camblor & Jacobo de Uña-Álvarez & Carmen Díaz Corte, 2015. "Expanded renal transplantation: a competing risk model approach," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(12), pages 2539-2553, December.
    4. Dimitris Rizopoulos, 2011. "Dynamic Predictions and Prospective Accuracy in Joint Models for Longitudinal and Time-to-Event Data," Biometrics, The International Biometric Society, vol. 67(3), pages 819-829, September.
    5. Robin Van Oirbeek & Emmanuel Lesaffre, 2018. "An Investigation of the Discriminatory Ability of the Clustering Effect of the Frailty Survival Model," Biostatistics and Biometrics Open Access Journal, Juniper Publishers Inc., vol. 6(3), pages 87-98, April.
    6. Yuanjia Wang & Huaihou Chen & Runze Li & Naihua Duan & Roberto Lewis-Fernández, 2011. "Prediction-Based Structured Variable Selection through the Receiver Operating Characteristic Curves," Biometrics, The International Biometric Society, vol. 67(3), pages 896-905, September.
    7. Heath, Davidson & Ringgenberg, Matthew C. & Samadi, Mehrdad & Werner, Ingrid M., 2019. "Reusing Natural Experiments," Working Paper Series 2019-21, Ohio State University, Charles A. Dice Center for Research in Financial Economics.
    8. Kevin He & Yue Wang & Xiang Zhou & Han Xu & Can Huang, 2019. "An improved variable selection procedure for adaptive Lasso in high-dimensional survival analysis," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 569-585, July.
    9. Weining Shen & Jing Ning & Ying Yuan, 2015. "A direct method to evaluate the time-dependent predictive accuracy for biomarkers," Biometrics, The International Biometric Society, vol. 71(2), pages 439-449, June.
    10. Matthias Schmid & Thomas Hielscher & Thomas Augustin & Olaf Gefeller, 2011. "A Robust Alternative to the Schemper–Henderson Estimator of Prediction Error," Biometrics, The International Biometric Society, vol. 67(2), pages 524-535, June.
    11. Ao Yuan & Mihai Giurcanu & George Luta & Ming T. Tan, 2017. "U-statistics with conditional kernels for incomplete data models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(2), pages 271-302, April.
    12. Dehan Kong & Joseph G. Ibrahim & Eunjee Lee & Hongtu Zhu, 2018. "FLCRM: Functional linear cox regression model," Biometrics, The International Biometric Society, vol. 74(1), pages 109-117, March.
    13. Shanshan Li & Yang Ning, 2015. "Estimation of covariate‐specific time‐dependent ROC curves in the presence of missing biomarkers," Biometrics, The International Biometric Society, vol. 71(3), pages 666-676, September.
    14. Cullen F. Goenner, 2020. "Uncertain times and early predictions of bank failure," The Financial Review, Eastern Finance Association, vol. 55(4), pages 583-601, November.
    15. P. Saha & P. J. Heagerty, 2010. "Time-Dependent Predictive Accuracy in the Presence of Competing Risks," Biometrics, The International Biometric Society, vol. 66(4), pages 999-1011, December.
    16. Hannes Kröger & Rasmus Hoffmann, 2018. "The association between CVD-related biomarkers and mortality in the Health and Retirement Survey," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 38(62), pages 1933-2002.
    17. Jing Zhang & Jing Ning & Ruosha Li, 2023. "Evaluating Dynamic Discrimination Performance of Risk Prediction Models for Survival Outcomes," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(2), pages 353-371, July.
    18. Gong, Siliang & Zhang, Kai & Liu, Yufeng, 2018. "Efficient test-based variable selection for high-dimensional linear models," Journal of Multivariate Analysis, Elsevier, vol. 166(C), pages 17-31.
    19. Janez Stare & Maja Pohar Perme & Robin Henderson, 2011. "A Measure of Explained Variation for Event History Data," Biometrics, The International Biometric Society, vol. 67(3), pages 750-759, September.
    20. Maede S. Nouri & Daniel J. Lizotte & Kamran Sedig & Sheikh S. Abdullah, 2021. "VISEMURE: A Visual Analytics System for Making Sense of Multimorbidity Using Electronic Medical Record Data," Data, MDPI, vol. 6(8), pages 1-19, August.
