Estimating the area under the ROC curve when transporting a prediction model to a target population

Author

Listed:
  • Bing Li
  • Constantine Gatsonis
  • Issa J. Dahabreh
  • Jon A. Steingrimsson

Abstract

We propose methods for estimating the area under the receiver operating characteristic (ROC) curve (AUC) of a prediction model in a target population that differs from the source population that provided the data used for original model development. If covariates that are associated with model performance, as measured by the AUC, have a different distribution in the source and target populations, then AUC estimators that only use data from the source population will not reflect model performance in the target population. Here, we provide identification results for the AUC in the target population when outcome and covariate data are available from the sample of the source population, but only covariate data are available from the sample of the target population. In this setting, we propose three estimators for the AUC in the target population and show that they are consistent and asymptotically normal. We evaluate the finite‐sample performance of the estimators using simulations and use them to estimate the AUC in a nationally representative target population from the National Health and Nutrition Examination Survey for a lung cancer risk prediction model developed using source population data from the National Lung Screening Trial.
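To make the setting concrete, here is a minimal sketch of one weighting approach to the problem the abstract describes. It is not necessarily any of the three estimators proposed in the article. It assumes that each source observation has a model risk score, an observed binary outcome, and an estimated probability of belonging to the source sample given covariates (for example, from a logistic regression fitted to the combined source and target samples). The function names `inverse_odds_weights` and `weighted_auc` are hypothetical, and counting tied scores as one half is a convention chosen for this illustration.

```python
import numpy as np


def inverse_odds_weights(p_source):
    """Convert estimated P(S = 1 | X) into inverse-odds weights.

    p_source holds, for each source observation, the estimated probability
    of being in the source sample given covariates (an assumed input).
    """
    p = np.asarray(p_source, dtype=float)
    return (1.0 - p) / p


def weighted_auc(scores, labels, weights):
    """Weighted AUC over all (case, non-case) pairs in the source sample.

    Each pair is weighted by the product of the two observation weights,
    so pairs that resemble the target population count for more.
    Tied scores contribute one half (an illustrative convention).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    weights = np.asarray(weights, dtype=float)

    s_pos, w_pos = scores[labels], weights[labels]
    s_neg, w_neg = scores[~labels], weights[~labels]

    # Compare every case with every non-case via broadcasting.
    diff = s_pos[:, None] - s_neg[None, :]
    concordant = (diff > 0) + 0.5 * (diff == 0)
    pair_w = w_pos[:, None] * w_neg[None, :]

    return float((pair_w * concordant).sum() / pair_w.sum())


# Toy data: four source observations (values invented for illustration).
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
p_source = [0.5, 0.5, 0.8, 0.5]

source_only_auc = weighted_auc(scores, labels, np.ones(4))
transported_auc = weighted_auc(scores, labels, inverse_odds_weights(p_source))
```

With unit weights the function reduces to the usual source-only AUC. In the toy data, down-weighting the one observation that is over-represented in the source sample changes the estimate, which is the kind of discrepancy between source and target performance that the abstract describes.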

Suggested Citation

  • Bing Li & Constantine Gatsonis & Issa J. Dahabreh & Jon A. Steingrimsson, 2023. "Estimating the area under the ROC curve when transporting a prediction model to a target population," Biometrics, The International Biometric Society, vol. 79(3), pages 2382-2393, September.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:3:p:2382-2393
    DOI: 10.1111/biom.13796

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13796
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bénédicte Colnet & Julie Josse & Gaël Varoquaux & Erwan Scornet, 2022. "Causal effect on a target population: A sensitivity analysis to handle missing covariates," Journal of Causal Inference, De Gruyter, vol. 10(1), pages 372-414, January.
    2. Benjamin Lu & Eli Ben-Michael & Avi Feller & Luke Miratrix, 2023. "Is It Who You Are or Where You Are? Accounting for Compositional Differences in Cross-Site Treatment Effect Variation," Journal of Educational and Behavioral Statistics, vol. 48(4), pages 420-453, August.
    3. Ieva Burakauskaitė & Andrius Čiginas, 2023. "An Approach to Integrating a Non-Probability Sample in the Population Census," Mathematics, MDPI, vol. 11(8), pages 1-14, April.
    4. Kara E. Rudolph & Jonathan Levy & Mark J. van der Laan, 2021. "Transporting stochastic direct and indirect effects to new populations," Biometrics, The International Biometric Society, vol. 77(1), pages 197-211, March.
    5. Paola Berchialla & Silvia Snidero & Alexandru Stancu & Cecilia Scarinzi & Roberto Corradetti & Dario Gregori & the ESFBI Study Group, 2007. "Predicting Severity of Foreign Body Injuries in Children in Upper Airways: An Approach Based on Regression Trees," Risk Analysis, John Wiley & Sons, vol. 27(5), pages 1255-1263, October.
    6. Chien-Min Huang & F. Jay Breidt, 2023. "A dual-frame approach for estimation with respondent-driven samples," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 65-81, April.
    7. Juana-María Vivo & Manuel Franco & Donatella Vicari, 2018. "Rethinking an ROC partial area index for evaluating the classification performance at a high specificity range," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 683-704, September.
    8. J Mauricio Calvo-Calle & Iwona Strug & Maria-Dorothea Nastke & Stephen P Baker & Lawrence J Stern, 2007. "Human CD4+ T Cell Epitopes from Vaccinia Virus Induced by Vaccination or Infection," PLOS Pathogens, Public Library of Science, vol. 3(10), pages 1-19, October.
    9. Xiangjin Shen & Shiliang Li & Hiroki Tsurumi, 2013. "Comparison of Parametric and Semi-Parametric Binary Response Models," Departmental Working Papers 201308, Rutgers University, Department of Economics.
    10. Xiaojun Mao & Zhonglei Wang & Shu Yang, 2023. "Matrix completion under complex survey sampling," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(3), pages 463-492, June.
    11. Bo Zhang, 2023. "Efficient algorithms for building representative matched pairs with enhanced generalizability," Biometrics, The International Biometric Society, vol. 79(4), pages 3981-3997, December.
    12. James A. Hanley, 1988. "The Robustness of the "Binormal" Assumptions Used in Fitting ROC Curves," Medical Decision Making, vol. 8(3), pages 197-203, August.
    13. Rémi Baroso & Pauline Sellier & Federica Defendi & Delphine Charignon & Arije Ghannam & Mohammed Habib & Christian Drouet & Bertrand Favier, 2016. "Kininogen Cleavage Assay: Diagnostic Assistance for Kinin-Mediated Angioedema Conditions," PLOS ONE, Public Library of Science, vol. 11(9), pages 1-16, September.
    14. Leelambar Singh & Subbarayan Saravanan & J. Jacinth Jennifer & D. Abijith, 2021. "Application of multi-influence factor (MIF) technique for the identification of suitable sites for urban settlement in Tiruchirappalli City, Tamil Nadu, India," Asia-Pacific Journal of Regional Science, Springer, vol. 5(3), pages 797-823, October.
    15. Ramón Ferri-García & Jean-François Beaumont & Keven Bosa & Joanne Charlebois & Kenneth Chu, 2022. "Weight smoothing for nonprobability surveys," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(3), pages 619-643, September.
    16. Xinyu Li & Wang Miao & Fang Lu & Xiao‐Hua Zhou, 2023. "Improving efficiency of inference in clinical trials with external control data," Biometrics, The International Biometric Society, vol. 79(1), pages 394-403, March.
    17. Xiangjin Shen & Iskander Karibzhanov & Hiroki Tsurumi & Shiliang Li, 2022. "Comparison of Bayesian and Sample Theory Parametric and Semiparametric Binary Response Models," Staff Working Papers 22-31, Bank of Canada.
    18. Sixia Chen & Alexandra May Woodruff & Janis Campbell & Sara Vesely & Zheng Xu & Cuyler Snider, 2023. "Combining Probability and Nonprobability Samples by Using Multivariate Mass Imputation Approaches with Application to Biomedical Research," Stats, MDPI, vol. 6(2), pages 1-9, May.
    19. Masahiro Kato & Masatoshi Uehara & Shota Yasui, 2020. "Off-Policy Evaluation and Learning for External Validity under a Covariate Shift," Papers 2002.11642, arXiv.org, revised Oct 2020.
    20. Luis Castro-Martín & María del Mar Rueda & Ramón Ferri-García & César Hernando-Tamayo, 2021. "On the Use of Gradient Boosting Methods to Improve the Estimation with Data Obtained with Self-Selection Procedures," Mathematics, MDPI, vol. 9(23), pages 1-23, November.
