ROC Curve Analysis in the Presence of Imperfect Reference Standards

Author

Listed:
  • Peizhou Liao

    (Emory University)

  • Hao Wu

    (Emory University)

  • Tianwei Yu

    (Emory University)

Abstract

The receiver operating characteristic (ROC) curve is an important tool for evaluating and comparing predictive models when the outcome is binary. If the class membership of the outcomes is known, an ROC curve can be constructed for each model, and the curve with the greater area under it indicates better performance. However, in practice, reference standards are often imperfect: the class membership of some data points is not fully determined. This situation is especially prevalent in high-throughput biomedical data, because obtaining perfect reference standards for all data points is either too costly or technically impractical. To construct ROC curves for such data, the common practice is either to ignore the uncertainties in the references or to remove data points with high uncertainty. Such approaches may bias the ROC curves and produce misleading results in method evaluation. Here we present a framework that incorporates membership uncertainties into the construction of the ROC curve, termed the expected ROC or “eROC” curve. We develop an efficient procedure for estimating the eROC curve. The advantages of using eROC are demonstrated on simulated and real data.
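
To make the idea of membership uncertainty concrete, the following is a minimal sketch and not the authors' procedure, which the abstract does not spell out. It assumes each data point carries a hypothetical probability `p_pos` of truly belonging to the positive class, and it replaces the 0/1 counts of the usual ROC construction with their expected values. The function names `soft_label_roc` and `auc_trapezoid` and the toy data are illustrative assumptions. The result is a plug-in ratio of expected counts, which may differ from an expectation of the ROC curve taken over all possible label configurations.

```python
import numpy as np


def soft_label_roc(scores, p_pos):
    """ROC-style curve from expected counts under uncertain labels.

    scores : model scores, higher means "more likely positive".
    p_pos  : assumed probability that each point is a true positive.
    Assumes 0 < sum(p_pos) < len(p_pos) and no tied scores.
    """
    scores = np.asarray(scores, dtype=float)
    p_pos = np.asarray(p_pos, dtype=float)

    # Sort points by decreasing score; each prefix is one threshold.
    order = np.argsort(-scores, kind="stable")
    p_sorted = p_pos[order]

    # Expected true/false positives among the top-k scored points.
    exp_tp = np.cumsum(p_sorted)
    exp_fp = np.cumsum(1.0 - p_sorted)

    # Normalise by the expected class sizes and prepend the origin.
    tpr = np.concatenate(([0.0], exp_tp / exp_tp[-1]))
    fpr = np.concatenate(([0.0], exp_fp / exp_fp[-1]))
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under a piecewise-linear curve via the trapezoid rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Toy data (made up for illustration only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
hard_labels = [1, 1, 0, 1, 0, 0]              # fully determined membership
soft_labels = [0.9, 0.6, 0.5, 0.7, 0.2, 0.1]  # uncertain membership

auc_hard = auc_trapezoid(*soft_label_roc(scores, hard_labels))
auc_soft = auc_trapezoid(*soft_label_roc(scores, soft_labels))
print("AUC with hard labels:", auc_hard)
print("AUC with soft labels:", auc_soft)
```

With 0/1 probabilities the sketch reduces to the ordinary empirical ROC curve, which gives a simple sanity check on the soft-label version.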

Suggested Citation

  • Peizhou Liao & Hao Wu & Tianwei Yu, 2017. "ROC Curve Analysis in the Presence of Imperfect Reference Standards," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 91-104, June.
  • Handle: RePEc:spr:stabio:v:9:y:2017:i:1:d:10.1007_s12561-016-9159-7
    DOI: 10.1007/s12561-016-9159-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12561-016-9159-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paul S. Albert, 2007. "Random Effects Modeling Approaches for Estimating ROC Curves from Repeated Ordinal Tests without a Gold Standard," Biometrics, The International Biometric Society, vol. 63(2), pages 593-602, June.
    2. Mauricio Romero & Álvaro Riascos & Diego Jara, 2015. "On the Optimality of Answer-Copying Indices," Journal of Educational and Behavioral Statistics, vol. 40(5), pages 435-453, October.
    3. Arun G. Chandrasekhar & Robert Townsend & Juan Pablo Xandri, 2018. "Financial Centrality and Liquidity Provision," NBER Working Papers 24406, National Bureau of Economic Research, Inc.
    4. Deligiannis, Michalis & Liberopoulos, George, 2023. "Dynamic ordering and buyer selection policies when service affects future demand," Omega, Elsevier, vol. 118(C).
    5. Neal, Zachary & Domagalski, Rachel & Yan, Xiaoqin, 2020. "Party Control as a Context for Homophily in Collaborations among US House Representatives, 1981 -- 2015," OSF Preprints qwdxs, Center for Open Science.
    6. Róbert Pethes & Levente Kovács, 2023. "An Exact and an Approximation Method to Compute the Degree Distribution of Inhomogeneous Random Graph Using Poisson Binomial Distribution," Mathematics, MDPI, vol. 11(6), pages 1-24, March.
    7. Van der Auweraer, Sarah & Boute, Robert, 2019. "Forecasting spare part demand using service maintenance information," International Journal of Production Economics, Elsevier, vol. 213(C), pages 138-149.
    8. Piero Mazzarisi & Adele Ravagnani & Paola Deriu & Fabrizio Lillo & Francesca Medda & Antonio Russo, 2022. "A machine learning approach to support decision in insider trading detection," Papers 2212.05912, arXiv.org.
    9. Mika J. Straka & Guido Caldarelli & Tiziano Squartini & Fabio Saracco, 2017. "From Ecology to Finance (and Back?): Recent Advancements in the Analysis of Bipartite Networks," Papers 1710.10143, arXiv.org.
    10. Jeff Alstott & Giorgio Triulzi & Bowen Yan & Jianxi Luo, 2017. "Mapping technology space by normalizing patent networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(1), pages 443-479, January.
    11. Van der Auweraer, Sarah & Zhu, Sha & Boute, Robert N., 2021. "The value of installed base information for spare part inventory control," International Journal of Production Economics, Elsevier, vol. 239(C).
    12. María Belén Atiencia-Carrera & Fausto Sebastián Cabezas-Mera & Eduardo Tejera & António Machado, 2022. "Prevalence of biofilms in Candida spp. bloodstream infections: A meta-analysis," PLOS ONE, Public Library of Science, vol. 17(2), pages 1-23, February.
    13. Arun Chandrasekhar & Robert Townsend & Juan Pablo Xandri, 2019. "Financial Centrality and the Value of Key Players," Working Papers 2019-26, Princeton University. Economics Department.
    14. Zhenqiu Liu & Ming Tan, 2008. "ROC-Based Utility Function Maximization for Feature Selection and Classification with Applications to High-Dimensional Protease Data," Biometrics, The International Biometric Society, vol. 64(4), pages 1155-1161, December.
    15. Stanislao Gualdi & Giulio Cimini & Kevin Primicerio & Riccardo Di Clemente & Damien Challet, 2016. "Statistically validated network of portfolio overlaps and systemic risk," Papers 1603.05914, arXiv.org, revised Sep 2016.
    16. Cheng, Dunlei & Branscum, Adam J. & Stamey, James D., 2010. "A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests," Computational Statistics & Data Analysis, Elsevier, vol. 54(2), pages 298-307, February.
    17. Samuel Davis & Nasser Fard, 2020. "Theoretical bounds and approximation of the probability mass function of future hospital bed demand," Health Care Management Science, Springer, vol. 23(1), pages 20-33, March.
    18. Alessio Farcomeni & Monia Ranalli & Sara Viviani, 2021. "Dimension reduction for longitudinal multivariate data by optimizing class separation of projected latent Markov models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(2), pages 462-480, June.
    19. Musa Çağlar & Sinan Gürel, 2017. "Public R&D project portfolio selection problem with cancellations," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 39(3), pages 659-687, July.
    20. Hiroyuki Kasahara & Katsumi Shimotsu, 2007. "Nonparametric Identification And Estimation Of Multivariate Mixtures," Working Paper 1153, Economics Department, Queen's University.
