Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach

Authors

  • Gediminas Adomavicius

    (Department of Information and Decision Sciences, Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455)

  • Yaqiong Wang

    (Information Systems & Analytics Department, Leavey School of Business, Santa Clara University, Santa Clara, California 95050)

Abstract

Numerical predictive modeling is widely used in different application domains. Although many modeling techniques have been proposed, and a number of different aggregate accuracy metrics exist for evaluating the overall performance of predictive models, other important aspects, such as the reliability (or confidence and uncertainty) of individual predictions, have been underexplored. We propose to use estimated absolute prediction error as the indicator of individual prediction reliability, which has the benefits of being intuitive and providing highly interpretable information to decision makers, as well as allowing for more precise evaluation of reliability estimation quality. As importantly, the proposed reliability indicator allows the reframing of reliability estimation itself as a canonical numeric prediction problem, which makes the proposed approach general-purpose (i.e., it can work in conjunction with any outcome prediction model), alleviates the need for distributional assumptions, and enables the use of advanced, state-of-the-art machine learning techniques to learn individual prediction reliability patterns directly from data. Extensive experimental results on multiple real-world data sets show that the proposed machine learning-based approach can significantly improve individual prediction reliability estimation as compared with a number of baselines from prior work, especially in more complex predictive scenarios.
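
The reframing described in the abstract can be illustrated with a minimal sketch: fit an outcome model, compute its absolute errors, then train a second model to predict those errors. Everything concrete here is an assumption made for illustration and is not taken from the paper: the synthetic data, the held-out split, the least-squares outcome model, the k-nearest-neighbour reliability model, and the helper names `fit_line` and `fit_knn`.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; a stand-in outcome model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, targets, k=15):
    """k-nearest-neighbour regressor; a stand-in reliability model."""
    pairs = list(zip(xs, targets))

    def predict(x):
        # Average the targets of the k points closest to x.
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(t for _, t in nearest) / len(nearest)

    return predict


# Synthetic data (assumed for illustration): noise grows with x,
# so predictions at large x should be less reliable.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(600)]
ys = [2.0 * x + rng.gauss(0, 0.2 + 0.5 * x) for x in xs]

x_train, y_train = xs[:300], ys[:300]
x_held, y_held = xs[300:], ys[300:]

# Step 1: fit the outcome model on the training split.
outcome_model = fit_line(x_train, y_train)

# Step 2: absolute errors on held-out data become the new target.
abs_errors = [abs(y - outcome_model(x)) for x, y in zip(x_held, y_held)]

# Step 3: reliability estimation is itself a numeric prediction problem.
reliability_model = fit_knn(x_held, abs_errors)

est_low_x = reliability_model(1.0)
est_high_x = reliability_model(9.0)
print(f"estimated |error| at x=1: {est_low_x:.2f}")
print(f"estimated |error| at x=9: {est_high_x:.2f}")
```

Because the reliability model only sees inputs and absolute errors, either stand-in could be swapped for any other regressor, which is the general-purpose property the abstract points to. The estimate is also expressed in the same units as the outcome, which is what makes it easy to interpret.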

Suggested Citation

  • Gediminas Adomavicius & Yaqiong Wang, 2022. "Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach," INFORMS Journal on Computing, INFORMS, vol. 34(1), pages 503-521, January.
  • Handle: RePEc:inm:orijoc:v:34:y:2022:i:1:p:503-521
    DOI: 10.1287/ijoc.2020.1019

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijoc.2020.1019
    Download Restriction: no


    References listed on IDEAS

    1. Sebastian Briesemeister & Jörg Rahnenführer & Oliver Kohlbacher, 2012. "No Longer Confidential: Estimating the Confidence of Individual Regression Predictions," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-9, November.
    2. Bradley Efron, 2004. "The Estimation of Prediction Error: Covariance Penalties and Cross-Validation," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 619-632, January.
    3. David J. Hand & Keming Yu, 2001. "Idiot's Bayes—Not So Stupid After All?," International Statistical Review, International Statistical Institute, vol. 69(3), pages 385-398, December.

    Citations


    Cited by:

    1. Xing, Jin & Chi, Guotai & Pan, Ancheng, 2024. "Instance-dependent misclassification cost-sensitive learning for default prediction," Research in International Business and Finance, Elsevier, vol. 69(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Theo Dijkstra, 2014. "Ridge regression and its degrees of freedom," Quality & Quantity: International Journal of Methodology, Springer, vol. 48(6), pages 3185-3193, November.
    2. Sieds, 2012. "Complete Volume LXVI n.1 2012," RIEDS - Rivista Italiana di Economia, Demografia e Statistica - The Italian Journal of Economic, Demographic and Statistical Studies, SIEDS Societa' Italiana di Economia Demografia e Statistica, vol. 66(1), pages 1-296.
    3. DE CNUDDE, Sofie & MARTENS, David & EVGENIOU, Theodoros & PROVOST, Foster, 2017. "A benchmarking study of classification techniques for behavioral data," Working Papers 2017005, University of Antwerp, Faculty of Business and Economics.
    4. Hettihewa, Samanthala & Saha, Shrabani & Zhang, Hanxiong, 2018. "Does an aging population influence stock markets? Evidence from New Zealand," Economic Modelling, Elsevier, vol. 75(C), pages 142-158.
    5. Sascha O. Becker & Luigi Pascali, 2019. "Religion, Division of Labor, and Conflict: Anti-semitism in Germany over 600 Years," American Economic Review, American Economic Association, vol. 109(5), pages 1764-1804, May.
    6. Mendez, Guillermo & Lohr, Sharon, 2011. "Estimating residual variance in random forest regression," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2937-2950, November.
    7. Yanagihara, Hirokazu & Satoh, Kenichi, 2010. "An unbiased Cp criterion for multivariate ridge regression," Journal of Multivariate Analysis, Elsevier, vol. 101(5), pages 1226-1238, May.
    8. Zhang, Xinyu & Yu, Jihai, 2018. "Spatial weights matrix selection and model averaging for spatial autoregressive models," Journal of Econometrics, Elsevier, vol. 203(1), pages 1-18.
    9. Brighton, Henry, 2020. "Statistical foundations of ecological rationality," Economics - The Open-Access, Open-Assessment E-Journal (2007-2020), Kiel Institute for the World Economy (IfW Kiel), vol. 14, pages 1-32.
    10. Chunming Zhang, 2008. "Prediction Error Estimation Under Bregman Divergence for Non‐Parametric Regression and Classification," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 35(3), pages 496-523, September.
    11. Giessing, Alexander & He, Xuming, 2019. "On the predictive risk in misspecified quantile regression," Journal of Econometrics, Elsevier, vol. 213(1), pages 235-260.
    12. James Younker, 2022. "Calculating Effective Degrees of Freedom for Forecast Combinations and Ensemble Models," Discussion Papers 2022-19, Bank of Canada.
    13. Wang, You-Gan & Hin, Lin-Yee, 2010. "Modeling strategies in longitudinal data analysis: Covariate, variance function and correlation structure selection," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3359-3370, December.
    14. Daudin, Jean-Jacques & Mary-Huard, Tristan, 2008. "Estimation of the conditional risk in classification: The swapping method," Computational Statistics & Data Analysis, Elsevier, vol. 52(6), pages 3220-3232, February.
    15. Kun Chen & Kung-Sik Chan & Nils Chr. Stenseth, 2014. "Source-Sink Reconstruction Through Regularized Multicomponent Regression Analysis-With Application to Assessing Whether North Sea Cod Larvae Contributed to Local Fjord Cod in Skagerrak," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 560-573, June.
    16. Kruse, René-Marcel & Silbersdorff, Alexander & Säfken, Benjamin, 2022. "Model averaging for linear mixed models via augmented Lagrangian," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
    17. Chen, Yin-Ping & Huang, Hsin-Cheng & Tu, I-Ping, 2010. "A new approach for selecting the number of factors," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 2990-2998, December.
    18. Kundan Deval & P. K. Joshi, 2022. "Vegetation type and land cover mapping in a semi-arid heterogeneous forested wetland of India: comparing image classification algorithms," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 24(3), pages 3947-3966, March.
    19. ter Braak, Cajo J.F., 2006. "Bayesian sigmoid shrinkage with improper variance priors and an application to wavelet denoising," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 1232-1242, November.
    20. Hirose, Kei & Tateishi, Shohei & Konishi, Sadanori, 2013. "Tuning parameter selection in sparse regression modeling," Computational Statistics & Data Analysis, Elsevier, vol. 59(C), pages 28-40.
