
Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach

Authors

  • Gediminas Adomavicius

    (Department of Information and Decision Sciences, Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455)

  • Yaqiong Wang

    (Information Systems & Analytics Department, Leavey School of Business, Santa Clara University, Santa Clara, California 95050)

Abstract

Numerical predictive modeling is widely used in different application domains. Although many modeling techniques have been proposed, and a number of different aggregate accuracy metrics exist for evaluating the overall performance of predictive models, other important aspects, such as the reliability (or confidence and uncertainty) of individual predictions, have been underexplored. We propose to use estimated absolute prediction error as the indicator of individual prediction reliability, which has the benefits of being intuitive and providing highly interpretable information to decision makers, as well as allowing for more precise evaluation of reliability estimation quality. As importantly, the proposed reliability indicator allows the reframing of reliability estimation itself as a canonical numeric prediction problem, which makes the proposed approach general-purpose (i.e., it can work in conjunction with any outcome prediction model), alleviates the need for distributional assumptions, and enables the use of advanced, state-of-the-art machine learning techniques to learn individual prediction reliability patterns directly from data. Extensive experimental results on multiple real-world data sets show that the proposed machine learning-based approach can significantly improve individual prediction reliability estimation as compared with a number of baselines from prior work, especially in more complex predictive scenarios.
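
The reframing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the least-squares outcome model, the k-nearest-neighbour reliability model, and the use of a separate held-out set to collect absolute errors are all assumptions made here for illustration, and the helper names (fit_line, fit_knn, make_data) are hypothetical. The sketch only shows the general idea that the absolute error of an outcome model can itself be treated as a numeric prediction target.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in outcome model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, targets, k=25):
    """k-nearest-neighbour regressor (stand-in reliability model)."""
    pairs = list(zip(xs, targets))

    def predict(x):
        # Average the targets of the k training points closest to x.
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(t for _, t in nearest) / len(nearest)

    return predict


def make_data(n, rng):
    """Synthetic data (assumption): the noise level grows with x."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 0.2 + 0.5 * x) for x in xs]
    return xs, ys


rng = random.Random(0)
x_fit, y_fit = make_data(300, rng)  # used to train the outcome model
x_rel, y_rel = make_data(300, rng)  # held out; used to train the reliability model

# Step 1: train any outcome prediction model.
outcome_model = fit_line(x_fit, y_fit)

# Step 2: compute absolute prediction errors on data the model has not seen.
abs_errors = [abs(y - outcome_model(x)) for x, y in zip(x_rel, y_rel)]

# Step 3: learn to predict the absolute error from the same input features.
reliability_model = fit_knn(x_rel, abs_errors)

# Step 4: report a prediction together with its estimated absolute error.
low_x_estimate = reliability_model(1.0)
high_x_estimate = reliability_model(9.0)
print("x=1.0 prediction:", outcome_model(1.0), "estimated |error|:", low_x_estimate)
print("x=9.0 prediction:", outcome_model(9.0), "estimated |error|:", high_x_estimate)
```

Because the reliability model only needs inputs and observed absolute errors, either stand-in could be swapped for another regressor without changing the overall structure. In this synthetic setup the estimated error should come out larger near x = 9 than near x = 1, since the noise was constructed to grow with x.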

Suggested Citation

  • Gediminas Adomavicius & Yaqiong Wang, 2022. "Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach," INFORMS Journal on Computing, INFORMS, vol. 34(1), pages 503-521, January.
  • Handle: RePEc:inm:orijoc:v:34:y:2022:i:1:p:503-521
    DOI: 10.1287/ijoc.2020.1019

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijoc.2020.1019
    Download Restriction: no


    References listed on IDEAS

    1. Bradley Efron, 2004. "The Estimation of Prediction Error: Covariance Penalties and Cross-Validation," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 619-632, January.
    2. Sebastian Briesemeister & Jörg Rahnenführer & Oliver Kohlbacher, 2012. "No Longer Confidential: Estimating the Confidence of Individual Regression Predictions," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-9, November.
    3. David J. Hand & Keming Yu, 2001. "Idiot's Bayes—Not So Stupid After All?," International Statistical Review, International Statistical Institute, vol. 69(3), pages 385-398, December.

    Citations


    Cited by:

    1. Xing, Jin & Chi, Guotai & Pan, Ancheng, 2024. "Instance-dependent misclassification cost-sensitive learning for default prediction," Research in International Business and Finance, Elsevier, vol. 69(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Theo Dijkstra, 2014. "Ridge regression and its degrees of freedom," Quality & Quantity: International Journal of Methodology, Springer, vol. 48(6), pages 3185-3193, November.
    2. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    3. Graham, John R. & Grennan, Jillian & Harvey, Campbell R. & Rajgopal, Shivaram, 2022. "Corporate culture: Evidence from the field," Journal of Financial Economics, Elsevier, vol. 146(2), pages 552-593.
    4. Sieds, 2012. "Complete Volume LXVI n.1 2012," RIEDS - Rivista Italiana di Economia, Demografia e Statistica - The Italian Journal of Economic, Demographic and Statistical Studies, SIEDS Societa' Italiana di Economia Demografia e Statistica, vol. 66(1), pages 1-296.
    5. Yi, Feng & Zou, Hui, 2013. "SURE-tuned tapering estimation of large covariance matrices," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 339-351.
    6. DE CNUDDE, Sofie & MARTENS, David & EVGENIOU, Theodoros & PROVOST, Foster, 2017. "A benchmarking study of classification techniques for behavioral data," Working Papers 2017005, University of Antwerp, Faculty of Business and Economics.
    7. Wei, Jiawei & Zhou, Lan, 2010. "Model selection using modified AIC and BIC in joint modeling of paired functional data," Statistics & Probability Letters, Elsevier, vol. 80(23-24), pages 1918-1924, December.
    8. Stefano Marchetti & Maciej Beręsewicz & Nicola Salvati & Marcin Szymkowiak & Łukasz Wawrowski, 2018. "The use of a three‐level M‐quantile model to map poverty at local administrative unit 1 in Poland," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(4), pages 1077-1104, October.
    9. Hettihewa, Samanthala & Saha, Shrabani & Zhang, Hanxiong, 2018. "Does an aging population influence stock markets? Evidence from New Zealand," Economic Modelling, Elsevier, vol. 75(C), pages 142-158.
    10. Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
    11. Minami, Kentaro, 2020. "Degrees of freedom in submodular regularization: A computational perspective of Stein’s unbiased risk estimate," Journal of Multivariate Analysis, Elsevier, vol. 175(C).
    12. Sascha O. Becker & Luigi Pascali, 2019. "Religion, Division of Labor, and Conflict: Anti-semitism in Germany over 600 Years," American Economic Review, American Economic Association, vol. 109(5), pages 1764-1804, May.
    13. Vinciotti Veronica & Augugliaro Luigi & Abbruzzo Antonino & Wit Ernst C., 2016. "Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 15(3), pages 193-212, June.
    14. Bradley Efron, 2021. "Resampling Plans and the Estimation of Prediction Error," Stats, MDPI, vol. 4(4), pages 1-25, December.
    15. Rajeev D S Raizada & Yune-Sang Lee, 2013. "Smoothness without Smoothing: Why Gaussian Naive Bayes Is Not Naive for Multi-Subject Searchlight Studies," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-10, July.
    16. Mendez, Guillermo & Lohr, Sharon, 2011. "Estimating residual variance in random forest regression," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2937-2950, November.
    17. Binder, Harald & Sauerbrei, Willi, 2008. "Increasing the usefulness of additive spline models by knot removal," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5305-5318, August.
    18. Yanagihara, Hirokazu & Satoh, Kenichi, 2010. "An unbiased Cp criterion for multivariate ridge regression," Journal of Multivariate Analysis, Elsevier, vol. 101(5), pages 1226-1238, May.
    19. Marbac, Matthieu & Vandewalle, Vincent, 2019. "A tractable multi-partitions clustering," Computational Statistics & Data Analysis, Elsevier, vol. 132(C), pages 167-179.
    20. Yongli Zhang & Xiaotong Shen, 2015. "Adaptive Modeling Procedure Selection by Data Perturbation," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(4), pages 541-551, October.
