IDEAS home Printed from https://ideas.repec.org/p/ecl/stabus/4208.html
   My bibliography  Save this paper

Data-Driven Error Estimation: Upper Bounding Multiple Errors with No Technical Debt

Author

Listed:
  • Krishnamurthy, Sanath Kumar

    (Stanford U)

  • Athey, Susan

    (Stanford U)

  • Brunskill, Emma

    (Stanford U)

Abstract

We formulate the problem of constructing multiple simultaneously valid confidence intervals (CIs) as estimating a high probability upper bound on the maximum error for a class/set of estimate-estimand-error tuples, and refer to this as the error estimation problem. For a single such tuple, data-driven confidence intervals can often be used to bound the error in our estimate. However, for a class of estimate-estimand-error tuples, nontrivial high probability upper bounds on the maximum error often require class complexity as input — limiting the practicality of such methods and often resulting in loose bounds. Rather than deriving theoretical class complexity-based bounds, we propose a completely data-driven approach to estimate an upper bound on the maximum error. The simple and general nature of our solution to this fundamental challenge lends itself to several applications including: multiple CI construction, multiple hypothesis testing, estimating excess risk bounds (a fundamental measure of uncertainty in machine learning) for any training/fine-tuning algorithm, and enabling the development of a contextual bandit pipeline that can leverage any reward model estimation procedure as input (without additional mathematical analysis).

Suggested Citation

  • Krishnamurthy, Sanath Kumar & Athey, Susan & Brunskill, Emma, 2024. "Data-Driven Error Estimation: Upper Bounding Multiple Errors with No Technical Debt," Research Papers 4208, Stanford University, Graduate School of Business.
  • Handle: RePEc:ecl:stabus:4208
    as

    Download full text from publisher

    File URL: https://www.gsb.stanford.edu/faculty-research/working-papers/data-driven-error-estimation-upper-bounding-multiple-errors-no
    Download Restriction: no
    ---><---

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:ecl:stabus:4208. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: the person in charge (email available below). General contact details of provider: https://edirc.repec.org/data/gsstaus.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.