IDEAS home Printed from https://ideas.repec.org/a/inm/ormoor/v48y2023i4p2043-2065.html
   My bibliography  Save this article

Generalization Bounds in the Predict-Then-Optimize Framework

Author

Listed:
  • Othman El Balghiti

    (Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027)

  • Adam N. Elmachtoub

    (Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027)

  • Paul Grigas

    (Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720)

  • Ambuj Tewari

    (Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109)

Abstract

The predict-then-optimize framework is fundamental in many practical settings: predict the unknown parameters of an optimization problem and then solve the problem using the predicted values of the parameters. A natural loss function in this environment is to consider the cost of the decisions induced by the predicted parameters in contrast to the prediction error of the parameters. This loss function is referred to as the smart predict-then-optimize (SPO) loss. In this work, we seek to provide bounds on how well the performance of a prediction model fit on training data generalizes out of sample in the context of the SPO loss. Because the SPO loss is nonconvex and non-Lipschitz, standard results for deriving generalization bounds do not apply. We first derive bounds based on the Natarajan dimension that, in the case of a polyhedral feasible region, scale at most logarithmically in the number of extreme points but, in the case of a general convex feasible region, have linear dependence on the decision dimension. By exploiting the structure of the SPO loss function and a key property of the feasible region, which we denote as the strength property, we can dramatically improve the dependence on the decision and feature dimensions. Our approach and analysis rely on placing a margin around problematic predictions that do not yield unique optimal solutions and then providing generalization bounds in the context of a modified margin SPO loss function that is Lipschitz continuous. Finally, we characterize the strength property and show that the modified SPO loss can be computed efficiently for both strongly convex bodies and polytopes with an explicit extreme point representation.

Suggested Citation

  • Othman El Balghiti & Adam N. Elmachtoub & Paul Grigas & Ambuj Tewari, 2023. "Generalization Bounds in the Predict-Then-Optimize Framework," Mathematics of Operations Research, INFORMS, vol. 48(4), pages 2043-2065, November.
  • Handle: RePEc:inm:ormoor:v:48:y:2023:i:4:p:2043-2065
    DOI: 10.1287/moor.2022.1330
    as

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/moor.2022.1330
    Download Restriction: no

    File URL: https://libkey.io/10.1287/moor.2022.1330?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:ormoor:v:48:y:2023:i:4:p:2043-2065. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.