IDEAS home Printed from https://ideas.repec.org/a/inm/oropre/v71y2023i4p1434-1439.html
   My bibliography  Save this article

Technical Note—The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling

Author

Listed:
  • Nima Hamidi

    (Department of Statistics, Stanford University, Stanford, California 94305)

  • Mohsen Bayati

    (Operations, Information, and Technology, Graduate School of Business, Stanford University, Stanford, California 94305)

Abstract

In this note, we introduce a general version of the well-known elliptical potential lemma that is a widely used technique in the analysis of algorithms in sequential learning and decision-making problems. We consider a stochastic linear bandit setting where decision makers sequentially choose among a set of given actions, observe their noisy rewards, and aim to maximize their cumulative expected reward over a decision-making horizon. The elliptical potential lemma is a key tool for quantifying uncertainty in estimating parameters of the reward function, but it requires the noise and the prior distributions to be Gaussian. Our general elliptical potential lemma relaxes this Gaussian requirement, which is a highly nontrivial extension for a number of reasons; unlike the Gaussian case, there is no closed-form solution for the covariance matrix of the posterior distribution, the covariance matrix is not a deterministic function of the actions, and the covariance matrix is not decreasing with respect to the semidefinite inequality. Although this result is of broad interest, we showcase an application of it to prove an improved Bayesian regret bound for the well-known Thompson sampling algorithm in stochastic linear bandits with changing action sets where prior and noise distributions are general. This bound is minimax optimal up to constants.

Suggested Citation

  • Nima Hamidi & Mohsen Bayati, 2023. "Technical Note—The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling," Operations Research, INFORMS, vol. 71(4), pages 1434-1439, July.
  • Handle: RePEc:inm:oropre:v:71:y:2023:i:4:p:1434-1439
    DOI: 10.1287/opre.2022.2274
    as

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.2022.2274
    Download Restriction: no

    File URL: https://libkey.io/10.1287/opre.2022.2274?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:oropre:v:71:y:2023:i:4:p:1434-1439. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.