
Causal Bandits: Online Decision-Making in Endogenous Settings

Authors

  • Jingwen Zhang
  • Yifang Chen
  • Amandeep Singh

Abstract

Multi-armed bandits (MAB) are now widely deployed in economic applications. However, the regret guarantees of even state-of-the-art linear bandit algorithms, such as Optimism in the Face of Uncertainty Linear bandit (OFUL), rely on strong exogeneity assumptions about the arm covariates. These assumptions are often violated in economic contexts, and using such algorithms can then lead to sub-optimal decisions. Further, in social science analysis it is also important to understand the asymptotic distribution of the estimated parameters. To this end, we consider online learning in linear stochastic contextual bandit problems with endogenous covariates. We propose an algorithm, which we term $\epsilon$-BanditIV, that uses instrumental variables to correct for the resulting endogeneity bias, and we prove an $\tilde{\mathcal{O}}(k\sqrt{T})$ upper bound on its expected regret. We also establish the asymptotic consistency and normality of the $\epsilon$-BanditIV estimator. Extensive Monte Carlo simulations show that $\epsilon$-BanditIV significantly outperforms existing methods in endogenous settings. Finally, we use data from a real-time bidding (RTB) system to demonstrate how $\epsilon$-BanditIV can be used to estimate the causal impact of advertising in such settings, and we compare its performance with that of other existing methods.
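
The bias that the abstract refers to can be illustrated with a small simulation. The sketch below is not the authors' $\epsilon$-BanditIV algorithm; it is a minimal, offline example of the underlying idea, assuming a single endogenous covariate, a single valid instrument, and a linear outcome model. The data-generating process, the function names (`simulate`, `ols_slope`, `iv_slope`), and all coefficient values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n, beta=2.0, seed=0):
    """Draw data in which the covariate x is endogenous (assumed toy model)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # instrument: moves x, unrelated to the error
    u = rng.normal(size=n)                      # unobserved confounder
    x = z + u + 0.5 * rng.normal(size=n)        # covariate depends on the confounder
    y = beta * x + 2.0 * u + rng.normal(size=n) # outcome error (2u + noise) correlates with x
    return z, x, y


def ols_slope(x, y):
    """Least-squares slope without intercept; biased when x is endogenous."""
    return float(x @ y / (x @ x))


def iv_slope(z, x, y):
    """Just-identified instrumental-variable slope, equal to 2SLS with one instrument."""
    return float(z @ y / (z @ x))


z, x, y = simulate(n=20_000)
print("true slope: 2.0")
print("OLS estimate:", round(ols_slope(x, y), 3))    # pulled away from 2.0 by the confounder
print("IV estimate: ", round(iv_slope(z, x, y), 3))  # close to 2.0
```

In this toy model the least-squares slope converges to a value above the true coefficient because the covariate is correlated with the outcome error, while the instrumental-variable slope remains consistent. A bandit algorithm that ranks arms using the biased estimate can therefore keep choosing a sub-optimal arm, which is the failure mode the abstract describes.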

Suggested Citation

  • Jingwen Zhang & Yifang Chen & Amandeep Singh, 2022. "Causal Bandits: Online Decision-Making in Endogenous Settings," Papers 2211.08649, arXiv.org, revised Feb 2023.
  • Handle: RePEc:arx:papers:2211.08649

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2211.08649
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jin Li & Ye Luo & Xiaowei Zhang, 2021. "Causal Reinforcement Learning: An Instrumental Variable Approach," Papers 2103.04021, arXiv.org, revised Sep 2022.
    2. Halbert White & Karim Chalak, 2013. "Identification and Identification Failure for Treatment Effects Using Structural Systems," Econometric Reviews, Taylor & Francis Journals, vol. 32(3), pages 273-317, November.
    3. Chen, Xiaohong, 2007. "Large Sample Sieve Estimation of Semi-Nonparametric Models," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 76, Elsevier.
    4. Halbert White & Karim Chalak, 2008. "Identifying Structural Effects in Nonseparable Systems Using Covariates," Boston College Working Papers in Economics 734, Boston College Department of Economics.
    5. Xiaohong Chen & Andres Santos, 2018. "Overidentification in Regular Models," Econometrica, Econometric Society, vol. 86(5), pages 1771-1817, September.
    6. Arthur Lewbel, 2012. "Using Heteroscedasticity to Identify and Estimate Mismeasured and Endogenous Regressor Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 30(1), pages 67-80.
    7. Xiaohong Chen & Victor Chernozhukov & Sokbae Lee & Whitney K. Newey, 2014. "Local Identification of Nonparametric and Semiparametric Models," Econometrica, Econometric Society, vol. 82(2), pages 785-809, March.
    8. Hidehiko Ichimura & Whitney K. Newey, 2022. "The influence function of semiparametric estimators," Quantitative Economics, Econometric Society, vol. 13(1), pages 29-61, January.
    9. Xiaohong Chen & Yingyao Hu & Arthur Lewbel, 2007. "Nonparametric Identification and Estimation of Nonclassical Errors-in-Variables Models Without Additional Information," Boston College Working Papers in Economics 676, Boston College Department of Economics.
    10. Ai, Chunrong & Chen, Xiaohong, 2012. "The semiparametric efficiency bound for models of sequential moment restrictions containing unknown functions," Journal of Econometrics, Elsevier, vol. 170(2), pages 442-457.
    11. Xiaohong Chen & Yingyao Hu, 2006. "Identification and Inference of Nonlinear Models Using Two Samples with Arbitrary Measurement Errors," Cowles Foundation Discussion Papers 1590, Cowles Foundation for Research in Economics, Yale University.
    12. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    13. Xiaolin Sun, 2022. "Estimation of Heterogeneous Treatment Effects Using a Conditional Moment Based Approach," Papers 2210.15829, arXiv.org, revised Oct 2024.
    14. Thomas J. Kane & Cecilia E. Rouse, 1993. "Labor Market Returns to Two- and Four-Year Colleges: Is a Credit a Credit and Do Degrees Matter?," NBER Working Papers 4268, National Bureau of Economic Research, Inc.
    15. Song, Suyong, 2015. "Semiparametric estimation of models with conditional moment restrictions in the presence of nonclassical measurement errors," Journal of Econometrics, Elsevier, vol. 185(1), pages 95-109.
    16. Liao, Yuan & Jiang, Wenxin, 2011. "Posterior consistency of nonparametric conditional moment restricted models," MPRA Paper 38700, University Library of Munich, Germany.
    17. Victor Chernozhukov & Whitney K. Newey & Andres Santos, 2023. "Constrained Conditional Moment Restriction Models," Econometrica, Econometric Society, vol. 91(2), pages 709-736, March.
    18. Yingyao Hu & Susanne M. Schennach, 2008. "Instrumental Variable Treatment of Nonclassical Measurement Error Models," Econometrica, Econometric Society, vol. 76(1), pages 195-216, January.
    19. Arthur Lewbel, 2019. "The Identification Zoo: Meanings of Identification in Econometrics," Journal of Economic Literature, American Economic Association, vol. 57(4), pages 835-903, December.
    20. Xiaohong Chen & Demian Pouzo, 2012. "Estimation of Nonparametric Conditional Moment Models With Possibly Nonsmooth Generalized Residuals," Econometrica, Econometric Society, vol. 80(1), pages 277-321, January.
