Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification

Authors

Chao Qin
Daniel Russo

Abstract

Practitioners conducting adaptive experiments often encounter two competing priorities: maximizing total welfare (or 'reward') through effective treatment assignment and swiftly concluding experiments to implement population-wide treatments. Current literature addresses these priorities separately, with regret minimization studies focusing on the former and best-arm identification research on the latter. This paper bridges this divide by proposing a unified model that simultaneously accounts for within-experiment performance and post-experiment outcomes. We provide a sharp theory of optimal performance in large populations that not only unifies canonical results in the literature but also uncovers novel insights. Our theory reveals that familiar algorithms, such as the recently proposed top-two Thompson sampling algorithm, can optimize a broad class of objectives if a single scalar parameter is appropriately adjusted. In addition, we demonstrate that substantial reductions in experiment duration can often be achieved with minimal impact on both within-experiment and post-experiment regret.
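
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of top-two Thompson sampling in its commonly described form: draw from the posterior, play the apparent best arm (the leader) with probability beta, and otherwise resample until a different arm looks best (the challenger) and play that one. Everything specific here is an assumption made for illustration and is not taken from the paper: Gaussian rewards with known unit noise, a flat prior, the function names, the resampling cap, and the reading of beta as the "single scalar parameter". How the paper itself tunes that parameter is not stated in the abstract.

```python
import random


def top_two_thompson_step(means, counts, beta, rng, noise_sd=1.0, max_resamples=1000):
    """Choose one arm by top-two Thompson sampling (illustrative assumptions:
    Gaussian rewards, known noise_sd, flat prior, at least two arms)."""

    def posterior_draw():
        # Under a flat prior, each arm's mean has posterior
        # Normal(sample mean, noise_sd / sqrt(number of pulls)).
        return [rng.gauss(m, noise_sd / n ** 0.5) for m, n in zip(means, counts)]

    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)

    leader = argmax(posterior_draw())
    if rng.random() < beta:
        return leader  # play the leader with probability beta

    # Otherwise resample until some other arm looks best: the challenger.
    for _ in range(max_resamples):
        challenger = argmax(posterior_draw())
        if challenger != leader:
            return challenger

    # Hypothetical safeguard: if the posterior is too concentrated for a
    # challenger to appear, take the runner-up of one more draw.
    draw = posterior_draw()
    draw[leader] = float("-inf")
    return argmax(draw)


def run_experiment(true_means, horizon, beta, seed=0):
    """Simulate `horizon` pulls and return how often each arm was played."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [1] * k
    sums = [rng.gauss(mu, 1.0) for mu in true_means]  # one initial pull per arm
    for _ in range(horizon - k):
        means = [s / n for s, n in zip(sums, counts)]
        arm = top_two_thompson_step(means, counts, beta, rng)
        sums[arm] += rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
    return counts


if __name__ == "__main__":
    # beta = 1.0 reduces to ordinary Thompson sampling; smaller values
    # divert more pulls to challengers.
    for beta in (1.0, 0.5):
        print(beta, run_experiment([0.0, 0.5, 1.0], horizon=300, beta=beta))
```

In this sketch, setting beta to 1.0 recovers plain Thompson sampling, while lower values shift measurement effort toward challengers. That is one way to picture a single knob trading off within-experiment reward against speed of identification, though the precise trade-off analysed in the paper should be taken from the paper itself.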

Suggested Citation

Chao Qin & Daniel Russo, 2024. "Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification," Papers 2402.10592, arXiv.org, revised Jul 2024.

Handle: RePEc:arx:papers:2402.10592

References listed on IDEAS

Maximilian Kasy & Anja Sautmann, 2021. "Adaptive Treatment Assignment in Experiments for Policy Choice," Econometrica, Econometric Society, vol. 89(1), pages 113-132, January.
- Maximilian Kasy & Anja Sautmann, 2019. "Adaptive Treatment Assignment in Experiments for Policy Choice," CESifo Working Paper Series 7778, CESifo.
Gilles Stoltz & Sébastien Bubeck & Rémi Munos, 2011. "Pure exploration in finitely-armed and continuous-armed bandits," Post-Print hal-00609550, HAL.
Steven L. Scott, 2010. "A modern Bayesian look at the multi‐armed bandit," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 26(6), pages 639-658, November.
Ye Chen & Ilya O. Ryzhov, 2023. "Balancing Optimal Large Deviations in Sequential Selection," Management Science, INFORMS, vol. 69(6), pages 3457-3473, June.
Stephen E. Chick & Noah Gans, 2009. "Economic Analysis of Simulation Selection Problems," Management Science, INFORMS, vol. 55(3), pages 421-437, March.
Karun Adusumilli, 2022. "How to sample and when to stop sampling: The generalized Wald problem and minimax policies," Papers 2210.15841, arXiv.org, revised Feb 2024.
Daniel Russo & Benjamin Van Roy, 2018. "Learning to Optimize via Information-Directed Sampling," Operations Research, INFORMS, vol. 66(1), pages 230-252, January.
Stephen E. Chick & Peter Frazier, 2012. "Sequential Sampling with Economics of Selection Procedures," Management Science, INFORMS, vol. 58(3), pages 550-569, March.
Kaito Ariu & Masahiro Kato & Junpei Komiyama & Kenichiro McAlinn & Chao Qin, 2021. "Policy Choice and Best Arm Identification: Asymptotic Analysis of Exploration Sampling," Papers 2109.08229, arXiv.org, revised Nov 2021.
Stephen E. Chick & Koichiro Inoue, 2001. "New Two-Stage and Sequential Procedures for Selecting the Best Simulated System," Operations Research, INFORMS, vol. 49(5), pages 732-743, October.
Karun Adusumilli, 2022. "Neyman allocation is minimax optimal for best arm identification with two arms," Papers 2204.05527, arXiv.org, revised Aug 2022.

Citations

Cited by:

Toru Kitagawa & Jeff Rowley, 2024. "Bandit algorithms for policy learning: methods, implementation, and welfare-performance," The Japanese Economic Review, Springer, vol. 75(3), pages 407-447, July.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Eric M. Schwartz & Eric T. Bradlow & Peter S. Fader, 2017. "Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments," Marketing Science, INFORMS, vol. 36(4), pages 500-522, July.
Masahiro Kato & Masaaki Imaizumi & Takuya Ishihara & Toru Kitagawa, 2023. "Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds," Papers 2302.02988, arXiv.org, revised Jul 2023.
Haihui Shen & L. Jeff Hong & Xiaowei Zhang, 2021. "Ranking and Selection with Covariates for Personalized Decision Making," INFORMS Journal on Computing, INFORMS, vol. 33(4), pages 1500-1519, October.
Elea McDonnell Feit & Ron Berman, 2019. "Test & Roll: Profit-Maximizing A/B Tests," Marketing Science, INFORMS, vol. 38(6), pages 1038-1058, November.
Masahiro Kato, 2023. "Locally Optimal Fixed-Budget Best Arm Identification in Two-Armed Gaussian Bandits with Unknown Variances," Papers 2312.12741, arXiv.org, revised Mar 2024.
Daniel Russo, 2020. "Simple Bayesian Algorithms for Best-Arm Identification," Operations Research, INFORMS, vol. 68(6), pages 1625-1647, November.
Stephen E. Chick & Jürgen Branke & Christian Schmidt, 2010. "Sequential Sampling to Myopically Maximize the Expected Value of Information," INFORMS Journal on Computing, INFORMS, vol. 22(1), pages 71-80, February.
Weiwei Fan & L. Jeff Hong & Barry L. Nelson, 2016. "Indifference-Zone-Free Selection of the Best," Operations Research, INFORMS, vol. 64(6), pages 1499-1514, December.
A. Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Osman Shami & Alexander Teytelboym, 2024. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," Journal of the European Economic Association, European Economic Association, vol. 22(2), pages 781-836.
- Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Shami & Alexander Teytelboym, 2020. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CESifo Working Paper Series 8535, CESifo.
- A. Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Shami & Alexander Teytelboym, 2020. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CSAE Working Paper Series 2020-20, Centre for the Study of African Economies, University of Oxford.
- Caria, Stefano & Gordon, Grant & Kasy, Maximilian & Quinn, Simon & Shami, Soha & Teytelboym, Alexander, 2021. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CAGE Online Working Paper Series 547, Competitive Advantage in the Global Economy (CAGE).
- Caria, Stefano & Gordon, Grant & Kasy, Maximilian & Quinn, Simon & Shami, Soha & Teytelboym, Alexander, 2021. "An Adaptive Targeted Field Experiment : Job Search Assistance for Refugees in Jordan," The Warwick Economics Research Paper Series (TWERPS) 1335, University of Warwick, Department of Economics.
- Quinn, Simon & Caria, Stefano & Gordon, Grant & Kasy, Maximilian & Shami, Soha & Teytelboym, Alexander, 2020. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CEPR Discussion Papers 15359, C.E.P.R. Discussion Papers.
Raluca M. Ursu & Qingliang Wang & Pradeep K. Chintagunta, 2020. "Search Duration," Marketing Science, INFORMS, vol. 39(5), pages 849-871, September.
Victor F. Araman & René A. Caldentey, 2022. "Diffusion Approximations for a Class of Sequential Experimentation Problems," Management Science, INFORMS, vol. 68(8), pages 5958-5979, August.
Stephen Chick & Martin Forster & Paolo Pertile, 2017. "A Bayesian decision theoretic model of sequential experimentation with delayed response," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1439-1462, November.
- Stephen Chick & Martin Forster & Paolo Pertile, 2015. "A Bayesian Decision-Theoretic Model of Sequential Experimentation with Delayed Response," Discussion Papers 15/09, Department of Economics, University of York.
Nikhil Bhat & Vivek F. Farias & Ciamac C. Moallemi & Deeksha Sinha, 2020. "Near-Optimal A-B Testing," Management Science, INFORMS, vol. 66(10), pages 4477-4495, October.
Gongbo Zhang & Yijie Peng & Jianghua Zhang & Enlu Zhou, 2023. "Asymptotically Optimal Sampling Policy for Selecting Top-m Alternatives," INFORMS Journal on Computing, INFORMS, vol. 35(6), pages 1261-1285, November.
Masahiro Kato, 2024. "Generalized Neyman Allocation for Locally Minimax Optimal Best-Arm Identification," Papers 2405.19317, arXiv.org, revised Feb 2025.
Maximilian Kasy & Anja Sautmann, 2021. "Adaptive Treatment Assignment in Experiments for Policy Choice," Econometrica, Econometric Society, vol. 89(1), pages 113-132, January.
- Maximilian Kasy & Anja Sautmann, 2019. "Adaptive Treatment Assignment in Experiments for Policy Choice," CESifo Working Paper Series 7778, CESifo.
Stephen E. Chick & Peter Frazier, 2012. "Sequential Sampling with Economics of Selection Procedures," Management Science, INFORMS, vol. 58(3), pages 550-569, March.
Masahiro Kato & Kyohei Okumura & Takuya Ishihara & Toru Kitagawa, 2024. "Adaptive Experimental Design for Policy Learning," Papers 2401.03756, arXiv.org, revised Feb 2024.
Yijie Peng & Chun-Hung Chen & Michael C. Fu & Jian-Qiang Hu, 2016. "Dynamic Sampling Allocation and Design Selection," INFORMS Journal on Computing, INFORMS, vol. 28(2), pages 195-208, May.
Stephen E. Chick & Noah Gans & Özge Yapar, 2022. "Bayesian Sequential Learning for Clinical Trials of Multiple Correlated Medical Interventions," Management Science, INFORMS, vol. 68(7), pages 4919-4938, July.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-EXP-2024-03-11 (Experimental Economics)
