The Knowledge Gradient Algorithm for a General Class of Online Learning Problems
Author
Abstract
Suggested Citation
DOI: 10.1287/opre.1110.0999
Download full text from publisher
References listed on IDEAS
- Peter Frazier & Warren Powell & Savas Dayanik, 2009. "The Knowledge-Gradient Policy for Correlated Normal Beliefs," INFORMS Journal on Computing, INFORMS, vol. 21(4), pages 599-613, November.
- Monica Brezzi & Tze Leung Lai, 2000. "Incomplete Learning from Endogenous Data in Dynamic Allocation," Econometrica, Econometric Society, vol. 68(6), pages 1511-1516, November.
- Brezzi, Monica & Lai, Tze Leung, 2002. "Optimal learning and experimentation in bandit problems," Journal of Economic Dynamics and Control, Elsevier, vol. 27(1), pages 87-108, November.
- Stephen E. Chick & Noah Gans, 2009. "Economic Analysis of Simulation Selection Problems," Management Science, INFORMS, vol. 55(3), pages 421-437, March.
- Ilya O. Ryzhov & Warren B. Powell, 2011. "Information Collection on a Graph," Operations Research, INFORMS, vol. 59(1), pages 188-201, February.
- Dimitris Bertsimas & José Niño-Mora, 2000. "Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic," Operations Research, INFORMS, vol. 48(1), pages 80-90, February.
- Michael N. Katehakis & Arthur F. Veinott, 1987. "The Multi-Armed Bandit Problem: Decomposition and Computation," Mathematics of Operations Research, INFORMS, vol. 12(2), pages 262-268, May.
- Stephen E. Chick & Jürgen Branke & Christian Schmidt, 2010. "Sequential Sampling to Myopically Maximize the Expected Value of Information," INFORMS Journal on Computing, INFORMS, vol. 22(1), pages 71-80, February.
Citations
Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
Cited by:
- Shervin Shahrokhi Tehrani & Andrew T. Ching, 2024. "A Heuristic Approach to Explore: The Value of Perfect Information," Management Science, INFORMS, vol. 70(5), pages 3200-3224, May.
- Bolong Cheng & Arta Jamshidi & Warren Powell, 2015. "Optimal learning with a local parametric belief model," Journal of Global Optimization, Springer, vol. 63(2), pages 401-425, October.
- Samuel Cohen & Tanut Treetanthiploet, 2021. "Generalised correlated batched bandits via the ARC algorithm with application to dynamic pricing," Papers 2102.04263, arXiv.org, revised Oct 2022.
- Juergen Branke & Wen Zhang, 2019. "Identifying efficient solutions via simulation: myopic multi-objective budget allocation for the bi-objective case," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 41(3), pages 831-865, September.
- Warren B. Powell, 2016. "Perspectives of approximate dynamic programming," Annals of Operations Research, Springer, vol. 241(1), pages 319-356, June.
- Martinelli, Gabriele & Eidsvik, Jo & Hauge, Ragnar, 2013. "Dynamic decision making for graphical models applied to oil exploration," European Journal of Operational Research, Elsevier, vol. 230(3), pages 688-702.
- Ilya O. Ryzhov, 2016. "On the Convergence Rates of Expected Improvement Methods," Operations Research, INFORMS, vol. 64(6), pages 1515-1528, December.
- David B. Brown & James E. Smith, 2013. "Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats," Operations Research, INFORMS, vol. 61(3), pages 644-665, June.
- Jing Xie & Peter I. Frazier, 2013. "Sequential Bayes-Optimal Policies for Multiple Comparisons with a Known Standard," Operations Research, INFORMS, vol. 61(5), pages 1174-1189, October.
- Sarang Deo & Seyed Iravani & Tingting Jiang & Karen Smilowitz & Stephen Samuelson, 2013. "Improving Health Outcomes Through Better Capacity Allocation in a Community-Based Chronic Care Model," Operations Research, INFORMS, vol. 61(6), pages 1277-1294, December.
- Daniel Russo & Benjamin Van Roy, 2018. "Learning to Optimize via Information-Directed Sampling," Operations Research, INFORMS, vol. 66(1), pages 230-252, January.
- L. Jeff Hong & Guangxin Jiang, 2019. "Offline Simulation Online Application: A New Framework of Simulation-Based Decision Making," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 36(06), pages 1-22, December.
- Michael Jong Kim, 2020. "Variance Regularization in Sequential Bayesian Optimization," Mathematics of Operations Research, INFORMS, vol. 45(3), pages 966-992, August.
- Daniel Russo, 2020. "Simple Bayesian Algorithms for Best-Arm Identification," Operations Research, INFORMS, vol. 68(6), pages 1625-1647, November.
- Yan Li & Kristofer G. Reyes & Jorge Vazquez-Anderson & Yingfei Wang & Lydia M. Contreras & Warren B. Powell, 2018. "A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model," INFORMS Journal on Computing, INFORMS, vol. 30(4), pages 750-767, November.
- Andres Alban & Stephen E. Chick & Martin Forster, 2023. "Value-Based Clinical Trials: Selecting Recruitment Rates and Trial Lengths in Different Regulatory Contexts," Management Science, INFORMS, vol. 69(6), pages 3516-3535, June.
- Ilya O. Ryzhov & Martijn R. K. Mes & Warren B. Powell & Gerald van den Berg, 2019. "Bayesian Exploration for Approximate Dynamic Programming," Operations Research, INFORMS, vol. 67(1), pages 198-214, January.
- Satya S. Malladi & Alan L. Erera & Chelsea C. White, 2021. "Managing mobile production-inventory systems influenced by a modulation process," Annals of Operations Research, Springer, vol. 304(1), pages 299-330, September.
- Seokhyun Chung & Raed Al Kontar & Zhenke Wu, 2022. "Weakly Supervised Multi-output Regression via Correlated Gaussian Processes," INFORMS Joural on Data Science, INFORMS, vol. 1(2), pages 115-137, October.
- Michael Jong Kim & Andrew E.B. Lim, 2016. "Robust Multiarmed Bandit Problems," Management Science, INFORMS, vol. 62(1), pages 264-285, January.
- Yunxiao Deng & Suvrajeet Sen, 2022. "Predictive stochastic programming," Computational Management Science, Springer, vol. 19(1), pages 65-98, January.
- Daniel Russo & Benjamin Van Roy, 2014. "Learning to Optimize via Posterior Sampling," Mathematics of Operations Research, INFORMS, vol. 39(4), pages 1221-1243, November.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.- Eric M. Schwartz & Eric T. Bradlow & Peter S. Fader, 2017. "Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments," Marketing Science, INFORMS, vol. 36(4), pages 500-522, July.
- Stephen E. Chick & Peter Frazier, 2012. "Sequential Sampling with Economics of Selection Procedures," Management Science, INFORMS, vol. 58(3), pages 550-569, March.
- Jing Xie & Peter I. Frazier, 2013. "Sequential Bayes-Optimal Policies for Multiple Comparisons with a Known Standard," Operations Research, INFORMS, vol. 61(5), pages 1174-1189, October.
- Powell, Warren B., 2019. "A unified framework for stochastic optimization," European Journal of Operational Research, Elsevier, vol. 275(3), pages 795-821.
- Ilya O. Ryzhov & Warren B. Powell, 2011. "Information Collection on a Graph," Operations Research, INFORMS, vol. 59(1), pages 188-201, February.
- Warren B. Powell, 2016. "Perspectives of approximate dynamic programming," Annals of Operations Research, Springer, vol. 241(1), pages 319-356, June.
- Konon, Alexander, 2016. "Career choice under uncertainty," VfS Annual Conference 2016 (Augsburg): Demographic Change 145583, Verein für Socialpolitik / German Economic Association.
- Raluca M. Ursu & Qingliang Wang & Pradeep K. Chintagunta, 2020. "Search Duration," Marketing Science, INFORMS, vol. 39(5), pages 849-871, September.
- Victor F. Araman & René A. Caldentey, 2022. "Diffusion Approximations for a Class of Sequential Experimentation Problems," Management Science, INFORMS, vol. 68(8), pages 5958-5979, August.
- Weiwei Fan & L. Jeff Hong & Xiaowei Zhang, 2020. "Distributionally Robust Selection of the Best," Management Science, INFORMS, vol. 66(1), pages 190-208, January.
- Stephen Chick & Martin Forster & Paolo Pertile, 2017.
"A Bayesian decision theoretic model of sequential experimentation with delayed response,"
Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1439-1462, November.
- Stephen Chick & Martin Forster & Paolo Pertile, 2015. "A Bayesian Decision-Theoretic Model of Sequential Experimentation with Delayed Response," Discussion Papers 15/09, Department of Economics, University of York.
- Emre Barut & Warren Powell, 2014. "Optimal learning for sequential sampling with non-parametric beliefs," Journal of Global Optimization, Springer, vol. 58(3), pages 517-543, March.
- Gongbo Zhang & Yijie Peng & Jianghua Zhang & Enlu Zhou, 2023. "Asymptotically Optimal Sampling Policy for Selecting Top- m Alternatives," INFORMS Journal on Computing, INFORMS, vol. 35(6), pages 1261-1285, November.
- Diana M. Negoescu & Peter I. Frazier & Warren B. Powell, 2011. "The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery," INFORMS Journal on Computing, INFORMS, vol. 23(3), pages 346-363, August.
- Huashuai Qu & Ilya O. Ryzhov & Michael C. Fu & Zi Ding, 2015. "Sequential Selection with Unknown Correlation Structures," Operations Research, INFORMS, vol. 63(4), pages 931-948, August.
- Ilya O. Ryzhov & Martijn R. K. Mes & Warren B. Powell & Gerald van den Berg, 2019. "Bayesian Exploration for Approximate Dynamic Programming," Operations Research, INFORMS, vol. 67(1), pages 198-214, January.
- Ilya O. Ryzhov, 2016. "On the Convergence Rates of Expected Improvement Methods," Operations Research, INFORMS, vol. 64(6), pages 1515-1528, December.
- Yijie Peng & Chun-Hung Chen & Michael C. Fu & Jian-Qiang Hu, 2016. "Dynamic Sampling Allocation and Design Selection," INFORMS Journal on Computing, INFORMS, vol. 28(2), pages 195-208, May.
- David J. Eckman & Shane G. Henderson, 2022. "Posterior-Based Stopping Rules for Bayesian Ranking-and-Selection Procedures," INFORMS Journal on Computing, INFORMS, vol. 34(3), pages 1711-1728, May.
- Bin Han & Ilya O. Ryzhov & Boris Defourny, 2016. "Optimal Learning in Linear Regression with Combinatorial Feature Selection," INFORMS Journal on Computing, INFORMS, vol. 28(4), pages 721-735, November.
More about this item
Keywords
multiarmed bandit; optimal learning; online learning; knowledge gradient; Gittins index; index policy;All these keywords.
Statistics
Access and download statisticsCorrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:oropre:v:60:y:2012:i:1:p:180-195. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .
Please note that corrections may take a couple of weeks to filter through the various RePEc services.