Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization

5 June 2022

Csaba Szepesvári

Mengdi Wang

Papers citing "Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization"

Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning. Tong Zhang. 02 Oct 2021.
High-Dimensional Sparse Linear Bandits. Botao Hao, Tor Lattimore, Mengdi Wang. 08 Nov 2020.
AdaLead: A simple and robust adaptive greedy search algorithm for sequence design. Sam Sinai, Richard Wang, Alexander Whatley, Stewart Slocum, Elina Locane, Eric D. Kelsic. 05 Oct 2020.
Optimal Mutation Rates for the $(1+λ)$ EA on OneMax. M. Buzdalov, Carola Doerr. 20 Jun 2020.
Sequential Batch Learning in Finite-Action Linear Contextual Bandits. Yanjun Han, Zhengqing Zhou, Zhengyuan Zhou, Jose H. Blanchet, Peter Glynn, Yinyu Ye. 14 Apr 2020.
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound. Lin F. Yang, Mengdi Wang. 24 May 2019.
Learning to Optimize Via Posterior Sampling. Daniel Russo, Benjamin Van Roy. 11 Jan 2013.
Thompson Sampling for Contextual Bandits with Linear Payoffs. Shipra Agrawal, Navin Goyal. 15 Sep 2012.
A Contextual-Bandit Approach to Personalized News Article Recommendation. Lihong Li, Wei Chu, John Langford, Robert Schapire. 28 Feb 2010.