1301.2609
Cited By
Learning to Optimize Via Posterior Sampling
11 January 2013
Daniel Russo
Benjamin Van Roy
Papers citing "Learning to Optimize Via Posterior Sampling"
47 / 147 papers shown
Title
Safe Linear Thompson Sampling with Side Information
Ahmadreza Moradipari
Sanae Amani
M. Alizadeh
Christos Thrampoulidis
27
42
0
06 Nov 2019
Recovering Bandits
Ciara Pike-Burke
Steffen Grunewalder
15
40
0
31 Oct 2019
Thompson Sampling in Non-Episodic Restless Bandits
Young Hun Jung
Marc Abeille
Ambuj Tewari
9
19
0
12 Oct 2019
Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity
Peng Liao
Kristjan Greenewald
P. Klasnja
Susan Murphy
25
83
0
08 Sep 2019
Linear Stochastic Bandits Under Safety Constraints
Sanae Amani
M. Alizadeh
Christos Thrampoulidis
36
117
0
16 Aug 2019
Exploration by Optimisation in Partial Monitoring
Tor Lattimore
Csaba Szepesvári
33
38
0
12 Jul 2019
Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems
Young Hun Jung
Ambuj Tewari
27
44
0
29 May 2019
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
Julian Zimmert
Tor Lattimore
30
34
0
28 May 2019
Best Arm Identification in Generalized Linear Bandits
Abbas Kazerouni
L. Wein
25
29
0
20 May 2019
Adaptive Sensor Placement for Continuous Spaces
James A Grant
A. Boukouvalas
Ryan-Rhys Griffiths
David S Leslie
Sattar Vakili
Enrique Munoz de Cote
29
13
0
16 May 2019
Hedging the Drift: Learning to Optimize under Non-Stationarity
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
35
89
0
04 Mar 2019
Constrained Thompson Sampling for Wireless Link Optimization
Vidit Saxena
Joseph E. Gonzalez
Ion Stoica
H. Tullberg
Joakim Jaldén
16
7
0
28 Feb 2019
Meta Dynamic Pricing: Transfer Learning Across Experiments
Hamsa Bastani
D. Simchi-Levi
Ruihao Zhu
39
88
0
28 Feb 2019
Learning to Optimize under Non-Stationarity
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
47
133
0
06 Oct 2018
Randomized Prior Functions for Deep Reinforcement Learning
Ian Osband
John Aslanides
Albin Cassirer
UQCV
BDL
27
372
0
08 Jun 2018
A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations
Biswajit Paria
Kirthevasan Kandasamy
Barnabás Póczós
25
126
0
30 May 2018
An Information-Theoretic Analysis for Thompson Sampling with Many Actions
Shi Dong
Benjamin Van Roy
14
49
0
30 May 2018
Addressing the Item Cold-start Problem by Attribute-driven Active Learning
Y. Zhu
Jinhao Lin
S. He
Beidou Wang
Ziyu Guan
Haifeng Liu
Deng Cai
30
130
0
23 May 2018
PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits
Bianca Dumitrascu
Karen Feng
Barbara E. Engelhardt
19
41
0
18 May 2018
Semiparametric Contextual Bandits
A. Krishnamurthy
Zhiwei Steven Wu
Vasilis Syrgkanis
33
44
0
12 Mar 2018
Multi-objective Contextual Bandit Problem with Similarity Information
E. Turğay
Doruk Öner
Cem Tekin
21
36
0
11 Mar 2018
Reinforcement Learning for Dynamic Bidding in Truckload Markets: an Application to Large-Scale Fleet Management with Advance Commitments
Yingfei Wang
J. Nascimento
Warren B Powell
18
1
0
25 Feb 2018
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
34
69
0
19 Nov 2017
Multi-objective Contextual Multi-armed Bandit with a Dominant Objective
Cem Tekin
E. Turğay
36
36
0
18 Aug 2017
On Optimistic versus Randomized Exploration in Reinforcement Learning
Ian Osband
Benjamin Van Roy
6
10
0
13 Jun 2017
Thompson Sampling for the MNL-Bandit
Shipra Agrawal
Vashist Avadhanula
Vineet Goyal
A. Zeevi
35
96
0
03 Jun 2017
Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization
Kinjal Basu
Souvik Ghosh
21
42
0
18 May 2017
Multi-dueling Bandits with Dependent Arms
Yanan Sui
Vincent Zhuang
J. W. Burdick
Yisong Yue
28
80
0
29 Apr 2017
On Kernelized Multi-armed Bandits
Sayak Ray Chowdhury
Aditya Gopalan
35
449
0
03 Apr 2017
Deep Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
41
300
0
22 Mar 2017
Provably Optimal Algorithms for Generalized Linear Contextual Bandits
Lihong Li
Yu Lu
Dengyong Zhou
23
94
0
28 Feb 2017
Efficient simulation of high dimensional Gaussian vectors
N. Kahalé
14
4
0
28 Feb 2017
Learning to Learn without Gradient Descent by Gradient Descent
Yutian Chen
Matthew W. Hoffman
Sergio Gomez Colmenarejo
Misha Denil
Timothy Lillicrap
Matt Botvinick
Nando de Freitas
26
42
0
11 Nov 2016
The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits
Tor Lattimore
Csaba Szepesvári
22
103
0
14 Oct 2016
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
19
255
0
01 Jul 2016
The Bayesian Linear Information Filtering Problem
Bangrui Chen
P. Frazier
28
1
0
30 May 2016
Double Thompson Sampling for Dueling Bandits
Huasen Wu
Xin Liu
22
87
0
25 Apr 2016
Simple Bayesian Algorithms for Best Arm Identification
Daniel Russo
31
273
0
26 Feb 2016
On Bayesian index policies for sequential resource allocation
E. Kaufmann
46
84
0
06 Jan 2016
Adaptive Ensemble Learning with Confidence Bounds
Cem Tekin
Jinsung Yoon
M. Schaar
FedML
19
40
0
23 Dec 2015
A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit
Giuseppe Burtini
Jason L. Loeppky
Ramon Lawrence
39
119
0
02 Oct 2015
Efficient Learning in Large-Scale Combinatorial Semi-Bandits
Zheng Wen
Branislav Kveton
Azin Ashkan
OffRL
59
96
0
28 Jun 2014
An Information-Theoretic Analysis of Thompson Sampling
Daniel Russo
Benjamin Van Roy
34
421
0
21 Mar 2014
Near-optimal Reinforcement Learning in Factored MDPs
Ian Osband
Benjamin Van Roy
47
120
0
15 Mar 2014
Generalized Thompson Sampling for Contextual Bandits
Lihong Li
29
23
0
27 Oct 2013
Thompson Sampling for 1-Dimensional Exponential Family Bandits
N. Korda
E. Kaufmann
Rémi Munos
29
152
0
12 Jul 2013
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
54
525
0
04 Jun 2013