Towards minimax policies for online linear optimization with bandit feedback
Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham Kakade
arXiv:1202.3079, 14 February 2012
Papers citing "Towards minimax policies for online linear optimization with bandit feedback" (27 papers shown):
Improved Regret of Linear Ensemble Sampling. Harin Lee, Min-hwan Oh (06 Nov 2024).
Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice. Khaled Eldowa, Nicolò Cesa-Bianchi, Alberto Maria Metelli, Marcello Restelli (14 Mar 2023).
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond. Christoph Dann, Chen-Yu Wei, Julian Zimmert (20 Feb 2023).
Refined Regret for Adversarial MDPs with Linear Function Approximation. Yan Dai, Haipeng Luo, Chen-Yu Wei, Julian Zimmert (30 Jan 2023).
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning. Jiatai Huang, Yan Dai, Longbo Huang (25 Jan 2023).
Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient. Dylan J. Foster, Noah Golowich, Yanjun Han (19 Jan 2023).
Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization. Le-Yu Chen, Jing Xu, Luo Luo (16 Jan 2023).
Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms. Osama A. Hanna, Lin F. Yang, Christina Fragouli (08 Nov 2022).
Socially Fair Reinforcement Learning. Debmalya Mandal, Jiarui Gan (26 Aug 2022).
Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits. Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi Zhou (12 Feb 2022).
A bandit-learning approach to multifidelity approximation. Yiming Xu, Vahid Keshavarzzadeh, Robert M. Kirby, A. Narayan (29 Mar 2021).
Linear Bandits on Uniformly Convex Sets. Thomas Kerdreux, Christophe Roux, Alexandre d’Aspremont, Sebastian Pokutta (10 Mar 2021).
Unifying mirror descent and dual averaging. A. Juditsky, Joon Kwon, Eric Moulines (30 Oct 2019).
Adaptive Sampling for Stochastic Risk-Averse Learning. Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause (28 Oct 2019).
Regret Bounds for Batched Bandits. Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni (11 Oct 2019).
Bandit Convex Optimization in Non-stationary Environments. Peng Zhao, G. Wang, Lijun Zhang, Zhi Zhou (29 Jul 2019).
Exploration by Optimisation in Partial Monitoring. Tor Lattimore, Csaba Szepesvári (12 Jul 2019).
Semiparametric Contextual Bandits. A. Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis (12 Mar 2018).
Online Learning: A Comprehensive Survey. Guosheng Lin, Doyen Sahoo, Jing Lu, P. Zhao (08 Feb 2018).
The Price of Differential Privacy For Online Learning. Naman Agarwal, Karan Singh (27 Jan 2017).
Corralling a Band of Bandit Algorithms. Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert Schapire (19 Dec 2016).
Explore no more: Improved high-probability regret bounds for non-stochastic bandits. Gergely Neu (10 Jun 2015).
The entropic barrier: a simple and optimal universal self-concordant barrier. Sébastien Bubeck, Ronen Eldan (04 Dec 2014).
On the Complexity of Bandit Linear Optimization. Ohad Shamir (11 Aug 2014).
Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms. Wei Chen, Yajun Wang, Yang Yuan, Qinshi Wang (31 Jul 2014).
On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization. Ohad Shamir (11 Sep 2012).
Contextual Bandits with Similarity Information. Aleksandrs Slivkins (23 Jul 2009).