arXiv: 1202.3079

Towards minimax policies for online linear optimization with bandit feedback

14 February 2012
Sébastien Bubeck
Nicolò Cesa-Bianchi
Sham Kakade
    OffRL

Papers citing "Towards minimax policies for online linear optimization with bandit feedback"

27 / 27 papers shown
Improved Regret of Linear Ensemble Sampling
Harin Lee
Min-hwan Oh
39
1
0
06 Nov 2024
Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice
Khaled Eldowa
Nicolò Cesa-Bianchi
Alberto Maria Metelli
Marcello Restelli
19
3
0
14 Mar 2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Christoph Dann
Chen-Yu Wei
Julian Zimmert
26
22
0
20 Feb 2023
Refined Regret for Adversarial MDPs with Linear Function Approximation
Yan Dai
Haipeng Luo
Chen-Yu Wei
Julian Zimmert
31
12
0
30 Jan 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Jiatai Huang
Yan Dai
Longbo Huang
27
6
0
25 Jan 2023
Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient
Dylan J. Foster
Noah Golowich
Yanjun Han
OffRL
33
29
0
19 Jan 2023
Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization
Le‐Yu Chen
Jing Xu
Luo Luo
36
15
0
16 Jan 2023
Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms
Osama A. Hanna
Lin F. Yang
Christina Fragouli
27
11
0
08 Nov 2022
Socially Fair Reinforcement Learning
Debmalya Mandal
Jiarui Gan
OffRL
30
13
0
26 Aug 2022
Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits
Haipeng Luo
Mengxiao Zhang
Peng Zhao
Zhi Zhou
34
17
0
12 Feb 2022
A bandit-learning approach to multifidelity approximation
Yiming Xu
Vahid Keshavarzzadeh
Robert M. Kirby
A. Narayan
16
6
0
29 Mar 2021
Linear Bandits on Uniformly Convex Sets
Thomas Kerdreux
Christophe Roux
Alexandre d’Aspremont
Sebastian Pokutta
31
7
0
10 Mar 2021
Unifying mirror descent and dual averaging
A. Juditsky
Joon Kwon
Eric Moulines
17
29
0
30 Oct 2019
Adaptive Sampling for Stochastic Risk-Averse Learning
Sebastian Curi
Kfir Y. Levy
Stefanie Jegelka
Andreas Krause
24
52
0
28 Oct 2019
Regret Bounds for Batched Bandits
Hossein Esfandiari
Amin Karbasi
Abbas Mehrabian
Vahab Mirrokni
28
61
0
11 Oct 2019
Bandit Convex Optimization in Non-stationary Environments
Peng Zhao
G. Wang
Lijun Zhang
Zhi Zhou
36
41
0
29 Jul 2019
Exploration by Optimisation in Partial Monitoring
Tor Lattimore
Csaba Szepesvári
33
38
0
12 Jul 2019
Semiparametric Contextual Bandits
A. Krishnamurthy
Zhiwei Steven Wu
Vasilis Syrgkanis
33
44
0
12 Mar 2018
Online Learning: A Comprehensive Survey
Guosheng Lin
Doyen Sahoo
Jing Lu
P. Zhao
OffRL
31
634
0
08 Feb 2018
The Price of Differential Privacy For Online Learning
Naman Agarwal
Karan Singh
FedML
14
94
0
27 Jan 2017
Corralling a Band of Bandit Algorithms
Alekh Agarwal
Haipeng Luo
Behnam Neyshabur
Robert Schapire
30
154
0
19 Dec 2016
Explore no more: Improved high-probability regret bounds for non-stochastic bandits
Gergely Neu
22
181
0
10 Jun 2015
The entropic barrier: a simple and optimal universal self-concordant barrier
Sébastien Bubeck
Ronen Eldan
26
65
0
04 Dec 2014
On the Complexity of Bandit Linear Optimization
Ohad Shamir
37
14
0
11 Aug 2014
Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms
Wei Chen
Yajun Wang
Yang Yuan
Qinshi Wang
54
279
0
31 Jul 2014
On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization
Ohad Shamir
41
190
0
11 Sep 2012
Contextual Bandits with Similarity Information
Aleksandrs Slivkins
51
451
0
23 Jul 2009