1506.03271
Explore no more: Improved high-probability regret bounds for non-stochastic bandits
10 June 2015
Gergely Neu
Papers citing "Explore no more: Improved high-probability regret bounds for non-stochastic bandits"
45 / 45 papers shown
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
31
0
0
12 May 2025
A New Benchmark for Online Learning with Budget-Balancing Constraints
M. Braverman
Jingyi Liu
Jieming Mao
Jon Schneider
Eric Xue
60
0
0
19 Mar 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context
Jianyu Xu
Qiuzhuang Sun
Yang Yang
Huadong Mo
Daoyi Dong
83
0
0
24 Feb 2025
Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity
Quan Nguyen
Nishant A. Mehta
Cristóbal Guzmán
39
1
0
01 Oct 2024
Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints
Martino Bernasconi
Matteo Castiglioni
A. Celli
Federico Fusco
31
2
0
25 May 2024
No-Regret Algorithms in non-Truthful Auctions with Budget and ROI Constraints
Gagan Aggarwal
Giannis Fikioris
Mingfei Zhao
40
5
0
15 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
Melanie Zeilinger
Michael Muehlebach
62
0
0
08 Apr 2024
Distributed No-Regret Learning for Multi-Stage Systems with End-to-End Bandit Feedback
I-Hong Hou
OffRL
44
0
0
06 Apr 2024
Learning Adversarial MDPs with Stochastic Hard Constraints
Francesco Emanuele Stradi
Matteo Castiglioni
A. Marchesi
Nicola Gatti
39
4
0
06 Mar 2024
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption
Shubhada Agrawal
Timothée Mathieu
D. Basu
Odalric-Ambrym Maillard
30
2
0
28 Sep 2023
A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
47
3
0
21 Aug 2023
Anytime Model Selection in Linear Bandits
Parnian Kassraie
N. Emmenegger
Andreas Krause
Aldo Pacchiano
54
2
0
24 Jul 2023
Meta-Learning Adversarial Bandit Algorithms
M. Khodak
Ilya Osadchiy
Keegan Harris
Maria-Florina Balcan
Kfir Y. Levy
Ron Meir
Zhiwei Steven Wu
FedML
30
2
0
05 Jul 2023
Bandits with Replenishable Knapsacks: the Best of both Worlds
Martino Bernasconi
Matteo Castiglioni
A. Celli
Federico Fusco
41
4
0
14 Jun 2023
Bandits for Sponsored Search Auctions under Unknown Valuation Model: Case Study in E-Commerce Advertising
Danil Provodin
Jérémie Joudioux
E. Duryev
24
0
0
31 Mar 2023
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
34
18
0
05 Mar 2023
Estimating Optimal Policy Value in General Linear Contextual Bandits
Jonathan Lee
Weihao Kong
Aldo Pacchiano
Vidya Muthukumar
Emma Brunskill
30
0
0
19 Feb 2023
Contextual Bandits and Optimistically Universal Learning
Moise Blanchard
Steve Hanneke
Patrick Jaillet
OffRL
28
1
0
31 Dec 2022
SLOPT: Bandit Optimization Framework for Mutation-Based Fuzzing
Yuki Koike
H. Katsura
Hiromu Yakura
Yuma Kurogome
31
5
0
07 Nov 2022
On-Demand Sampling: Learning Optimally from Multiple Distributions
Nika Haghtalab
Michael I. Jordan
Eric Zhao
FedML
55
35
0
22 Oct 2022
Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs
Haipeng Luo
Hanghang Tong
Mengxiao Zhang
Yuheng Zhang
16
5
0
04 Oct 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
21
2
0
19 Jul 2022
Best of Both Worlds Model Selection
Aldo Pacchiano
Christoph Dann
Claudio Gentile
36
10
0
29 Jun 2022
The Complexity of Markov Equilibrium in Stochastic Games
C. Daskalakis
Noah Golowich
Kaiqing Zhang
41
56
0
08 Apr 2022
Generalized Bandit Regret Minimizer Framework in Imperfect Information Extensive-Form Game
Lin Meng
Yang Gao
52
1
0
11 Mar 2022
Near-Optimal Learning of Extensive-Form Games with Imperfect Information
Yu Bai
Chi Jin
Song Mei
Tiancheng Yu
26
26
0
03 Feb 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
74
21
0
31 Jan 2022
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms
Jibang Wu
Haifeng Xu
Fan Yao
35
1
0
10 Nov 2021
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure
Hsu Kao
Chen-Yu Wei
V. Subramanian
33
12
0
01 Nov 2021
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Weichao Mao
Lin F. Yang
Kaiqing Zhang
Tamer Başar
46
57
0
12 Oct 2021
Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games
Weichao Mao
Tamer Başar
36
66
0
12 Oct 2021
When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?
Ziang Song
Song Mei
Yu Bai
74
67
0
08 Oct 2021
Bandit Algorithms for Precision Medicine
Yangyi Lu
Ziping Xu
Ambuj Tewari
66
11
0
10 Aug 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo
Chen-Yu Wei
Chung-Wei Lee
38
44
0
18 Jul 2021
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall
Tadashi Kozuno
Pierre Ménard
Rémi Munos
Michal Valko
30
18
0
11 Jun 2021
Leveraging Good Representations in Linear Contextual Bandits
Matteo Papini
Andrea Tirinzoni
Marcello Restelli
A. Lazaric
Matteo Pirotta
35
26
0
08 Apr 2021
A Simple Approach for Non-stationary Linear Bandits
Peng Zhao
Lijun Zhang
Yuan Jiang
Zhi-Hua Zhou
36
81
0
09 Mar 2021
Near-Optimal Reinforcement Learning with Self-Play
Yu Bai
Chi Jin
Tiancheng Yu
24
130
0
22 Jun 2020
Model selection for contextual bandits
Dylan J. Foster
A. Krishnamurthy
Haipeng Luo
OffRL
34
90
0
03 Jun 2019
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model
Gi-Soo Kim
M. Paik
22
14
0
31 Jan 2019
Taming Non-stationary Bandits: A Bayesian Approach
Vishnu Raj
Sheetal Kalyani
32
76
0
31 Jul 2017
Online Learning with Abstention
Corinna Cortes
Giulia DeSalvo
Claudio Gentile
M. Mohri
Scott Yang
9
47
0
09 Mar 2017
Refined Lower Bounds for Adversarial Bandits
Sébastien Gerchinovitz
Tor Lattimore
AAML
25
58
0
24 May 2016
Delay and Cooperation in Nonstochastic Bandits
Nicolò Cesa-Bianchi
Claudio Gentile
Yishay Mansour
Alberto Minora
14
144
0
15 Feb 2016
Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
N. Alon
Nicolò Cesa-Bianchi
Claudio Gentile
Shie Mannor
Yishay Mansour
Ohad Shamir
OffRL
36
130
0
30 Sep 2014