1703.05449
Minimax Regret Bounds for Reinforcement Learning
16 March 2017
M. G. Azar
Ian Osband
Rémi Munos
Papers citing
"Minimax Regret Bounds for Reinforcement Learning"
50 / 241 papers shown
Title
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
70
35
0
22 Apr 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
49
53
0
24 Mar 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Ménard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
45
20
0
25 Feb 2021
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Yu Bai
Chi Jin
Haiquan Wang
Caiming Xiong
51
68
0
23 Feb 2021
Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments
Amin Rakhsha
Xuezhou Zhang
Xiaojin Zhu
Adish Singla
AAML
OffRL
44
37
0
16 Feb 2021
Causal Markov Decision Processes: Learning Good Interventions Efficiently
Yangyi Lu
A. Meisami
Ambuj Tewari
23
10
0
15 Feb 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin Jamieson
24
22
0
13 Feb 2021
Robust Policy Gradient against Strong Data Corruption
Xuezhou Zhang
Yiding Chen
Xiaojin Zhu
Wen Sun
AAML
57
37
0
11 Feb 2021
Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States
Shi Dong
Benjamin Van Roy
Zhengyuan Zhou
37
29
0
10 Feb 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
26
77
0
09 Feb 2021
Tactical Optimism and Pessimism for Deep Reinforcement Learning
Theodore H. Moskovitz
Jack Parker-Holder
Aldo Pacchiano
Michael Arbel
Michael I. Jordan
32
55
0
07 Feb 2021
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni
Nadav Merlis
Aadirupa Saha
Shie Mannor
40
10
0
05 Feb 2021
Provably Efficient Algorithms for Multi-Objective Competitive RL
Tiancheng Yu
Yi Tian
J.N. Zhang
S. Sra
37
20
0
05 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
49
215
0
01 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
Geometric Entropic Exploration
Z. Guo
M. G. Azar
Alaa Saade
S. Thakoor
Bilal Piot
Bernardo Avila-Pires
Michal Valko
Thomas Mesnard
Tor Lattimore
Rémi Munos
38
30
0
06 Jan 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
32
350
0
30 Dec 2020
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
43
32
0
29 Dec 2020
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
Liyu Chen
Haipeng Luo
Chen-Yu Wei
34
32
0
07 Dec 2020
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Ying Fan
Yifei Ming
33
17
0
20 Nov 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
44
18
0
09 Nov 2020
Online Learning in Unknown Markov Games
Yi Tian
Yuanhao Wang
Tiancheng Yu
S. Sra
OffRL
19
13
0
28 Oct 2020
Efficient Learning in Non-Stationary Linear Markov Decision Processes
Ahmed Touati
Pascal Vincent
42
29
0
24 Oct 2020
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
Priyank Agrawal
Jinglin Chen
Nan Jiang
42
19
0
23 Oct 2020
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin
39
121
0
04 Oct 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
40
37
0
01 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
45
104
0
28 Sep 2020
Is Q-Learning Provably Efficient? An Extended Analysis
Kushagra Rastogi
Jonathan Lee
Fabrice Harel-Canada
Aditya Sunil Joglekar
OffRL
19
1
0
22 Sep 2020
Private Reinforcement Learning with PAC and Regret Guarantees
G. Vietri
Borja Balle
A. Krishnamurthy
Zhiwei Steven Wu
26
60
0
18 Sep 2020
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration
Andrea Zanette
A. Lazaric
Mykel J. Kochenderfer
Emma Brunskill
41
64
0
18 Aug 2020
On the Sample Complexity of Reinforcement Learning with Policy Space Generalization
Wenlong Mou
Zheng Wen
Xi Chen
21
10
0
17 Aug 2020
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
Kai Zhang
Sham Kakade
Tamer Başar
Lin F. Yang
73
120
0
15 Jul 2020
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
35
16
0
13 Jul 2020
Bandit Linear Control
Asaf B. Cassel
Tomer Koren
18
17
0
01 Jul 2020
Adaptive Discretization for Model-Based Reinforcement Learning
Sean R. Sinclair
Tianyu Wang
Gauri Jain
Siddhartha Banerjee
Chao Yu
OffRL
24
21
0
01 Jul 2020
Dynamic Regret of Policy Optimization in Non-stationary Environments
Yingjie Fei
Zhuoran Yang
Zhaoran Wang
Qiaomin Xie
34
54
0
30 Jun 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou
Jiafan He
Quanquan Gu
35
133
0
23 Jun 2020
Information Theoretic Regret Bounds for Online Nonlinear Control
Sham Kakade
A. Krishnamurthy
Kendall Lowrey
Motoya Ohnishi
Wen Sun
38
117
0
22 Jun 2020
Near-Optimal Reinforcement Learning with Self-Play
Yu Bai
Chi Jin
Tiancheng Yu
24
130
0
22 Jun 2020
Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang
Yuzhe Ma
Adish Singla
OffRL
31
49
0
16 Jun 2020
Q-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
48
59
0
16 Jun 2020
Preference-based Reinforcement Learning with Finite-Time Guarantees
Yichong Xu
Ruosong Wang
Lin F. Yang
Aarti Singh
A. Dubrawski
36
53
0
16 Jun 2020
Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning
Sebastian Curi
Felix Berkenkamp
Andreas Krause
38
82
0
15 Jun 2020
Adaptive Reward-Free Exploration
E. Kaufmann
Pierre Ménard
O. D. Domingues
Anders Jonsson
Edouard Leurent
Michal Valko
30
80
0
11 Jun 2020
A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret
Mehdi Jafarnia-Jahromi
Chen-Yu Wei
Rahul Jain
Haipeng Luo
28
7
0
08 Jun 2020
Temporally-Extended ε-Greedy Exploration
Will Dabney
Georg Ostrovski
André Barreto
27
34
0
02 Jun 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
62
300
0
01 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
60
125
0
26 May 2020