1807.03765
Cited By
Is Q-learning Provably Efficient?
10 July 2018
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
Papers citing
"Is Q-learning Provably Efficient?"
50 / 225 papers shown
Title
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Ménard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
39
20
0
25 Feb 2021
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Yu Bai
Chi Jin
Haiquan Wang
Caiming Xiong
46
67
0
23 Feb 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
95
24
0
17 Feb 2021
Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments
Amin Rakhsha
Xuezhou Zhang
Xiaojin Zhu
Adish Singla
AAML
OffRL
44
37
0
16 Feb 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin G. Jamieson
24
22
0
13 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
55
75
0
12 Feb 2021
Robust Policy Gradient against Strong Data Corruption
Xuezhou Zhang
Yiding Chen
Xiaojin Zhu
Wen Sun
AAML
45
37
0
11 Feb 2021
Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States
Shi Dong
Benjamin Van Roy
Zhengyuan Zhou
32
29
0
10 Feb 2021
Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games
Chen-Yu Wei
Chung-Wei Lee
Mengxiao Zhang
Haipeng Luo
35
82
0
08 Feb 2021
Tactical Optimism and Pessimism for Deep Reinforcement Learning
Theodore H. Moskovitz
Jack Parker-Holder
Aldo Pacchiano
Michael Arbel
Michael I. Jordan
27
55
0
07 Feb 2021
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni
Nadav Merlis
Aadirupa Saha
Shie Mannor
40
10
0
05 Feb 2021
Provably Efficient Algorithms for Multi-Objective Competitive RL
Tiancheng Yu
Yi Tian
J.N. Zhang
S. Sra
34
20
0
05 Feb 2021
A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
105
54
0
02 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
43
215
0
01 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning
William F. Whitney
Michael Bloesch
Jost Tobias Springenberg
A. Abdolmaleki
Kyunghyun Cho
Martin Riedmiller
OffRL
29
13
0
23 Jan 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
350
0
30 Dec 2020
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
43
32
0
29 Dec 2020
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
Liyu Chen
Haipeng Luo
Chen-Yu Wei
34
32
0
07 Dec 2020
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Ying Fan
Yifei Ming
33
17
0
20 Nov 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
44
18
0
09 Nov 2020
Control with adaptive Q-learning
J. Araújo
Mário A. T. Figueiredo
M. Botto
33
2
0
03 Nov 2020
Online Learning in Unknown Markov Games
Yi Tian
Yuanhao Wang
Tiancheng Yu
S. Sra
OffRL
17
13
0
28 Oct 2020
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
Priyank Agrawal
Jinglin Chen
Nan Jiang
36
19
0
23 Oct 2020
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
29
84
0
22 Oct 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
46
101
0
22 Oct 2020
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin
34
121
0
04 Oct 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
26
37
0
01 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
39
104
0
28 Sep 2020
Is Q-Learning Provably Efficient? An Extended Analysis
Kushagra Rastogi
Jonathan Lee
Fabrice Harel-Canada
Aditya Sunil Joglekar
OffRL
19
1
0
22 Sep 2020
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration
Andrea Zanette
A. Lazaric
Mykel J. Kochenderfer
Emma Brunskill
36
64
0
18 Aug 2020
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
Kai Zhang
Sham Kakade
Tamer Başar
Lin F. Yang
52
120
0
15 Jul 2020
Single-partition adaptive Q-learning
J. Araújo
Mário A. T. Figueiredo
M. Botto
OffRL
20
2
0
14 Jul 2020
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
35
16
0
13 Jul 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu-Xiang Wang
OffRL
44
31
0
07 Jul 2020
Adaptive Discretization for Model-Based Reinforcement Learning
Sean R. Sinclair
Tianyu Wang
Gauri Jain
Siddhartha Banerjee
Chao Yu
OffRL
19
21
0
01 Jul 2020
Near-Optimal Reinforcement Learning with Self-Play
Yu Bai
Chi Jin
Tiancheng Yu
24
130
0
22 Jun 2020
Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang
Yuzhe Ma
Adish Singla
OffRL
31
49
0
16 Jun 2020
Q-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
48
59
0
16 Jun 2020
Preference-based Reinforcement Learning with Finite-Time Guarantees
Yichong Xu
Ruosong Wang
Lin F. Yang
Aarti Singh
A. Dubrawski
36
53
0
16 Jun 2020
Adaptive Reward-Free Exploration
E. Kaufmann
Pierre Ménard
O. D. Domingues
Anders Jonsson
Edouard Leurent
Michal Valko
30
80
0
11 Jun 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
59
299
0
01 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
39
125
0
26 May 2020
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
Ruosong Wang
Ruslan Salakhutdinov
Lin F. Yang
25
55
0
21 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
95
146
0
04 May 2020
Adaptive Reward-Poisoning Attacks against Reinforcement Learning
Xuezhou Zhang
Yuzhe Ma
Adish Singla
Xiaojin Zhu
AAML
29
124
0
27 Mar 2020
Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints
Qinbo Bai
Vaneet Aggarwal
Ather Gattami
24
7
0
11 Mar 2020
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović
35
159
0
01 Mar 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
27
221
0
29 Feb 2020