arXiv:2111.00633
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
1 November 2021
Yuanzhi Li
Ruosong Wang
Lin F. Yang
Papers citing "Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning"
29 / 29 papers shown
Title
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-an Cheng
Andrey Kolobov
Abhishek Gupta
144
3
0
04 Feb 2025
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
188
23
0
25 Jul 2023
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
69
49
0
25 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
118
42
0
01 Mar 2021
Nearly Minimax Optimal Reward-free Reinforcement Learning
Zihan Zhang
S. Du
Xiangyang Ji
OffRL
60
31
0
12 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
106
105
0
28 Sep 2020
A Unifying View of Optimism in Episodic Reinforcement Learning
Gergely Neu
Ciara Pike-Burke
60
67
0
03 Jul 2020
Q-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
74
61
0
16 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
102
129
0
26 May 2020
Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
Ruosong Wang
S. Du
Lin F. Yang
Sham Kakade
OffRL
71
52
0
01 May 2020
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
Zihan Zhang
Yuan Zhou
Xiangyang Ji
OffRL
65
156
0
21 Apr 2020
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
60
72
0
12 Jun 2019
Worst-Case Regret Bounds for Exploration via Randomized Value Functions
Daniel Russo
OffRL
45
86
0
07 Jun 2019
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs
Max Simchowitz
Kevin Jamieson
63
145
0
09 May 2019
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
Kefan Dong
Yuanhao Wang
Xiaoyu Chen
Liwei Wang
OffRL
60
96
0
27 Jan 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
104
276
0
01 Jan 2019
Policy Certificates: Towards Accountable Reinforcement Learning
Christoph Dann
Lihong Li
Wei Wei
Emma Brunskill
OffRL
120
146
0
07 Nov 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
70
807
0
10 Jul 2018
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs
M. S. Talebi
Odalric-Ambrym Maillard
56
72
0
05 Mar 2018
Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes
Aaron Sidford
Mengdi Wang
X. Wu
Yinyu Ye
56
127
0
27 Oct 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
76
309
0
22 Mar 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
88
775
0
16 Mar 2017
On Lower Bounds for Regret in Reinforcement Learning
Ian Osband
Benjamin Van Roy
81
101
0
09 Aug 2016
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
83
261
0
01 Jul 2016
Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning
Christoph Dann
Emma Brunskill
69
249
0
29 Oct 2015
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
119
535
0
04 Jun 2013
On the Sample Complexity of Reinforcement Learning with a Generative Model
M. G. Azar
Rémi Munos
H. Kappen
74
156
0
27 Jun 2012
REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs
Peter L. Bartlett
Ambuj Tewari
91
284
0
09 May 2012
PAC Bounds for Discounted MDPs
Tor Lattimore
Marcus Hutter
94
189
0
17 Feb 2012