ResearchTrend.AI
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning

1 November 2021
Yuanzhi Li
Ruosong Wang
Lin F. Yang

Papers citing "Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning"

29 / 29 papers shown
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-an Cheng
Andrey Kolobov
Abhishek Gupta
144
3
0
04 Feb 2025
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
188
23
0
25 Jul 2023
Nearly Horizon-Free Offline Reinforcement Learning
Zhaolin Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
69
49
0
25 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
118
42
0
01 Mar 2021
Nearly Minimax Optimal Reward-free Reinforcement Learning
Zihan Zhang
S. Du
Xiangyang Ji
OffRL
60
31
0
12 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
106
105
0
28 Sep 2020
A Unifying View of Optimism in Episodic Reinforcement Learning
Gergely Neu
Ciara Pike-Burke
60
67
0
03 Jul 2020
$Q$-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
74
61
0
16 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
102
129
0
26 May 2020
Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
Ruosong Wang
S. Du
Lin F. Yang
Sham Kakade
OffRL
71
52
0
01 May 2020
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
Zihan Zhang
Yuanshuo Zhou
Xiangyang Ji
OffRL
65
156
0
21 Apr 2020
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
60
72
0
12 Jun 2019
Worst-Case Regret Bounds for Exploration via Randomized Value Functions
Daniel Russo
OffRL
45
86
0
07 Jun 2019
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs
Max Simchowitz
Kevin Jamieson
63
145
0
09 May 2019
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
Kefan Dong
Yuanhao Wang
Xiaoyu Chen
Liwei Wang
OffRL
60
96
0
27 Jan 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
104
276
0
01 Jan 2019
Policy Certificates: Towards Accountable Reinforcement Learning
Christoph Dann
Lihong Li
Wei Wei
Emma Brunskill
OffRL
120
146
0
07 Nov 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
70
807
0
10 Jul 2018
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs
M. S. Talebi
Odalric-Ambrym Maillard
56
72
0
05 Mar 2018
Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes
Aaron Sidford
Mengdi Wang
X. Wu
Yinyu Ye
56
127
0
27 Oct 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
76
309
0
22 Mar 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
88
775
0
16 Mar 2017
On Lower Bounds for Regret in Reinforcement Learning
Ian Osband
Benjamin Van Roy
81
101
0
09 Aug 2016
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
83
261
0
01 Jul 2016
Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning
Christoph Dann
Emma Brunskill
69
249
0
29 Oct 2015
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
119
535
0
04 Jun 2013
On the Sample Complexity of Reinforcement Learning with a Generative Model
M. G. Azar
Rémi Munos
H. Kappen
74
156
0
27 Jun 2012
REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs
Peter L. Bartlett
Ambuj Tewari
91
284
0
09 May 2012
PAC Bounds for Discounted MDPs
Tor Lattimore
Marcus Hutter
94
189
0
17 Feb 2012