2108.02717
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
5 August 2021
Andrew Wagenmaker
Max Simchowitz
Kevin Jamieson
Papers citing "Beyond No Regret: Instance-Dependent PAC Reinforcement Learning"
16 / 16 papers shown
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
127
2
0
10 Oct 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
112
2
0
11 Jun 2024
What Are the Odds? Improving the foundations of Statistical Model Checking
Tobias Meggendorfer
Maximilian Weininger
Patrick Wienhoft
104
4
0
08 Apr 2024
Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning
Christoph Dann
T. V. Marinov
M. Mohri
Julian Zimmert
OffRL
48
30
0
02 Jul 2021
Task-Optimal Exploration in Linear Dynamical Systems
Andrew Wagenmaker
Max Simchowitz
Kevin Jamieson
67
18
0
10 Feb 2021
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
103
105
0
28 Sep 2020
Fast active learning for pure exploration in reinforcement learning
Pierre Ménard
O. D. Domingues
Anders Jonsson
E. Kaufmann
Edouard Leurent
Michal Valko
45
97
0
27 Jul 2020
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
Anders Jonsson
E. Kaufmann
Pierre Ménard
O. D. Domingues
Edouard Leurent
Michal Valko
46
33
0
10 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
99
129
0
26 May 2020
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal
Alekh Agarwal
Sham Kakade
Lin F. Yang
OffRL
89
172
0
10 Jun 2019
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs
Max Simchowitz
Kevin Jamieson
63
145
0
09 May 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
104
276
0
01 Jan 2019
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
74
309
0
22 Mar 2017
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models
E. Kaufmann
Olivier Cappé
Aurélien Garivier
193
1,025
0
16 Jul 2014
On the Sample Complexity of Reinforcement Learning with a Generative Model
M. G. Azar
Rémi Munos
H. Kappen
71
156
0
27 Jun 2012
Empirical Bernstein Bounds and Sample Variance Penalization
Andreas Maurer
Massimiliano Pontil
395
545
0
21 Jul 2009