Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2102.04692
Cited By
Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
9 February 2021
Haike Xu
Tengyu Ma
S. Du
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap"
14 / 14 papers shown
Title
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
0
0
16 May 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
67
1
0
11 Jun 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Robust Knowledge Transfer in Tiered Reinforcement Learning
Jiawei Huang
Niao He
OffRL
32
1
0
10 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
58
16
0
23 Nov 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
45
5
0
01 Jun 2022
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
39
8
0
23 May 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
30
21
0
24 Mar 2022
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
Matteo Papini
Andrea Tirinzoni
Aldo Pacchiano
Marcello Restelli
A. Lazaric
Matteo Pirotta
19
18
0
27 Oct 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
33
12
0
11 Aug 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
65
35
0
22 Apr 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin G. Jamieson
24
22
0
13 Feb 2021
1