2301.13446
Cited By
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
31 January 2023
Runlong Zhou
Zihan Zhang
S. Du
Papers citing "Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments"
12 / 12 papers shown
Title
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
70
2
0
10 Oct 2024
State-free Reinforcement Learning
Mingyu Chen
Aldo Pacchiano
Xuezhou Zhang
61
0
0
27 Sep 2024
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Srinjoy Roy
Swagatam Das
27
0
0
31 Mar 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang
Owen Oertell
Alekh Agarwal
Nathan Kallus
Wen Sun
OffRL
82
12
0
11 Feb 2024
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation
Jiayi Huang
Han Zhong
Liwei Wang
Lin F. Yang
35
2
0
07 Dec 2023
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
92
21
0
25 Jul 2023
Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds
Jiayi Huang
Han Zhong
Liwei Wang
Lin F. Yang
24
6
0
12 Jun 2023
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
Heyang Zhao
Jiafan He
Dongruo Zhou
Tong Zhang
Quanquan Gu
24
27
0
21 Feb 2023
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits
Heyang Zhao
Dongruo Zhou
Jiafan He
Quanquan Gu
36
2
0
28 Feb 2022
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
73
36
0
07 Dec 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
65
36
0
29 Jan 2021
Reward-Free Exploration for Reinforcement Learning
Chi Jin
A. Krishnamurthy
Max Simchowitz
Tiancheng Yu
OffRL
112
194
0
07 Feb 2020