Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.07368
Cited By
The Efficacy of Pessimism in Asynchronous Q-Learning
14 March 2022
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The Efficacy of Pessimism in Asynchronous Q-Learning"
13 / 13 papers shown
Title
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
75
2
0
10 Oct 2024
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo
Laixi Shi
Gauri Joshi
Yuejie Chi
OffRL
26
3
0
08 Feb 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
21
0
25 Jul 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
38
15
0
30 Jan 2023
Policy learning "without'' overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
32
25
0
19 Dec 2022
Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments
Mengxin Yu
Zhuoran Yang
Jianqing Fan
OffRL
21
8
0
23 Aug 2022
A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP
Fan Chen
Junyu Zhang
Zaiwen Wen
OffRL
39
8
0
13 Jul 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
36
5
0
01 Jun 2022
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Miao Lu
Yifei Min
Zhaoran Wang
Zhuoran Yang
OffRL
54
22
0
26 May 2022
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
Masatoshi Uehara
Wen Sun
OffRL
96
144
0
13 Jul 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
40
0
01 Mar 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
219
413
0
16 Feb 2021
1