Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.13890
Cited By
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
28 February 2022
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity"
29 / 29 papers shown
Title
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
65
23
0
20 Feb 2025
Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability
Qingyue Zhao
Kaixuan Ji
Heyang Zhao
Tong Zhang
Q. Gu
OffRL
45
0
0
09 Feb 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
72
2
0
10 Oct 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
67
1
0
11 Jun 2024
What Are the Odds? Improving the foundations of Statistical Model Checking
Tobias Meggendorfer
Maximilian Weininger
Patrick Wienhoft
39
4
0
08 Apr 2024
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo
Laixi Shi
Gauri Joshi
Yuejie Chi
OffRL
26
3
0
08 Feb 2024
MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning
Mao Hong
Zhiyue Zhang
Yue Wu
Yan Xu
OffRL
48
0
0
21 Jan 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
21
0
25 Jul 2023
Offline Meta Reinforcement Learning with In-Distribution Online Adaptation
Jianhao Wang
Jin Zhang
Haozhe Jiang
Junyu Zhang
Liwei Wang
Chongjie Zhang
OffRL
26
9
0
31 May 2023
High-probability sample complexities for policy evaluation with linear function approximation
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
25
6
0
30 May 2023
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
Zihao Li
Zhuoran Yang
Mengdi Wang
OffRL
29
54
0
29 May 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage
Jose H. Blanchet
Miao Lu
Tong Zhang
Han Zhong
OffRL
42
29
0
16 May 2023
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
Hanlin Zhu
Amy Zhang
OffRL
18
2
0
07 Feb 2023
Offline Learning in Markov Games with General Function Approximation
Yuheng Zhang
Yunru Bai
Nan Jiang
OffRL
18
8
0
06 Feb 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
42
5
0
05 Feb 2023
Policy learning "without'' overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
29
25
0
19 Dec 2022
Offline Estimation of Controlled Markov Chains: Minimaxity and Sample Complexity
Imon Banerjee
Harsha Honnappa
Vinayak A. Rao
OffRL
11
0
0
14 Nov 2022
Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments
Mengxin Yu
Zhuoran Yang
Jianqing Fan
OffRL
18
8
0
23 Aug 2022
A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP
Fan Chen
Junyu Zhang
Zaiwen Wen
OffRL
36
8
0
13 Jul 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
36
5
0
01 Jun 2022
Non-Markovian policies occupancy measures
Romain Laroche
Rémi Tachet des Combes
Jacob Buckman
OffRL
31
1
0
27 May 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
75
40
0
14 Mar 2022
Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
Ming Yin
Yaqi Duan
Mengdi Wang
Yu-Xiang Wang
OffRL
32
65
0
11 Mar 2022
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
42
50
0
09 Oct 2021
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
Masatoshi Uehara
Wen Sun
OffRL
96
144
0
13 Jul 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
219
413
0
16 Feb 2021
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
31
124
0
26 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
340
1,960
0
04 May 2020
1