2311.14885
Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning
25 November 2023
Melrose Roderick
Gaurav Manek
Felix Berkenkamp
J. Zico Kolter
OffRL
OnRL
Papers citing
"Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning"
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
J. Pineau
Kee-Eung Kim
OffRL
170
100
0
21 Jun 2021
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
143
1,835
0
08 Jun 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
101
676
0
12 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP
OffRL
229
1,381
0
15 Apr 2020
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
Aviral Kumar
Abhishek Gupta
Sergey Levine
OffRL
54
102
0
16 Mar 2020
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
161
244
0
04 Dec 2019
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
97
689
0
26 Nov 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
151
338
0
10 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
137
1,066
0
03 Jun 2019
Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
Carles Gelada
Marc G. Bellemare
OffRL
66
99
0
27 Jan 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
249
1,624
0
07 Dec 2018
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
161
356
0
29 Oct 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
189
5,218
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
317
8,406
0
04 Jan 2018
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
172
7,665
0
22 Sep 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
91
272
0
14 Mar 2015