1902.10250
Diagnosing Bottlenecks in Deep Q-learning Algorithms
26 February 2019
Justin Fu
Aviral Kumar
Matthew Soh
Sergey Levine
OffRL
Papers citing "Diagnosing Bottlenecks in Deep Q-learning Algorithms"
20 / 20 papers shown
Domain Adaptation for Offline Reinforcement Learning with Limited Samples
Weiqin Chen
Sandipan Mishra
Santiago Paternain
OffRL
65
2
0
22 Aug 2024
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
94
2,391
0
13 Dec 2018
Provably Efficient Maximum Entropy Exploration
Elad Hazan
Sham Kakade
Karan Singh
A. V. Soest
47
295
0
06 Dec 2018
Deep Reinforcement Learning and the Deadly Triad
H. V. Hasselt
Yotam Doron
Florian Strub
Matteo Hessel
Nicolas Sonnerat
Joseph Modayil
OffRL
56
226
0
06 Dec 2018
The Utility of Sparse Representations for Control in Reinforcement Learning
Vincent Liu
Raksha Kumaraswamy
Lei Le
Martha White
25
61
0
15 Nov 2018
Policy Optimization via Importance Sampling
Alberto Maria Metelli
Matteo Papini
Francesco Faccio
Marcello Restelli
OffRL
67
89
0
17 Sep 2018
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation
Dmitry Kalashnikov
A. Irpan
P. Pastor
Julian Ibarz
Alexander Herzog
...
Deirdre Quillen
E. Holly
Mrinal Kalakrishnan
Vincent Vanhoucke
Sergey Levine
92
1,454
0
27 Jun 2018
The Unusual Effectiveness of Averaging in GAN Training
Yasin Yazici
Chuan-Sheng Foo
Stefan Winkler
Kim-Hui Yap
Georgios Piliouras
V. Chandrasekhar
97
173
0
12 Jun 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
136
5,121
0
26 Feb 2018
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
Matthias Plappert
Marcin Andrychowicz
Alex Ray
Bob McGrew
Bowen Baker
...
Joshua Tobin
Maciek Chociej
Peter Welinder
Vikash Kumar
Wojciech Zaremba
48
562
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
180
8,236
0
04 Jan 2018
A Deeper Look at Experience Replay
Shangtong Zhang
R. Sutton
OffRL
VLM
58
271
0
04 Dec 2017
Training GANs with Optimism
C. Daskalakis
Andrew Ilyas
Vasilis Syrgkanis
Haoyang Zeng
78
514
0
31 Oct 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
55
1,329
0
27 Feb 2017
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
109
611
0
08 Jun 2016
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
185
3,777
0
18 Nov 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
168
13,174
0
09 Sep 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
50
267
0
14 Mar 2015
An MDP-based Recommender System
Guy Shani
Ronen I. Brafman
David Heckerman
LRM
55
968
0
12 Dec 2012
Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view
B. Scherrer
58
102
0
19 Nov 2010