Diagnosing Bottlenecks in Deep Q-learning Algorithms

26 February 2019
Justin Fu
Aviral Kumar
Matthew Soh
Sergey Levine
    OffRL

Papers citing "Diagnosing Bottlenecks in Deep Q-learning Algorithms"

20 / 20 papers shown
Domain Adaptation for Offline Reinforcement Learning with Limited Samples
Weiqin Chen
Sandipan Mishra
Santiago Paternain
OffRL
65
2
0
22 Aug 2024
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
94
2,391
0
13 Dec 2018
Provably Efficient Maximum Entropy Exploration
Elad Hazan
Sham Kakade
Karan Singh
A. V. Soest
47
295
0
06 Dec 2018
Deep Reinforcement Learning and the Deadly Triad
H. V. Hasselt
Yotam Doron
Florian Strub
Matteo Hessel
Nicolas Sonnerat
Joseph Modayil
OffRL
56
226
0
06 Dec 2018
The Utility of Sparse Representations for Control in Reinforcement Learning
Vincent Liu
Raksha Kumaraswamy
Lei Le
Martha White
25
61
0
15 Nov 2018
Policy Optimization via Importance Sampling
Alberto Maria Metelli
Matteo Papini
Francesco Faccio
Marcello Restelli
OffRL
67
89
0
17 Sep 2018
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation
Dmitry Kalashnikov
A. Irpan
P. Pastor
Julian Ibarz
Alexander Herzog
...
Deirdre Quillen
E. Holly
Mrinal Kalakrishnan
Vincent Vanhoucke
Sergey Levine
92
1,454
0
27 Jun 2018
The Unusual Effectiveness of Averaging in GAN Training
Yasin Yazici
Chuan-Sheng Foo
Stefan Winkler
Kim-Hui Yap
Georgios Piliouras
V. Chandrasekhar
97
173
0
12 Jun 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
136
5,121
0
26 Feb 2018
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
Matthias Plappert
Marcin Andrychowicz
Alex Ray
Bob McGrew
Bowen Baker
...
Joshua Tobin
Maciek Chociej
Peter Welinder
Vikash Kumar
Wojciech Zaremba
48
562
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
180
8,236
0
04 Jan 2018
A Deeper Look at Experience Replay
Shangtong Zhang
R. Sutton
OffRL
VLM
58
271
0
04 Dec 2017
Training GANs with Optimism
C. Daskalakis
Andrew Ilyas
Vasilis Syrgkanis
Haoyang Zeng
78
514
0
31 Oct 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
55
1,329
0
27 Feb 2017
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
109
611
0
08 Jun 2016
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
185
3,777
0
18 Nov 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
168
13,174
0
09 Sep 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
50
267
0
14 Mar 2015
An MDP-based Recommender System
Guy Shani
Ronen I. Brafman
David Heckerman
LRM
55
968
0
12 Dec 2012
Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view
B. Scherrer
58
102
0
19 Nov 2010