Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1704.06440
Cited By
Equivalence Between Policy Gradients and Soft Q-Learning
21 April 2017
John Schulman
Xi Chen
Pieter Abbeel
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Equivalence Between Policy Gradients and Soft Q-Learning"
39 / 89 papers shown
Title
A Tutorial on Sparse Gaussian Processes and Variational Inference
Felix Leibfried
Vincent Dutordoir
S. T. John
N. Durrande
GP
42
49
0
27 Dec 2020
Behavior Priors for Efficient Reinforcement Learning
Dhruva Tirumala
Alexandre Galashov
Hyeonwoo Noh
Leonard Hasenclever
Razvan Pascanu
...
Guillaume Desjardins
Wojciech M. Czarnecki
Arun Ahuja
Yee Whye Teh
N. Heess
37
39
0
27 Oct 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
42
101
0
22 Oct 2020
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee
Michael Laskin
A. Srinivas
Pieter Abbeel
OffRL
25
199
0
09 Jul 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
156
11
0
08 Jun 2020
Leverage the Average: an Analysis of KL Regularization in RL
Nino Vieillard
Tadashi Kozuno
B. Scherrer
Olivier Pietquin
Rémi Munos
M. Geist
25
43
0
31 Mar 2020
Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics
Amir H. Mosavi
Pedram Ghamisi
Yaser Faghan
Puhong Duan
OffRL
27
152
0
21 Mar 2020
Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration
Guy Van den Broeck
Yitao Liang
Mathias Niepert
OffRL
22
3
0
25 Feb 2020
Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors
Jingliang Duan
Yang Guan
Shengbo Eben Li
Yangang Ren
B. Cheng
OffRL
25
174
0
09 Jan 2020
A Survey of Deep Reinforcement Learning in Video Games
Kun Shao
Zhentao Tang
Yuanheng Zhu
Nannan Li
Dongbin Zhao
OffRL
AI4TS
43
188
0
23 Dec 2019
Direct and indirect reinforcement learning
Yang Guan
Shengbo Eben Li
Jingliang Duan
Jie Li
Yangang Ren
Qi Sun
B. Cheng
OffRL
38
34
0
23 Dec 2019
Wield: Systematic Reinforcement Learning With Progressive Randomization
Michael Schaarschmidt
Kai Fricke
Eiko Yoneki
19
2
0
15 Sep 2019
A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment
Felix Leibfried
Sergio Pascual-Diaz
Jordi Grau-Moya
25
27
0
26 Jul 2019
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai
Zhuoran Yang
Jason D. Lee
Zhaoran Wang
42
29
0
24 May 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao
Xudong Sun
Volker Tresp
29
80
0
21 May 2019
A Regularized Opponent Model with Maximum Entropy Objective
Zheng Tian
Ying Wen
Zhichen Gong
Faiz Punakkath
Shihao Zou
Jun Wang
30
31
0
17 May 2019
Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning
Kyungjae Lee
Sungyub Kim
Sungbin Lim
Sungjoon Choi
Songhwai Oh
19
28
0
31 Jan 2019
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
54
433
0
26 Dec 2018
Learning Montezuma's Revenge from a Single Demonstration
Tim Salimans
Richard J. Chen
42
136
0
08 Dec 2018
Connecting the Dots Between MLE and RL for Sequence Prediction
Bowen Tan
Zhiting Hu
Zichao Yang
Ruslan Salakhutdinov
Eric Xing
28
24
0
24 Nov 2018
VIREL: A Variational Inference Framework for Reinforcement Learning
M. Fellows
Anuj Mahajan
Tim G. J. Rudner
Shimon Whiteson
DRL
38
54
0
03 Nov 2018
Preparing for the Unexpected: Diversity Improves Planning Resilience in Evolutionary Algorithms
Thomas Gabor
Lenz Belzner
Thomy Phan
Kyrill Schmid
19
14
0
30 Oct 2018
A Survey and Critique of Multiagent Deep Reinforcement Learning
Pablo Hernandez-Leal
Bilal Kartal
Matthew E. Taylor
OffRL
48
553
0
12 Oct 2018
Policy Optimization as Wasserstein Gradient Flows
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Lawrence Carin
14
66
0
09 Aug 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
48
471
0
14 Jun 2018
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
John D. Co-Reyes
YuXuan Liu
Abhishek Gupta
Benjamin Eysenbach
Pieter Abbeel
Sergey Levine
SSL
BDL
AIFin
37
142
0
07 Jun 2018
Efficient Entropy for Policy Gradient with Multidimensional Action Space
Yiming Zhang
Q. Vuong
Kenny Song
Xiao-Yue Gong
Keith Ross
27
17
0
02 Jun 2018
Supervised Policy Update for Deep Reinforcement Learning
Q. Vuong
Yiming Zhang
Keith Ross
19
20
0
29 May 2018
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
Justin Fu
Avi Singh
Dibya Ghosh
Larry Yang
Sergey Levine
BDL
14
125
0
29 May 2018
Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review
Sergey Levine
AI4CE
BDL
33
662
0
02 May 2018
Composable Deep Reinforcement Learning for Robotic Manipulation
Tuomas Haarnoja
Vitchyr H. Pong
Aurick Zhou
Murtaza Dalal
Pieter Abbeel
Sergey Levine
30
230
0
19 Mar 2018
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
Matthias Plappert
Marcin Andrychowicz
Alex Ray
Bob McGrew
Bowen Baker
...
Joshua Tobin
Maciek Chociej
Peter Welinder
Vikash Kumar
Wojciech Zaremba
33
557
0
26 Feb 2018
Evolved Policy Gradients
Rein Houthooft
Richard Y. Chen
Phillip Isola
Bradly C. Stadie
Filip Wolski
Jonathan Ho
Pieter Abbeel
49
227
0
13 Feb 2018
SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation
Bo Dai
Albert Eaton Shaw
Lihong Li
Lin Xiao
Niao He
Zhen Liu
Jianshu Chen
Le Song
34
25
0
29 Dec 2017
A short variational proof of equivalence between policy gradients and soft Q learning
Pierre Harvey Richemond
B. Maginnis
16
5
0
22 Dec 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
65
2,780
0
19 Aug 2017
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
Felix Leibfried
Jordi Grau-Moya
Haitham Bou-Ammar
38
24
0
06 Aug 2017
Distral: Robust Multitask Reinforcement Learning
Yee Whye Teh
V. Bapst
Wojciech M. Czarnecki
John Quan
J. Kirkpatrick
R. Hadsell
N. Heess
Razvan Pascanu
44
544
0
13 Jul 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,505
0
25 Jan 2017
Previous
1
2