ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,128 papers shown
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Denis Steckelmacher
Hélène Plisnier
D. Roijers
A. Nowé
OffRL
71
17
0
11 Mar 2019
Sim-to-Real Transfer for Biped Locomotion
Wenhao Yu
Visak C. V. Kumar
Greg Turk
Chenxi Liu
62
115
0
04 Mar 2019
A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning
Xiang Li
Wenhao Yang
Zhihua Zhang
29
2
0
02 Mar 2019
Catalyst.RL: A Distributed Framework for Reproducible RL Research
Sergey Kolesnikov
Oleksii Hrinchuk
OffRL
42
8
0
28 Feb 2019
Diagnosing Bottlenecks in Deep Q-learning Algorithms
Justin Fu
Aviral Kumar
Matthew Soh
Sergey Levine
OffRL
85
142
0
26 Feb 2019
Distributionally Robust Reinforcement Learning
E. Smirnova
Elvis Dohmatob
Jérémie Mary
OffRL
70
60
0
23 Feb 2019
Investigating Generalisation in Continuous Deep Reinforcement Learning
Chenyang Zhao
Olivier Sigaud
F. Stulp
Timothy M. Hospedales
OffRL
89
48
0
19 Feb 2019
CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity
Aditya Bhatt
Daniel Palenicek
Boris Belousov
Max Argus
Artemij Amiranashvili
Thomas Brox
Jan Peters
134
57
0
14 Feb 2019
Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning
Gang Chen
Yiming Peng
40
8
0
14 Feb 2019
Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup
Devin Schwab
Tobias Springenberg
M. Martins
Thomas Lampe
Michael Neunert
A. Abdolmaleki
Tim Hertweck
Roland Hafner
F. Nori
Martin Riedmiller
76
22
0
13 Feb 2019
Artificial Intelligence for Prosthetics - challenge solutions
L. Kidzinski
Carmichael F. Ong
Sharada Mohanty
Jennifer Hicks
Sean F. Carroll
...
E. Tumer
J. Watson
M. Salathé
Sergey Levine
Scott L. Delp
55
42
0
07 Feb 2019
Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning
Kyungjae Lee
Sungyub Kim
Sungbin Lim
Sungjoon Choi
Songhwai Oh
150
28
0
31 Jan 2019
A Theory of Regularized Markov Decision Processes
Matthieu Geist
B. Scherrer
Olivier Pietquin
147
333
0
31 Jan 2019
InfoBot: Transfer and Exploration via the Information Bottleneck
Anirudh Goyal
Riashat Islam
Daniel Strouse
Zafarali Ahmed
M. Botvinick
Hugo Larochelle
Yoshua Bengio
Sergey Levine
OffRL
141
167
0
30 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
114
124
0
29 Jan 2019
Trust Region-Guided Proximal Policy Optimization
Yuhui Wang
Hao He
Xiaoyang Tan
Yaozhong Gan
OffRL
89
57
0
29 Jan 2019
Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks
Dongqi Han
Kenji Doya
Jun Tani
AI4CE
126
20
0
29 Jan 2019
Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning
Ying Wen
Yaodong Yang
Rui Luo
Jun Wang
LRM
99
52
0
26 Jan 2019
Credit Assignment Techniques in Stochastic Computation Graphs
T. Weber
N. Heess
Lars Buesing
David Silver
103
45
0
07 Jan 2019
Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization
Takayuki Osa
Voot Tangkaratt
Masashi Sugiyama
OffRL
65
27
0
05 Jan 2019
Adversarial Learning of a Sampler Based on an Unnormalized Distribution
Chunyuan Li
Ke Bai
Jianqiao Li
Guoyin Wang
Changyou Chen
Lawrence Carin
157
10
0
03 Jan 2019
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
148
442
0
26 Dec 2018
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
61
31
0
19 Dec 2018
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
162
2,461
0
13 Dec 2018
Residual Reinforcement Learning for Robot Control
T. Johannink
Shikhar Bahl
Ashvin Nair
Jianlan Luo
Avinash Kumar
M. Loskyll
J. A. Ojea
Eugen Solowjow
Sergey Levine
OffRL
90
420
0
07 Dec 2018
Provably Efficient Maximum Entropy Exploration
Elad Hazan
Sham Kakade
Karan Singh
A. V. Soest
98
305
0
06 Dec 2018
Relative Entropy Regularized Policy Iteration
A. Abdolmaleki
Jost Tobias Springenberg
Jonas Degrave
Steven Bohez
Yuval Tassa
Dan Belov
N. Heess
Martin Riedmiller
70
72
0
05 Dec 2018
Composing Entropic Policies using Divergence Correction
Jonathan J. Hunt
André Barreto
Timothy Lillicrap
N. Heess
52
2
0
05 Dec 2018
Exploration versus exploitation in reinforcement learning: a stochastic control approach
Haoran Wang
T. Zariphopoulou
X. Zhou
97
49
0
04 Dec 2018
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
102
59
0
03 Dec 2018
VIREL: A Variational Inference Framework for Reinforcement Learning
M. Fellows
Anuj Mahajan
Tim G. J. Rudner
Shimon Whiteson
DRL
129
56
0
03 Nov 2018
Horizon: Facebook's Open Source Applied Reinforcement Learning Platform
J. Gauci
Edoardo Conti
Yitao Liang
Kittipat Virochsiri
Yuchen He
Zachary Kaden
Vivek Narayanan
Xiaohui Ye
Zhengxing Chen
Scott Fujimoto
95
139
0
01 Nov 2018
Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning
Mahammad Humayoo
Xueqi Cheng
BDL OffRL
49
5
0
30 Oct 2018
Model-Based Active Exploration
Pranav Shyam
Wojciech Jaśkowski
Faustino J. Gomez
103
179
0
29 Oct 2018
Variational Inference with Tail-adaptive f-Divergence
Dilin Wang
Hao Liu
Qiang Liu
111
55
0
29 Oct 2018
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
Samuel Neumann
Sungsu Lim
A. Joseph
Yangchen Pan
Adam White
Martha White
128
7
0
22 Oct 2018
Establishing Appropriate Trust via Critical States
Sandy H. Huang
Kush S. Bhatia
Pieter Abbeel
Anca Dragan
OffRL
100
114
0
18 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM OffRL
194
144
0
15 Oct 2018
A Survey and Critique of Multiagent Deep Reinforcement Learning
Pablo Hernandez-Leal
Bilal Kartal
Matthew E. Taylor
OffRL
121
570
0
12 Oct 2018
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Shariq Iqbal
Fei Sha
74
761
0
05 Oct 2018
Boosting Trust Region Policy Optimization by Normalizing Flows Policy
Yunhao Tang
Shipra Agrawal
TPM
110
31
0
27 Sep 2018
S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning
Antonin Raffin
Ashley Hill
Kalifou René Traoré
Timothée Lesort
Natalia Díaz Rodríguez
David Filliat
OffRL
112
35
0
25 Sep 2018
A Learning Framework for High Precision Industrial Assembly
Yongxiang Fan
Jieliang Luo
Masayoshi Tomizuka
OffRL
92
50
0
23 Sep 2018
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
Ilya Kostrikov
Kumar Krishna Agrawal
Debidatta Dwibedi
Sergey Levine
Jonathan Tompson
111
259
0
09 Sep 2018
Unity: A General Platform for Intelligent Agents
Arthur Juliani
Vincent-Pierre Berges
Esh Vckay
Andrew Cohen
Jonathan Harper
...
Chris Goy
Yuan Gao
Hunter Henry
Marwan Mattar
Danny Lange
96
822
0
07 Sep 2018
Policy Optimization as Wasserstein Gradient Flows
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Lawrence Carin
88
68
0
09 Aug 2018
Variational Option Discovery Algorithms
Joshua Achiam
Harrison Edwards
Dario Amodei
Pieter Abbeel
DRL
81
180
0
26 Jul 2018
ToriLLE: Learning Environment for Hand-to-Hand Combat
Anssi Kanervisto
Ville Hautamaki
62
2
0
26 Jul 2018
Generative Adversarial Imitation from Observation
F. Torabi
Garrett A. Warnell
Peter Stone
GAN
106
245
0
17 Jul 2018
Remember and Forget for Experience Replay
G. Novati
Petros Koumoutsakos
OffRL
108
92
0
16 Jul 2018