ResearchTrend.AI

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 1,645 papers shown
Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
Wei Zhou
Yiying Li
Yongxin Yang
Huaimin Wang
Timothy M. Hospedales
OffRL
34
46
0
11 Mar 2020
SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks
Bohan Wu
Feng Xu
Zhanpeng He
Abhi Gupta
Peter K. Allen
OffRL
23
13
0
10 Mar 2020
Stable Policy Optimization via Off-Policy Divergence Regularization
Ahmed Touati
Amy Zhang
Joelle Pineau
Pascal Vincent
OffRL
36
17
0
09 Mar 2020
Scaling MAP-Elites to Deep Neuroevolution
Cédric Colas
Joost Huizinga
Vashisht Madhavan
Jeff Clune
33
86
0
03 Mar 2020
Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path
Keita Ota
Y. Sasaki
Devesh K. Jha
Yusuke Yoshiyasu
Asako Kanezaki
27
18
0
03 Mar 2020
Out-of-Distribution Generalization via Risk Extrapolation (REx)
David M. Krueger
Ethan Caballero
J. Jacobsen
Amy Zhang
Jonathan Binas
Dinghuai Zhang
Rémi Le Priol
Aaron Courville
OOD
215
908
0
02 Mar 2020
PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference
Masashi Okada
Norio Kosaka
T. Taniguchi
13
43
0
01 Mar 2020
Reinforcement Learning through Active Inference
Alexander Tschantz
Beren Millidge
A. Seth
Christopher L. Buckley
AI4CE
36
69
0
28 Feb 2020
Exploration-efficient Deep Reinforcement Learning with Demonstration Guidance for Robot Control
Ke Lin
Liang Gong
Xudong Li
Te Sun
Binhao Chen
Chengliang Liu
Zhengfeng Zhang
Jian Pu
Junping Zhang
24
8
0
27 Feb 2020
Policy Evaluation Networks
J. Harb
Tom Schaul
Doina Precup
Pierre-Luc Bacon
OffRL
20
36
0
26 Feb 2020
Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Benjamin Eysenbach
Xinyang Geng
Sergey Levine
Ruslan Salakhutdinov
OffRL
18
86
0
25 Feb 2020
Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration
Hoang Trung-Dung
Yitao Liang
Guy Van den Broeck
OffRL
22
3
0
25 Feb 2020
Learning to Walk in the Real World with Minimal Human Effort
Sehoon Ha
P. Xu
Zhenyu Tan
Sergey Levine
Jie Tan
31
169
0
20 Feb 2020
Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning
Noah Y. Siegel
Jost Tobias Springenberg
Felix Berkenkamp
A. Abdolmaleki
Michael Neunert
Thomas Lampe
Roland Hafner
Nicolas Heess
Martin Riedmiller
OffRL
22
282
0
19 Feb 2020
Value-driven Hindsight Modelling
A. Guez
Fabio Viola
T. Weber
Lars Buesing
Steven Kapturowski
Doina Precup
David Silver
N. Heess
OffRL
32
12
0
19 Feb 2020
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
Qingfeng Lan
Yangchen Pan
Alona Fyshe
Martha White
24
176
0
16 Feb 2020
Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning
Yannick Schroecker
Charles Isbell
OffRL
36
12
0
15 Feb 2020
Learning Pregrasp Manipulation of Objects from Ungraspable Poses
Zhaole Sun
Kai Yuan
Wenbin Hu
Chuanyu Yang
Zhibin Li
SSL
27
28
0
15 Feb 2020
Robust Reinforcement Learning via Adversarial training with Langevin Dynamics
Parameswaran Kamalaruban
Yu-ting Huang
Ya-Ping Hsieh
Paul Rolland
C. Shi
V. Cevher
31
60
0
14 Feb 2020
On the Sensory Commutativity of Action Sequences for Embodied Agents
Hugo Caselles-Dupré
Michael Garcia Ortiz
David Filliat
21
4
0
13 Feb 2020
BRPO: Batch Residual Policy Optimization
Kentaro Kanamori
Yinlam Chow
Takuya Takagi
Hiroki Arimura
Honglak Lee
Ken Kobayashi
Craig Boutilier
OffRL
141
46
0
08 Feb 2020
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces
Bogdan Mazoure
T. Doan
Tianyu Li
V. Makarenkov
Joelle Pineau
Doina Precup
Guillaume Rabusseau
OffRL
21
1
0
07 Feb 2020
Off-policy Maximum Entropy Reinforcement Learning : Soft Actor-Critic with Advantage Weighted Mixture Policy(SAC-AWMP)
Zhimin Hou
Kuangen Zhang
Yi Wan
Dongyu Li
Chenglong Fu
Haoyong Yu
27
15
0
07 Feb 2020
Ready Policy One: World Building Through Active Learning
Philip J. Ball
Jack Parker-Holder
Aldo Pacchiano
K. Choromanski
Stephen J. Roberts
OffRL
32
49
0
07 Feb 2020
Interpretable End-to-end Urban Autonomous Driving with Latent Deep Reinforcement Learning
Jianyu Chen
Shengbo Eben Li
Masayoshi Tomizuka
57
226
0
23 Jan 2020
Cooperative Highway Work Zone Merge Control based on Reinforcement Learning in A Connected and Automated Environment
Tianzhu Ren
Yuanchang Xie
Liming Jiang
22
31
0
21 Jan 2020
Gradient Surgery for Multi-Task Learning
Tianhe Yu
Saurabh Kumar
Abhishek Gupta
Sergey Levine
Karol Hausman
Chelsea Finn
41
1,175
0
19 Jan 2020
Population-Guided Parallel Policy Search for Reinforcement Learning
Whiyoung Jung
Giseung Park
Y. Sung
OffRL
24
38
0
09 Jan 2020
Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors
Jingliang Duan
Yang Guan
Shengbo Eben Li
Yangang Ren
B. Cheng
OffRL
25
174
0
09 Jan 2020
Perception and Navigation in Autonomous Systems in the Era of Learning: A Survey
Yang Tang
Chaoqiang Zhao
Jianrui Wang
Chongzhen Zhang
Qiyu Sun
Weixing Zheng
W. Du
Feng Qian
Jürgen Kurths
20
65
0
08 Jan 2020
High-speed Autonomous Drifting with Deep Reinforcement Learning
Peide Cai
Xiaodong Mei
L. Tai
Yuxiang Sun
Ming Liu
19
109
0
06 Jan 2020
Making Sense of Reinforcement Learning and Probabilistic Inference
Brendan O'Donoghue
Ian Osband
Catalin Ionescu
OffRL
29
48
0
03 Jan 2020
Joint Goal and Strategy Inference across Heterogeneous Demonstrators via Reward Network Distillation
Letian Chen
Rohan R. Paleja
Muyleng Ghuy
Matthew C. Gombolay
30
38
0
02 Jan 2020
SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
Keng Wah Loon
L. Graesser
Milan Cvitkovic
OffRL
26
13
0
28 Dec 2019
Discrete and Continuous Action Representation for Practical RL in Video Games
Olivier Delalleau
Maxim Peter
Eloi Alonso
Adrien Logut
25
52
0
23 Dec 2019
A Survey of Deep Reinforcement Learning in Video Games
Kun Shao
Zhentao Tang
Yuanheng Zhu
Nannan Li
Dongbin Zhao
OffRL
AI4TS
43
188
0
23 Dec 2019
Variational Recurrent Models for Solving Partially Observable Control Tasks
Dongqi Han
Kenji Doya
Jun Tani
DRL
OffRL
21
59
0
23 Dec 2019
Direct and indirect reinforcement learning
Yang Guan
Shengbo Eben Li
Jingliang Duan
Jie Li
Yangang Ren
Qi Sun
B. Cheng
OffRL
38
34
0
23 Dec 2019
Recruitment-imitation Mechanism for Evolutionary Reinforcement Learning
Shuai Lu
Shuai Han
Wenbo Zhou
Junwei Zhang
29
26
0
13 Dec 2019
Combining Q-Learning and Search with Amortized Value Estimates
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
Tobias Pfaff
T. Weber
Lars Buesing
Peter W. Battaglia
OffRL
32
47
0
05 Dec 2019
Human-Robot Collaboration via Deep Reinforcement Learning of Real-World Interactions
Jonas Tjomsland
A. Shafti
Aldo A. Faisal
19
6
0
02 Dec 2019
Multi-Vehicle Mixed-Reality Reinforcement Learning for Autonomous Multi-Lane Driving
Rupert Mitchell
Jenny Fletcher
Jacopo Panerati
Amanda Prorok
32
17
0
26 Nov 2019
Adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints
Jingliang Duan
Zhengyu Liu
Shengbo Eben Li
Qi Sun
Zhenzhong Jia
B. Cheng
23
64
0
26 Nov 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
68
1,184
0
24 Nov 2019
Implicit Generative Modeling for Efficient Exploration
Neale Ratzlaff
Qinxun Bai
Fuxin Li
Wenyuan Xu
27
12
0
19 Nov 2019
IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks
Youngwoon Lee
E. Hu
Zhengyu Yang
Alexander Yin
Joseph J. Lim
36
122
0
17 Nov 2019
Deep Reinforcement Learning Attitude Control of Fixed-Wing UAVs Using Proximal Policy Optimization
Eivind Bøhn
E. M. Coates
Signe Moe
T. Johansen
28
129
0
13 Nov 2019
Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning
Qiang Ma
Suwen Ge
Danyang He
D. Thaker
Iddo Drori
11
186
0
12 Nov 2019
Real-Time Reinforcement Learning
Simon Ramstedt
C. Pal
AI4CE
19
62
0
11 Nov 2019
Multi-Path Policy Optimization
L. Pan
Qingpeng Cai
Longbo Huang
18
2
0
11 Nov 2019