ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,128 papers shown
Learning Action-Transferable Policy with Action Embedding
Yu Chen
Yingfeng Chen
Zhipeng Hu
Tianpei Yang
Changjie Fan
Yang Yu
Jianye Hao
86
0
0
05 Sep 2019
rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch
Adam Stooke
Pieter Abbeel
OffRL
79
98
0
03 Sep 2019
Generalization in Transfer Learning
S. E. Ada
Emre Ugur
H. L. Akin
76
18
0
03 Sep 2019
Reinforcement learning with world model
Jingbin Liu
Xinyang Gu
Shuai Liu
31
0
0
30 Aug 2019
Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning
Xudong Sun
B. Bischl
BDL
67
9
0
25 Aug 2019
Dynamics-aware Embeddings
William F. Whitney
Rajat Agarwal
Kyunghyun Cho
Abhinav Gupta
SSL
100
53
0
25 Aug 2019
A Comparison of Action Spaces for Learning Manipulation Tasks
Patrick Varin
Lev Grossman
S. Kuindersma
67
34
0
23 Aug 2019
Inverse Rational Control with Partially Observable Continuous Nonlinear Dynamics
Saurabh Daptardar
Paul Schrater
Xaq Pitkow
76
39
0
13 Aug 2019
Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Ofir Nachum
Michael Ahn
Hugo Ponte
S. Gu
Vikash Kumar
68
91
0
13 Aug 2019
A Review of Cooperative Multi-Agent Deep Reinforcement Learning
Afshin Oroojlooyjadid
Davood Hajinezhad
124
439
0
11 Aug 2019
A physics-informed reinforcement learning approach for the interfacial area transport in two-phase flow
Z. Dang
M. Ishii
AI4CE
36
9
0
06 Aug 2019
DoorGym: A Scalable Door Opening Environment And Baseline Agent
Y. Urakami
Alec Hodgkinson
Casey Carlin
Randall Leu
Luca Rigazio
Pieter Abbeel
OffRL
109
57
0
05 Aug 2019
A View on Deep Reinforcement Learning in System Optimization
Ameer Haj-Ali
Nesreen Ahmed
Theodore L. Willke
Joseph E. Gonzalez
Krste Asanović
Ion Stoica
OffRL
69
8
0
04 Aug 2019
Reinforcement Learning for Personalized Dialogue Management
Floris den Hengst
Mark Hoogendoorn
F. V. Harmelen
Joost Bosman
OffRL
49
20
0
01 Aug 2019
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee
Yuke Zhu
Peter Zachares
Matthew Tan
K. Srinivasan
Silvio Savarese
Fei-Fei Li
Animesh Garg
Jeannette Bohg
SSL
92
213
0
28 Jul 2019
Self-Imitation Learning of Locomotion Movements through Termination Curriculum
Amin Babadi
Kourosh Naderi
Perttu Hämäläinen
55
7
0
27 Jul 2019
A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment
Felix Leibfried
Sergio Pascual-Diaz
Jordi Grau-Moya
123
29
0
26 Jul 2019
Deep Reinforcement Learning for Autonomous Internet of Things: Model, Applications and Challenges
Lei Lei
Yue Tan
Kan Zheng
Shiwen Liu
K. Zheng
Xuemin Shen
OffRL
89
205
0
22 Jul 2019
Characterizing Attacks on Deep Reinforcement Learning
Xinlei Pan
Chaowei Xiao
Warren He
Shuang Yang
Jian Peng
...
Jinfeng Yi
Zijiang Yang
Mingyan D. Liu
Yue Liu
Basel Alomair
AAML
104
70
0
21 Jul 2019
Potential-Based Advice for Stochastic Policy Learning
Baicen Xiao
Bhaskar Ramasubramanian
Andrew Clark
Hannaneh Hajishirzi
L. Bushnell
Radha Poovendran
OffRL
29
5
0
20 Jul 2019
Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
Kristian Hartikainen
Xinyang Geng
Tuomas Haarnoja
Sergey Levine
SSL
116
82
0
18 Jul 2019
Proximal Policy Optimization with Mixed Distributed Training
Zhenyu Zhang
Xiangfeng Luo
Tong Liu
Shaorong Xie
Jianshu Wang
Wei Wang
Yongbin Li
Yan Peng
OffRL
41
21
0
15 Jul 2019
Neural Embedding for Physical Manipulations
Lingzhi Zhang
Andong Cao
Rui Li
Jianbo Shi
DRL
31
0
0
13 Jul 2019
A Convergence Result for Regularized Actor-Critic Methods
Wesley A Suttle
Zhuoran Yang
Jianchao Tan
Ji Liu
30
1
0
13 Jul 2019
Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling
Yuping Luo
Huazhe Xu
Tengyu Ma
SSL
87
13
0
12 Jul 2019
Deep Active Inference as Variational Policy Gradients
Beren Millidge
BDL
97
103
0
08 Jul 2019
Data Efficient Reinforcement Learning for Legged Robots
Yuxiang Yang
Ken Caluwaerts
Atil Iscen
Tingnan Zhang
Jie Tan
Vikas Sindhwani
85
141
0
08 Jul 2019
On-Policy Robot Imitation Learning from a Converging Supervisor
Ashwin Balakrishna
Brijen Thananjeyan
Jonathan Lee
Felix Li
Arsh Zahed
Joseph E. Gonzalez
Ken Goldberg
141
17
0
08 Jul 2019
Variational Inference MPC for Bayesian Model-based Reinforcement Learning
Masashi Okada
T. Taniguchi
79
80
0
08 Jul 2019
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms
Oliver Kroemer
S. Niekum
George Konidaris
153
369
0
06 Jul 2019
Benchmarking Model-Based Reinforcement Learning
Tingwu Wang
Xuchan Bao
I. Clavera
Jerrick Hoang
Yeming Wen
Eric D. Langlois
Matthew Shunshi Zhang
Guodong Zhang
Pieter Abbeel
Jimmy Ba
OffRL
122
365
0
03 Jul 2019
Dynamics-Aware Unsupervised Discovery of Skills
Archit Sharma
S. Gu
Sergey Levine
Vikash Kumar
Karol Hausman
134
414
0
02 Jul 2019
Modified Actor-Critics
Erinc Merdivan
S. Hanke
Matthieu Geist
45
2
0
02 Jul 2019
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
Alex X. Lee
Anusha Nagabandi
Pieter Abbeel
Sergey Levine
OffRL
BDL
115
383
0
01 Jul 2019
FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control
Longxiang Shi
Shijian Li
LongBing Cao
Long Yang
Gang Zheng
Gang Pan
28
5
0
01 Jul 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
159
344
0
30 Jun 2019
Learning Policies through Quantile Regression
Oliver Richter
Roger Wattenhofer
53
0
0
27 Jun 2019
Uncertainty-aware Model-based Policy Optimization
Tung-Long Vuong
Kenneth Tran
45
11
0
25 Jun 2019
Policy Optimization with Stochastic Mirror Descent
Long Yang
Yu Zhang
Gang Zheng
Qian Zheng
Pengfei Li
Jianhang Huang
Jun Wen
Gang Pan
128
34
0
25 Jun 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
103
111
0
25 Jun 2019
Deep Conservative Policy Iteration
Nino Vieillard
Olivier Pietquin
Matthieu Geist
42
26
0
24 Jun 2019
Ranking Policy Gradient
Kaixiang Lin
Jiayu Zhou
OffRL
67
7
0
24 Jun 2019
Disentangled Skill Embeddings for Reinforcement Learning
Janith C. Petangoda
Sergio Pascual-Diaz
Vincent Adam
Peter Vrancx
Jordi Grau-Moya
DRL
OffRL
65
15
0
21 Jun 2019
Exploring Model-based Planning with Policy Networks
Tingwu Wang
Jimmy Ba
128
150
0
20 Jun 2019
Calibrated Model-Based Deep Reinforcement Learning
Ali Malik
Volodymyr Kuleshov
Jiaming Song
Danny Nemer
Harlan Seymour
Stefano Ermon
159
55
0
19 Jun 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
131
965
0
19 Jun 2019
Reward Prediction Error as an Exploration Objective in Deep RL
Riley Simmons-Edler
Ben Eisner
Daniel Yang
Anthony Bisulco
E. Mitchell
Sebastian Seung
Daniel D. Lee
69
5
0
19 Jun 2019
Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Yiding Jiang
S. Gu
Kevin Patrick Murphy
Chelsea Finn
OffRL
69
225
0
18 Jun 2019
Hierarchical Soft Actor-Critic: Adversarial Exploration via Mutual Information Optimization
Ari Azarafrooz
John Brock
26
3
0
17 Jun 2019
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
101
58
0
17 Jun 2019