arXiv: 1801.01290

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 1,748 papers shown
Title
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
Matthieu Geist
Shie Mannor
50
1
0
12 Mar 2023
Inference on Optimal Dynamic Policies via Softmax Approximation
Qizhao Chen
Morgane Austern
Vasilis Syrgkanis
OffRL
50
1
0
08 Mar 2023
Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint
Taisuke Kobayashi
58
3
0
08 Mar 2023
A Multiplicative Value Function for Safe and Efficient Reinforcement Learning
Nick Bührer
Zhejun Zhang
Alexander Liniger
Feng Yu
Luc Van Gool
33
1
0
07 Mar 2023
Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning
Pengqin Wang
Meixin Zhu
Shaojie Shen
OffRL
38
1
0
07 Mar 2023
Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed Environments
Jun Yamada
J. Collins
Ingmar Posner
55
8
0
06 Mar 2023
Seq2Seq Imitation Learning for Tactile Feedback-based Manipulation
Wenyan Yang
A. Angleraud
R. Pieters
Joni Pajarinen
Joni-Kristian Kämäräinen
55
6
0
05 Mar 2023
Virtual Guidance as a Mid-level Representation for Navigation with Augmented Reality
Hsuan-Kung Yang
Tsung-Chih Chiang
Tingxin Liu
Chun-Wei Huang
Jou-Min Liu
Tsu-Ching Hsiao
Chun-Yi Lee
39
1
0
05 Mar 2023
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
Amarildo Likmeta
Matteo Sacco
Alberto Maria Metelli
Marcello Restelli
OffRL
31
3
0
04 Mar 2023
Hindsight States: Blending Sim and Real Task Elements for Efficient Reinforcement Learning
Simon Guist
Jan Schneider-Barnes
Alexander Dittrich
V. Berenz
Bernhard Schölkopf
Le Chen
41
3
0
03 Mar 2023
Guarded Policy Optimization with Imperfect Online Demonstrations
Zhenghai Xue
Zhenghao Peng
Quanyi Li
Zhihan Liu
Bolei Zhou
OffRL
53
10
0
03 Mar 2023
Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning
Archit Sharma
Ahmed M. Ahmed
Rehaan Ahmad
Chelsea Finn
SSL
61
17
0
02 Mar 2023
The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting
Hongyao Tang
Hao Fei
Jianye Hao
31
1
0
02 Mar 2023
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Jonas Rothfuss
Bhavya Sukhija
Tobias Birchler
Parnian Kassraie
Andreas Krause
OffRL
34
10
0
02 Mar 2023
LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning
Firas Al-Hafez
Davide Tateo
Oleg Arenz
Guoping Zhao
Jan Peters
36
22
0
01 Mar 2023
A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning
Woojun Kim
Whiyoung Jung
Myungsik Cho
Young-Jin Sung
35
7
0
01 Mar 2023
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
Carolin Schmidt
Daniele Gammelli
Francisco Câmara Pereira
Filipe Rodrigues
OffRL
24
4
0
28 Feb 2023
Human-Inspired Framework to Accelerate Reinforcement Learning
Ali Beikmohammadi
Sindri Magnússon
OffRL
34
4
0
28 Feb 2023
(Re)$^2$H2O: Autonomous Driving Scenario Generation via Reversely Regularized Hybrid Offline-and-Online Reinforcement Learning
Haoyi Niu
Kun Ren
Yi Tian Xu
Ziyuan Yang
Yi-Hsin Lin
Yan Zhang
Jianming Hu
OffRL
26
9
0
27 Feb 2023
High-Precise Robot Arm Manipulation based on Online Iterative Learning and Forward Simulation with Positioning Error Below End-Effector Physical Minimum Displacement
Weiming Qu
Tianlin Liu
D. Luo
31
2
0
26 Feb 2023
Diffusion Model-Augmented Behavioral Cloning
Shangcheng Chen
Hsiang-Chun Wang
Ming-Hao Hsu
Chun-Mao Lai
Shao-Hua Sun
DiffM
70
31
0
26 Feb 2023
Reinforcement Learning Based Pushing and Grasping Objects from Ungraspable Poses
Hao Zhang
Hongzhuo Liang
Lin Cong
Jianzhi Lyu
Long Zeng
Pingfa Feng
Jian-Wei Zhang
SSL
DRL
39
9
0
26 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
29
0
0
25 Feb 2023
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar
Rishabh Agarwal
Pablo Samuel Castro
Utku Evci
CLL
53
91
0
24 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
58
5
0
24 Feb 2023
Neural Laplace Control for Continuous-time Delayed Systems
Samuel Holt
Alihan Huyuk
Zhaozhi Qian
Hao Sun
M. Schaar
OffRL
49
10
0
24 Feb 2023
Why Target Networks Stabilise Temporal Difference Methods
Matt Fellows
Matthew Smith
Shimon Whiteson
OOD
AAML
26
7
0
24 Feb 2023
Model-Based Uncertainty in Value Functions
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
43
14
0
24 Feb 2023
A Supervisory Learning Control Framework for Autonomous & Real-time Task Planning for an Underactuated Cooperative Robotic task
Sander De Witte
Tom Lefebvre
Thijs Van Hauwermeiren
Guillaume Crevecoeur
26
0
0
22 Feb 2023
Learning Agile Flights through Narrow Gaps with Varying Angles using Onboard Sensing
Yuhan Xie
Minghao Lu
Rui Peng
Peng Lu
42
9
0
22 Feb 2023
Reinforcement Learning for Block Decomposition of CAD Models
Benjamin C. DiPrete
R. Garimella
Cristina Garcia-Cardona
Navamita Ray
33
1
0
21 Feb 2023
Improving Deep Policy Gradients with Value Function Search
Enrico Marchesini
Chris Amato
31
9
0
20 Feb 2023
Differentiable Arbitrating in Zero-sum Markov Games
Jing Wang
Meichen Song
Feng Gao
Boyi Liu
Zhaoran Wang
Yi Wu
54
2
0
20 Feb 2023
Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot
Tao Huang
Kai-xiang Chen
Bin Li
Yunhui Liu
Qingxu Dou
40
23
0
20 Feb 2023
Stochastic Generative Flow Networks
L. Pan
Dinghuai Zhang
Moksh Jain
Longbo Huang
Yoshua Bengio
BDL
55
31
0
19 Feb 2023
Reinforcement Learning in the Wild with Maximum Likelihood-based Model Transfer
Hannes Eriksson
D. Basu
Tommy Tram
Mina Alibeigi
Christos Dimitrakakis
26
1
0
18 Feb 2023
Meta-Reinforcement Learning via Exploratory Task Clustering
Zhendong Chu
Hongning Wang
OffRL
41
5
0
15 Feb 2023
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning
Siliang Zeng
Chenliang Li
Alfredo García
Min-Fong Hong
OffRL
44
13
0
15 Feb 2023
Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs
Stefan Werner
Sebastian Peitz
58
9
0
14 Feb 2023
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Yunke Wang
Bo Du
Chang Xu
43
8
0
13 Feb 2023
Improving robot navigation in crowded environments using intrinsic rewards
Diego Martínez Baselga
L. Riazuelo
Luis Montano
45
13
0
13 Feb 2023
MANSA: Learning Fast and Slow in Multi-Agent Systems
D. Mguni
Hao Chen
Taher Jafferjee
Jianhong Wang
Long Fei
Xidong Feng
Stephen Marcus McAleer
Feifei Tong
Jun Wang
Yaodong Yang
35
1
0
12 Feb 2023
Distributional GFlowNets with Quantile Flows
Dinghuai Zhang
L. Pan
Ricky T. Q. Chen
Aaron Courville
Yoshua Bengio
41
25
0
11 Feb 2023
Cross-domain Random Pre-training with Prototypes for Reinforcement Learning
Xin Liu
Yaran Chen
Haoran Li
Boyu Li
Dong Zhao
SSL
75
10
0
11 Feb 2023
Scalability Bottlenecks in Multi-Agent Reinforcement Learning Systems
Kailash Gogineni
Peng Wei
Tian-Shing Lan
Guru Venkataramani
29
8
0
10 Feb 2023
CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning
Sheng Yue
Guan-Bo Wang
Wei Shao
Zhaofeng Zhang
Sen Lin
Junkai Ren
Junshan Zhang
OffRL
59
20
0
09 Feb 2023
RayNet: A Simulation Platform for Developing Reinforcement Learning-Driven Network Protocols
Luca Giacomoni
Basil Benny
G. Parisis
33
3
0
09 Feb 2023
Learning Interaction-aware Motion Prediction Model for Decision-making in Autonomous Driving
Zhiyu Huang
Haochen Liu
Jingda Wu
Wenhui Huang
Chen Lv
41
17
0
08 Feb 2023
Predictable MDP Abstraction for Unsupervised Model-Based RL
Seohong Park
Sergey Levine
32
9
0
08 Feb 2023
NeuronsGym: A Hybrid Framework and Benchmark for Robot Tasks with Sim2Real Policy Learning
Haoran Li
Shasha Liu
Mingjun Ma
Guangzheng Hu
Yaran Chen
Dong Zhao
41
3
0
07 Feb 2023