ResearchTrend.AI

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Title
Real-time scheduling of renewable power systems through planning-based reinforcement learning
Shao-Wei Liu
Jinbo Liu
Weirui Ye
Nan Yang
Guanglu Zhang
...
C. Kang
Qirong Jiang
Xuri Song
Fangchun Di
Yang Gao
73
4
0
09 Mar 2023
GOATS: Goal Sampling Adaptation for Scooping with Curriculum Reinforcement Learning
Yaru Niu
Shiyu Jin
Zeqing Zhang
Jiacheng Zhu
Ding Zhao
Liangjun Zhang
106
7
0
09 Mar 2023
Inference on Optimal Dynamic Policies via Softmax Approximation
Qizhao Chen
Morgane Austern
Vasilis Syrgkanis
OffRL
93
1
0
08 Mar 2023
Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint
Taisuke Kobayashi
119
3
0
08 Mar 2023
ConBaT: Control Barrier Transformer for Safe Policy Learning
Yue Meng
Sai H. Vemprala
Rogerio Bonatti
Chuchu Fan
Ashish Kapoor
OffRL
80
3
0
07 Mar 2023
A Strategy-Oriented Bayesian Soft Actor-Critic Model
Qin Yang
Ramviyas Parasuraman
73
8
0
07 Mar 2023
A Multiplicative Value Function for Safe and Efficient Reinforcement Learning
Nick Bührer
Zhejun Zhang
Alexander Liniger
Feng Yu
Luc Van Gool
67
1
0
07 Mar 2023
Decoupling Skill Learning from Robotic Control for Generalizable Object Manipulation
Kai Lu
Bo Yang
Bing Wang
Andrew Markham
83
4
0
07 Mar 2023
MAP-Elites with Descriptor-Conditioned Gradients and Archive Distillation into a Single Policy
Maxence Faldor
Félix Chalumeau
Manon Flageat
Antoine Cully
92
19
0
07 Mar 2023
Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning
Pengqin Wang
Meixin Zhu
Shaojie Shen
OffRL
90
1
0
07 Mar 2023
Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning
Mostafa Kotb
C. Weber
S. Wermter
76
4
0
07 Mar 2023
End-to-End Learning of Deep Visuomotor Policy for Needle Picking
Hongbin Lin
Bin Li
Xiangyu Chu
Qi Dou
Yunhui Liu
K. W. S. Au
OffRL
85
5
0
07 Mar 2023
Controlled Diversity with Preference : Towards Learning a Diverse Set of Desired Skills
Maxence Hussonnois
Thommen George Karimpanal
Santu Rana
78
5
0
07 Mar 2023
Evolutionary Reinforcement Learning: A Survey
Hui Bai
Ran Cheng
Yaochu Jin
OffRL
142
56
0
07 Mar 2023
Dexterous In-hand Manipulation by Guiding Exploration with Simple Sub-skill Controllers
Gagan Khandate
C. Mehlman
Xingsheng Wei
M. Ciocarlie
64
3
0
06 Mar 2023
Sampling-based Exploration for Reinforcement Learning of Dexterous Manipulation
Gagan Khandate
Siqi Shang
Eric Chang
Tristan L. Saidi
Yang Liu
Seth Matthew Dennis
Johnson Adams
M. Ciocarlie
99
32
0
06 Mar 2023
Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed Environments
Jun Yamada
J. Collins
Ingmar Posner
78
8
0
06 Mar 2023
Seq2Seq Imitation Learning for Tactile Feedback-based Manipulation
Wenyan Yang
A. Angleraud
R. Pieters
Joni Pajarinen
Joni-Kristian Kämäräinen
98
11
0
05 Mar 2023
Bounding the Optimal Value Function in Compositional Reinforcement Learning
Jacob Adamczyk
Volodymyr Makarenko
A. Arriojas
Stas Tiomkin
R. Kulkarni
OffRL
73
2
0
05 Mar 2023
Virtual Guidance as a Mid-level Representation for Navigation with Augmented Reality
Hsuan-Kung Yang
Tsung-Chih Chiang
Tingxin Liu
Chun-Wei Huang
Jou-Min Liu
Tsu-Ching Hsiao
Chun-Yi Lee
69
1
0
05 Mar 2023
CFlowNets: Continuous Control with Generative Flow Networks
Yinchuan Li
Shuang Luo
Haozhi Wang
Jianye Hao
148
23
0
04 Mar 2023
Demonstration-guided Deep Reinforcement Learning for Coordinated Ramp Metering and Perimeter Control in Large Scale Networks
Zijian Hu
Wei-Ying Ma
39
5
0
04 Mar 2023
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
Amarildo Likmeta
Matteo Sacco
Alberto Maria Metelli
Marcello Restelli
OffRL
74
5
0
04 Mar 2023
FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation
Zhou Xian
Bo Zhu
Zhenjia Xu
H. Tung
Antonio Torralba
Katerina Fragkiadaki
Chuang Gan
98
47
0
04 Mar 2023
Hindsight States: Blending Sim and Real Task Elements for Efficient Reinforcement Learning
Simon Guist
Jan Schneider-Barnes
Alexander Dittrich
V. Berenz
Bernhard Schölkopf
Le Chen
100
3
0
03 Mar 2023
How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement
Xu-Hui Liu
Feng Xu
Xinyu Zhang
Tianyuan Liu
Shengyi Jiang
Rui Chen
Zongzhang Zhang
Yang Yu
123
12
0
03 Mar 2023
Decision Transformer under Random Frame Dropping
Kaizhe Hu
Rachel Zheng
Yang Gao
Huazhe Xu
OffRL
177
13
0
03 Mar 2023
Guarded Policy Optimization with Imperfect Online Demonstrations
Zhenghai Xue
Zhenghao Peng
Quanyi Li
Zhihan Liu
Bolei Zhou
OffRL
80
12
0
03 Mar 2023
RePreM: Representation Pre-training with Masked Model for Reinforcement Learning
Yuanying Cai
Wei Shen
Wei Shen
Xuyun Zhang
Wenjie Ruan
Longbo Huang
OffRL
99
5
0
03 Mar 2023
Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning
Archit Sharma
Ahmed M. Ahmed
Rehaan Ahmad
Chelsea Finn
SSL
137
18
0
02 Mar 2023
The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting
Hongyao Tang
Hao Fei
Jianye Hao
69
1
0
02 Mar 2023
PlaNet-ClothPick: Effective Fabric Flattening Based on Latent Dynamic Planning
Halid Abdulrahim Kadi
K. Terzic
73
1
0
02 Mar 2023
Resource-Constrained Station-Keeping for Helium Balloons using Reinforcement Learning
Jack D. Saunders
Loïc Prenevost
Özgür Simsek
Alan Hunter
Wenbin Li
18
1
0
02 Mar 2023
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Jonas Rothfuss
Bhavya Sukhija
Tobias Birchler
Parnian Kassraie
Andreas Krause
OffRL
90
10
0
02 Mar 2023
Reshaping Viscoelastic-String Path-Planner (RVP)
Sarvesh Mayilvahanan
Akshay Sarvesh
Swaminathan Gopalswamy
37
0
0
02 Mar 2023
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
Yinzhen Xu
Weikang Wan
Jialiang Zhang
Haoran Liu
Zikang Shan
...
Yijia Weng
Jiayi Chen
Tengyu Liu
Li Yi
He Wang
189
120
0
02 Mar 2023
LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning
Firas Al-Hafez
Davide Tateo
Oleg Arenz
Guoping Zhao
Jan Peters
71
24
0
01 Mar 2023
A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning
Woojun Kim
Whiyoung Jung
Myungsik Cho
Young-Jin Sung
53
7
0
01 Mar 2023
AR3n: A Reinforcement Learning-based Assist-As-Needed Controller for Robotic Rehabilitation
Shrey Pareek
Harris J. Nisar
T. Kesavadas
23
9
0
28 Feb 2023
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
Carolin Schmidt
Daniele Gammelli
Francisco Câmara Pereira
Filipe Rodrigues
OffRL
77
5
0
28 Feb 2023
Human-Inspired Framework to Accelerate Reinforcement Learning
Ali Beikmohammadi
Sindri Magnússon
OffRL
86
4
0
28 Feb 2023
Policy Dispersion in Non-Markovian Environment
B. Qu
Xiaofeng Cao
Jielong Yang
Hechang Chen
Chang Yi
Ivor W. Tsang
Yew-Soon Ong
63
0
0
28 Feb 2023
Learning Sparse Control Tasks from Pixels by Latent Nearest-Neighbor-Guided Explorations
Ruihan Zhao
Ufuk Topcu
Sandeep Chinchali
Mariano Phielipp
53
4
0
28 Feb 2023
Taylor TD-learning
Michele Garibbo
Maxime Robeyns
Laurence Aitchison
OffRL
67
1
0
27 Feb 2023
(Re)$^2$H2O: Autonomous Driving Scenario Generation via Reversely Regularized Hybrid Offline-and-Online Reinforcement Learning
Haoyi Niu
Kun Ren
Yi Tian Xu
Ziyuan Yang
Yi-Hsin Lin
Yan Zhang
Jianming Hu
OffRL
87
9
0
27 Feb 2023
High-Precise Robot Arm Manipulation based on Online Iterative Learning and Forward Simulation with Positioning Error Below End-Effector Physical Minimum Displacement
Weiming Qu
Tianlin Liu
D. Luo
77
2
0
26 Feb 2023
Diffusion Model-Augmented Behavioral Cloning
Shangcheng Chen
Hsiang-Chun Wang
Ming-Hao Hsu
Chun-Mao Lai
Shao-Hua Sun
DiffM
153
31
0
26 Feb 2023
Reinforcement Learning Based Pushing and Grasping Objects from Ungraspable Poses
Hao Zhang
Hongzhuo Liang
Lin Cong
Jianzhi Lyu
Long Zeng
Pingfa Feng
Jian-Wei Zhang
SSL
DRL
62
10
0
26 Feb 2023
Hierarchical Needs-driven Agent Learning Systems: From Deep Reinforcement Learning To Diverse Strategies
Qin Yang
38
2
0
25 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
81
0
0
25 Feb 2023