ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 1,645 papers shown
DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning
Zeyu Gao
Yao Mu
Jinye Qu
Mengkang Hu
Lingyue Guo
Ping Luo
Shanghang Zhang
Yanfeng Lu
59
10
0
14 Jun 2024
AutomaChef: A Physics-informed Demonstration-guided Learning Framework for Granular Material Manipulation
Minglun Wei
Xintong Yang
Yu-Kun Lai
S. A. Tafrishi
Ze Ji
AI4CE
33
0
0
13 Jun 2024
Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei
Aidan Scannell
Joni Pajarinen
OffRL
60
1
0
12 Jun 2024
Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction
Raja Farrukh Ali
Stephanie Milani
John Woods
Emmanuel Adenij
Ayesha Farooq
Clayton Mansel
Jeffrey Burns
William Hsu
35
0
0
11 Jun 2024
CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning
Zeyuan Liu
Kai Yang
Xiu Li
OffRL
49
0
0
11 Jun 2024
Decoupling regularization from the action space
Sobhan Mohammadpour
Emma Frejinger
Pierre-Luc Bacon
37
0
0
10 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
Decebal Constantin Mocanu
Matthew E. Taylor
59
0
0
10 Jun 2024
ICU-Sepsis: A Benchmark MDP Built from Real Medical Data
Kartik Choudhary
Dhawal Gupta
Philip S. Thomas
OOD
VLM
28
0
0
09 Jun 2024
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Utsav Singh
Pramit Bhattacharyya
Vinay P. Namboodiri
LM&Ro
49
1
0
09 Jun 2024
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
Qianlan Yang
Yu-Xiong Wang
OnRL
47
1
0
06 Jun 2024
AC4MPC: Actor-Critic Reinforcement Learning for Nonlinear Model Predictive Control
Rudolf Reiter
Andrea Ghezzi
Katrin Baumgärtner
Jasper Hoffmann
Robert D. McAllister
Moritz Diehl
39
6
0
06 Jun 2024
Robust Deep Reinforcement Learning against Adversarial Behavior Manipulation
Shojiro Yamabe
Kazuto Fukuchi
Jun Sakuma
AAML
68
0
0
06 Jun 2024
DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays
Bo Xia
Yilun Kong
Yongzhe Chang
Bo Yuan
Zhiheng Li
Xueqian Wang
Bin Liang
OffRL
55
3
0
05 Jun 2024
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
OffRL
33
0
0
03 Jun 2024
Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets
Haoran He
C. Chang
Huazhe Xu
Ling Pan
94
6
0
03 Jun 2024
Do's and Don'ts: Learning Desirable Skills with Instruction Videos
Hyunseung Kim
ByungKun Lee
Hojoon Lee
Dongyoon Hwang
Donghu Kim
Jaegul Choo
39
1
0
01 Jun 2024
HOPE: A Reinforcement Learning-based Hybrid Policy Path Planner for Diverse Parking Scenarios
Mingyang Jiang
Yueyuan Li
Songan Zhang
Siyuan Chen
Chunxiang Wang
Ming Yang
51
4
0
31 May 2024
Video-Language Critic: Transferable Reward Functions for Language-Conditioned Robotics
Minttu Alakuijala
Reginald McLean
Isaac Woungang
Nariman Farsad
Samuel Kaski
Pekka Marttinen
Kai Yuan
LM&Ro
48
1
0
30 May 2024
May the Dance be with You: Dance Generation Framework for Non-Humanoids
Hyemin Ahn
DiffM
VGen
33
1
0
30 May 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
50
2
0
30 May 2024
Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Bernd Frauenknecht
Artur Eisele
Devdutt Subhasish
Friedrich Solowjow
Sebastian Trimpe
54
5
0
29 May 2024
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
Dohyeong Kim
Taehyun Cho
Seung Han
Hojun Chung
Kyungjae Lee
Songhwai Oh
39
1
0
29 May 2024
RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning
Mingqi Yuan
Roger Creus Castanyer
Bo Li
Xin Jin
Glen Berseth
Wenjun Zeng
45
0
0
29 May 2024
Counterfactual Explanations for Multivariate Time-Series without Training Datasets
Xiangyu Sun
Raquel Aoki
Kevin H. Wilson
29
1
0
28 May 2024
A Pontryagin Perspective on Reinforcement Learning
Onno Eberhard
Claire Vernade
Michael Muehlebach
45
2
0
28 May 2024
Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement Learning
Aneesh Muppidi
Zhiyu Zhang
Heng Yang
39
4
0
26 May 2024
Pausing Policy Learning in Non-stationary Reinforcement Learning
Hyunin Lee
Ming Jin
Javad Lavaei
Somayeh Sojoudi
OffRL
47
2
0
25 May 2024
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate
Fan Luo
Zuolin Tu
Zefang Huang
Yang Yu
OffRL
45
0
0
24 May 2024
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Sheng Yue
Jiani Liu
Xingyuan Hua
Ju Ren
Sen Lin
Junshan Zhang
Yaoxue Zhang
OffRL
36
3
0
24 May 2024
Model-free reinforcement learning with noisy actions for automated experimental control in optics
Lea Richtmann
Viktoria-S. Schmiesing
Dennis Wilken
Jan Heine
Aaron Tranter
Avishek Anand
Tobias J. Osborne
M. Heurs
40
2
0
24 May 2024
Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning
Hector Kohler
Quentin Delfosse
R. Akrour
Kristian Kersting
Philippe Preux
67
14
0
23 May 2024
Reinforcing Language Agents via Policy Optimization with Action Decomposition
Muning Wen
Bo Liu
Weinan Zhang
Jun Wang
Ying Wen
51
8
0
23 May 2024
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Junghyuk Yeom
Yonghyeon Jo
Jungmo Kim
Sanghyeon Lee
Seungyul Han
OffRL
56
2
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
45
0
23 May 2024
Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning
Xin Liu
Yaran Chen
Dong Zhao
50
1
0
20 May 2024
An Efficient Learning Control Framework With Sim-to-Real for String-Type Artificial Muscle-Driven Robotic Systems
Jiyue Tao
Yunsong Zhang
S. Rajendran
Feitian Zhang
38
0
0
17 May 2024
Neural Network Compression for Reinforcement Learning Tasks
Dmitry A. Ivanov
D. Larionov
Oleg V. Maslennikov
V. Voevodin
OffRL
AI4CE
55
0
0
13 May 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
36
3
0
13 May 2024
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
Yuwei Zeng
Yao Mu
Lin Shao
44
12
0
12 May 2024
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Xiaoyu Wen
Chenjia Bai
Kang Xu
Xudong Yu
Yang Zhang
Xuelong Li
Zhen Wang
46
2
0
10 May 2024
The Curse of Diversity in Ensemble-Based Exploration
Zhixuan Lin
P. D'Oro
Evgenii Nikishin
Rameswar Panda
55
1
0
07 May 2024
Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies
Paul Templier
Emmanuel Rachelson
Antoine Cully
Dennis G. Wilson
31
0
0
07 May 2024
Logic-Skill Programming: An Optimization-based Approach to Sequential Skill Planning
Teng Xue
Amirreza Razmjoo
Suhan Shetty
Sylvain Calinon
37
4
0
07 May 2024
Linear Convergence of Independent Natural Policy Gradient in Games with Entropy Regularization
Youbang Sun
Tao-Wen Liu
P. R. Kumar
Shahin Shahrampour
42
0
0
04 May 2024
CTD4 -- A Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple Critics
David Valencia
Henry Williams
Trevor Gee
Bruce A. MacDonald
Minas V. Liarokapis
OffRL
40
2
0
04 May 2024
S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic
Safa Messaoud
Billel Mokeddem
Zhenghai Xue
Linsey Pang
Bo An
Haipeng Chen
Sanjay Chawla
51
3
0
02 May 2024
Employing Federated Learning for Training Autonomous HVAC Systems
Fredrik Hagström
Vikas K. Garg
Fabricio Oliveira
AI4CE
76
0
0
01 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
51
2
0
30 Apr 2024
Generalize by Touching: Tactile Ensemble Skill Transfer for Robotic Furniture Assembly
Hao-ming Lin
Radu Corcodel
Ding Zhao
45
7
0
26 Apr 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
43
0
0
25 Apr 2024