ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Title
Constrained Reinforcement Learning with Smoothed Log Barrier Function
Baohe Zhang
Yuan Zhang
Lilli Frison
Thomas Brox
Joschka Bödecker
100
8
0
21 Mar 2024
Heuristic Algorithm-based Action Masking Reinforcement Learning (HAAM-RL) with Ensemble Inference Method
Kyuwon Choi
Cheolkyun Rho
Taeyoun Kim
D. Choi
OffRL
65
0
0
21 Mar 2024
Waypoint-Based Reinforcement Learning for Robot Manipulation Tasks
Shaunak A. Mehta
Soheil Habibian
Dylan P. Losey
SSL
122
3
0
20 Mar 2024
Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning
Mirco Theile
Hongpeng Cao
Marco Caccamo
Alberto L. Sangiovanni-Vincentelli
79
2
0
19 Mar 2024
Policy Bifurcation in Safe Reinforcement Learning
Wenjun Zou
Yao Lyu
Jie Li
Yujie Yang
Shengbo Eben Li
Jingliang Duan
Xianyuan Zhan
Jingjing Liu
Yaqin Zhang
Keqiang Li
144
1
0
19 Mar 2024
FootstepNet: an Efficient Actor-Critic Method for Fast On-line Bipedal Footstep Planning and Forecasting
Clément Gaspard
G. Passault
Mélodie Daniel
Olivier Ly
47
1
0
19 Mar 2024
Reinforcement Learning from Delayed Observations via World Models
Armin Karamzade
Kyungmin Kim
Montek Kalsi
Roy Fox
107
8
0
18 Mar 2024
Phasic Diversity Optimization for Population-Based Reinforcement Learning
Jingcheng Jiang
Haiyin Piao
Yu Fu
Yihang Hao
Chuanlu Jiang
Ziqi Wei
Xin Yang
67
0
0
17 Mar 2024
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
Yudong Luo
Yangchen Pan
Han Wang
Philip Torr
Pascal Poupart
134
3
0
17 Mar 2024
Bridging the Gap between Discrete Agent Strategies in Game Theory and Continuous Motion Planning in Dynamic Environments
Hongrui Zheng
Zhijun Zhuang
Stephanie Wu
Shuo Yang
Rahul Mangharam
63
1
0
17 Mar 2024
Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization
Sai Prasanna
Karim Farid
Raghu Rajan
André Biedenkapp
123
6
0
16 Mar 2024
Deep Reinforcement Learning-based Large-scale Robot Exploration
Yuhong Cao
Rui Zhao
Yizhuo Wang
Bairan Xiang
Guillaume Sartoretti
113
14
0
16 Mar 2024
Identifying Optimal Launch Sites of High-Altitude Latex-Balloons using Bayesian Optimisation for the Task of Station-Keeping
Jack D. Saunders
Sajad Saeedi
Adam Hartshorne
Binbin Xu
Özgür Simsek
Alan Hunter
Wenbin Li
34
0
0
16 Mar 2024
Scheduling Drone and Mobile Charger via Hybrid-Action Deep Reinforcement Learning
Jizhe Dou
Haotian Zhang
Guodong Sun
96
0
0
16 Mar 2024
Diffusion-Reinforcement Learning Hierarchical Motion Planning in Multi-agent Adversarial Games
Zixuan Wu
Sean Ye
Manisha Natarajan
Matthew C. Gombolay
152
6
0
16 Mar 2024
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation
Carmelo Sferrazza
Dun-Ming Huang
Xingyu Lin
Youngwoon Lee
Pieter Abbeel
136
48
0
15 Mar 2024
Stimulate the Potential of Robots via Competition
K. Huang
Di Guo
Xinyu Zhang
Xiangyang Ji
Huaping Liu
101
3
0
15 Mar 2024
Online Policy Learning from Offline Preferences
Guoxi Zhang
Han Bao
Hisashi Kashima
OffRL
105
0
0
15 Mar 2024
BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation
Chengshu Li
Ruohan Zhang
J. Wong
Cem Gokmen
S. Srivastava
...
Silvio Savarese
H. Gweon
Chenxi Liu
Jiajun Wu
Fei-Fei Li
VGen, LM&Ro, VLM
77
40
0
14 Mar 2024
ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models
Runyu Ma
Jelle Luijkx
Zlatan Ajanović
Jens Kober
LM&Ro, LRM
123
9
0
14 Mar 2024
DIFFTACTILE: A Physics-based Differentiable Tactile Simulator for Contact-rich Robotic Manipulation
Zilin Si
Gu Zhang
Qingwei Ben
Branden Romero
Zhou Xian
Chao Liu
Chuang Gan
81
22
0
13 Mar 2024
AutoDFP: Automatic Data-Free Pruning via Channel Similarity Reconstruction
Siqi Li
Jun Chen
Jingyang Xiang
Chengrui Zhu
Yong-Jin Liu
78
0
0
13 Mar 2024
Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
Motoki Omura
Takayuki Osa
Yusuke Mukuta
Tatsuya Harada
OffRL
51
0
0
12 Mar 2024
Constrained Optimal Fuel Consumption of HEV: A Constrained Reinforcement Learning Approach
Shuchang Yan
45
1
0
12 Mar 2024
Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer
Dipesh Tamboli
Jiayu Chen
Kiran Pranesh Jotheeswaran
Denny Yu
Vaneet Aggarwal
OffRL, AI4CE
107
4
0
12 Mar 2024
Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation
Chengxing Jia
Fuxiang Zhang
Yi-Chen Li
Chenxiao Gao
Xu-Hui Liu
Lei Yuan
Zongzhang Zhang
Yang Yu
AAML
83
4
0
12 Mar 2024
Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning
Junseok Park
Yoonsung Kim
Hee Bin Yoo
Min Whoo Lee
Kibeom Kim
Won-Seok Choi
Minsu Lee
Byoung-Tak Zhang
OffRL
70
1
0
11 Mar 2024
Tactical Decision Making for Autonomous Trucks by Deep Reinforcement Learning with Total Cost of Operation Based Reward
Deepthi Pathare
Leo Laine
M. Chehreghani
75
1
0
11 Mar 2024
DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning
Xuefeng Wang
Henglin Pu
Hyung Jun Kim
Husheng Li
65
2
0
11 Mar 2024
Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach
Narim Jeong
Donghwan Lee
59
1
0
11 Mar 2024
Dissecting Deep RL with High Update Ratios: Combatting Value Divergence
Marcel Hussing
C. Voelcker
Igor Gilitschenski
Amir-massoud Farahmand
Eric Eaton
103
3
0
09 Mar 2024
Conservative DDPG -- Pessimistic RL without Ensemble
Nitsan Soffair
Shie Mannor
OffRL
54
0
0
08 Mar 2024
Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning
Hongjoon Ahn
Jinu Hyeon
Youngmin Oh
Bosun Hwang
Taesup Moon
CLL, OnRL
69
2
0
08 Mar 2024
Learning Speed Adaptation for Flight in Clutter
Guangyu Zhao
Tianyue Wu
Yeke Chen
Fei Gao
97
7
0
07 Mar 2024
Fill-and-Spill: Deep Reinforcement Learning Policy Gradient Methods for Reservoir Operation Decision and Control
Sadegh Sadeghi Tabas
Vidya Samadi
30
0
0
07 Mar 2024
Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems
Wesley A Suttle
Vipul K Sharma
K. Kosaraju
S. Sivaranjani
Ji Liu
Vijay Gupta
Brian M Sadler
72
1
0
06 Mar 2024
A Survey on Applications of Reinforcement Learning in Spatial Resource Allocation
Di Zhang
Moyang Wang
Joseph D Mango
Xiang Li
Xianrui Xu
109
1
0
06 Mar 2024
World Models for Autonomous Driving: An Initial Survey
Yanchen Guan
Haicheng Liao
Zhenning Li
Jia Hu
Runze Yuan
Yunjian Li
Guohui Zhang
Chengzhong Xu
160
43
0
05 Mar 2024
Koopman-Assisted Reinforcement Learning
Preston Rozwood
Edward Mehrez
Ludger Paehler
Wen Sun
Steven L. Brunton
109
10
0
04 Mar 2024
An Efficient Model-Based Approach on Learning Agile Motor Skills without Reinforcement
Hao-bin Shi
Tingguang Li
Qing Zhu
Jiapeng Sheng
Lei Han
Max Q.-H. Meng
72
1
0
04 Mar 2024
Tsallis Entropy Regularization for Linearly Solvable MDP and Linear Quadratic Regulator
Yota Hashizume
Koshi Oishi
Kenji Kashima
92
1
0
04 Mar 2024
Feint Behaviors and Strategies: Formalization, Implementation and Evaluation
Junyu Liu
Wangkai Jin
OffRL
62
0
0
04 Mar 2024
Towards Provable Log Density Policy Gradient
Pulkit Katdare
Anant Joshi
Katherine Driggs-Campbell
69
0
0
03 Mar 2024
SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation
Noriaki Hirose
Dhruv Shah
Kyle Stachowicz
A. Sridhar
Sergey Levine
126
5
0
01 Mar 2024
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang
Shaohuai Liu
Weirui Ye
Jiacheng You
Yang Gao
OffRL
110
15
0
01 Mar 2024
Conflict-Averse Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning
Dohyeong Kim
Mineui Hong
Jeongho Park
Songhwai Oh
78
0
0
01 Mar 2024
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Yifei Zhou
Andrea Zanette
Jiayi Pan
Sergey Levine
Aviral Kumar
160
79
0
29 Feb 2024
Temporal-Aware Deep Reinforcement Learning for Energy Storage Bidding in Energy and Contingency Reserve Markets
Jinhao Li
Changlong Wang
Yanru Zhang
Hao Wang
40
5
0
29 Feb 2024
Disentangling the Causes of Plasticity Loss in Neural Networks
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
H. V. Hasselt
Razvan Pascanu
James Martens
Will Dabney
AI4CE
136
38
0
29 Feb 2024
Symmetry-aware Reinforcement Learning for Robotic Assembly under Partial Observability with a Soft Wrist
Hai Nguyen
Tadashi Kozuno
C. C. Beltran-Hernandez
Masashi Hamaya
111
8
0
28 Feb 2024