ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Versions: v1, v2 (latest)

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,128 papers shown
State Estimation Using Particle Filtering in Adaptive Machine Learning Methods: Integrating Q-Learning and NEAT Algorithms with Noisy Radar Measurements
Wonjin Song
Feng Bao
64
0
0
10 Apr 2025
Bridging Deep Reinforcement Learning and Motion Planning for Model-Free Navigation in Cluttered Environments
Licheng Luo
Mingyu Cai
101
0
0
09 Apr 2025
Neuron-level Balance between Stability and Plasticity in Deep Reinforcement Learning
Jiahua Lan
Sen Zhang
Haixia Pan
Ruijun Liu
Li Shen
Dacheng Tao
CLL
91
0
0
09 Apr 2025
Deep Neural Koopman Operator-based Economic Model Predictive Control of Shipboard Carbon Capture System
Minghao Han
Xunyuan Yin
108
0
0
09 Apr 2025
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning
Chenjie Hao
Weyl Lu
Yifan Xu
Yubei Chen
48
0
0
09 Apr 2025
xMTF: A Formula-Free Model for Reinforcement-Learning-Based Multi-Task Fusion in Recommender Systems
Yang Cao
Changhao Zhang
Xiaoshuang Chen
Kaiqiao Zhan
Ben Wang
69
1
0
08 Apr 2025
Robo-taxi Fleet Coordination at Scale via Reinforcement Learning
Luigi Tresca
Carolin Schmidt
James Harrison
Filipe Rodrigues
G. Zardini
Daniele Gammelli
Marco Pavone
99
3
0
08 Apr 2025
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems
Chengzhi Lin
Annan Xie
Shuchang Liu
Wuhong Wang
Chuyuan Wang
Yongqi Liu
OffRL
61
0
0
08 Apr 2025
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL, LRM
144
0
0
08 Apr 2025
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models
J.S. van Hulst
W.P.M.H. Heemels
D.J. Antunes
OffRL
56
0
0
08 Apr 2025
An Information-Geometric Approach to Artificial Curiosity
Alexander Nedergaard
Pablo A. Morales
135
0
0
08 Apr 2025
A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks
L. Felizardo
Edoardo Fadda
Paolo Brandimarte
E. Del-Moral-Hernandez
Mariá Cristina Vasconcelos Nascimento
OffRL
81
0
0
07 Apr 2025
A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization
Wenyuan Xu
Xiaochen Zuo
Chao Xin
Yu Yue
Lin Yan
Yonghui Wu
OffRL
73
7
0
07 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
170
2
0
06 Apr 2025
Economic Battery Storage Dispatch with Deep Reinforcement Learning from Rule-Based Demonstrations
Manuel Sage
Martin Staniszewski
Yaoyao Fiona Zhao
98
2
0
06 Apr 2025
MInCo: Mitigating Information Conflicts in Distracted Visual Model-based Reinforcement Learning
Shiguang Sun
Hanbo Zhang
Zeyang Liu
Xinrui Yang
Lipeng Wan
Bing Yan
Xingyu Chen
232
0
0
05 Apr 2025
A General Peg-in-Hole Assembly Policy Based on Domain Randomized Reinforcement Learning
Xinyu Liu
Aljaz Kramberger
Leon Bodenhagen
57
0
0
05 Apr 2025
Optimistic Learning for Communication Networks
George Iosifidis
N. Mhaisen
D. Leith
OffRL
93
0
0
04 Apr 2025
MORAL: A Multimodal Reinforcement Learning Framework for Decision Making in Autonomous Laboratories
Natalie Tirabassi
Sathish A. P. Kumar
S. Jha
Arvind Ramanathan
LM&Ro, OffRL
85
0
0
04 Apr 2025
MAD: A Magnitude And Direction Policy Parametrization for Stability Constrained Reinforcement Learning
Luca Furieri
Sucheth Shenoy
Danilo Saccani
Andrea Martin
Giancarlo Ferrari-Trecate
54
0
0
03 Apr 2025
Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
Llewyn Salt
Marcus Gallagher
66
1
0
02 Apr 2025
Beyond Non-Expert Demonstrations: Outcome-Driven Action Constraint for Offline Reinforcement Learning
Ke Jiang
Wen Jiang
You Li
Xiaoyang Tan
OffRL
91
0
0
02 Apr 2025
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei
Bo Dai
Alekh Agarwal
Mohammad Ghavamzadeh
Csaba Szepesvári
Dale Schuurmans
143
4
0
02 Apr 2025
Inverse RL Scene Dynamics Learning for Nonlinear Predictive Control in Autonomous Vehicles
Sorin Grigorescu
Mihai V. Zaha
AI4CE
127
0
0
02 Apr 2025
MPCritic: A plug-and-play MPC architecture for reinforcement learning
Nathan P. Lawrence
Thomas Banker
Ali Mesbah
107
0
0
01 Apr 2025
A Reactive Framework for Whole-Body Motion Planning of Mobile Manipulators Combining Reinforcement Learning and SDF-Constrained Quadratic Programmi
Chenyu Zhang
Shiying Sun
Kuan Liu
Chuanbao Zhou
Xiaoguang Zhao
M. Tan
Yuanmin Huang
93
0
0
31 Mar 2025
A Survey of Reinforcement Learning-Based Motion Planning for Autonomous Driving: Lessons Learned from a Driving Task Perspective
Zhuoren Li
Guizhe Jin
Ran Yu
Zhiwen Chen
Nan I. Li
...
Lu Xiong
Bo Leng
Jia Hu
Ilya Kolmanovsky
Dimitar Filev
108
0
0
31 Mar 2025
MAER-Nav: Bidirectional Motion Learning Through Mirror-Augmented Experience Replay for Robot Navigation
Shanze Wang
Mingao Tan
Zhiyong Yang
Biao Huang
Xiaoyu Shen
Hailong Huang
Wei Zhang
59
0
0
31 Mar 2025
RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations
Enrico Marchesini
Benjamin Donnot
Constance Crozier
Ian Dytham
Christian Merz
Lars Schewe
Nico Westerbeck
Cathy Wu
Antoine Marot
P. Donti
OffRL
93
1
0
29 Mar 2025
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning
Abdullah Vanlioglu
144
0
0
28 Mar 2025
On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations
Rajdeep Singh Hundal
Yan Xiao
Xiaochun Cao
Jin Song Dong
Manuel Rigger
139
0
0
28 Mar 2025
Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch
Weizhen Wang
Jianping He
Xiaoming Duan
86
1
0
28 Mar 2025
FLAM: Foundation Model-Based Body Stabilization for Humanoid Locomotion and Manipulation
Xianqi Zhang
Hongliang Wei
Wenrui Wang
Xingtao Wang
Xiaopeng Fan
Debin Zhao
82
1
0
28 Mar 2025
Bresa: Bio-inspired Reflexive Safe Reinforcement Learning for Contact-Rich Robotic Tasks
Heng Zhang
Gokhan Solak
Arash Ajoudani
70
2
0
27 Mar 2025
Efficient Learning for Entropy-Regularized Markov Decision Processes via Multilevel Monte Carlo
Matthieu Meunier
C. Reisinger
Yufei Zhang
136
0
0
27 Mar 2025
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation
Hongye Cao
Fan Feng
Jing Huo
Shangdong Yang
Meng Fang
Tianpei Yang
Yang Gao
AAML, OffRL
115
0
0
26 Mar 2025
Offline Reinforcement Learning with Discrete Diffusion Skills
Ruixi Qiao
Jie Cheng
Xingyuan Dai
Yonglin Tian
Yisheng Lv
OffRL
104
0
0
26 Mar 2025
AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports
Xinsong Zhang
Qian Zhang
Longfei Han
Qiang Qu
Xiaoming Chen
VGen
103
1
0
26 Mar 2025
Risk-Aware Reinforcement Learning for Autonomous Driving: Improving Safety When Driving through Intersection
Bo Leng
Ran Yu
Wei Han
Lu Xiong
Zhuoren Li
Hailong Huang
125
1
0
25 Mar 2025
NeoRL-2: Near Real-World Benchmarks for Offline Reinforcement Learning with Extended Realistic Scenarios
Songyi Gao
Zuolin Tu
Rong-Jun Qin
Yi-Hao Sun
Xiong-Hui Chen
Yang Yu
OffRL
79
0
0
25 Mar 2025
Continual Reinforcement Learning for HVAC Systems Control: Integrating Hypernetworks and Transfer Learning
Gautham Udayakumar Bekal
Ahmed Ghareeb
Ashish Pujari
AI4CE
82
0
0
24 Mar 2025
Evolutionary Policy Optimization
Jianren Wang
Yifan Su
Abhinav Gupta
Deepak Pathak
86
0
0
24 Mar 2025
Sample-Efficient Reinforcement Learning of Koopman eNMPC
Daniel Mayfrank
M. Velioglu
Alexander Mitsos
Manuel Dahmen
OffRL
91
0
0
24 Mar 2025
Bootstrapped Model Predictive Control
Yuhang Wang
Hanwei Guo
Sizhe Wang
Long Qian
Xuguang Lan
122
1
0
24 Mar 2025
Predicting Multitasking in Manual and Automated Driving with Optimal Supervisory Control
Jussi Jokinen
Patrick Ebel
Tuomo Kujala
89
0
0
23 Mar 2025
KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies
Shih-Min Yang
Martin Magnusson
J. A. Stork
Todor Stoyanov
82
0
0
23 Mar 2025
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning
Yexin Li
Pring Wong
Hanfang Zhang
Shuo Chen
Siyuan Qi
OffRL
87
1
0
23 Mar 2025
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
121
0
0
22 Mar 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
457
0
0
21 Mar 2025
Self-Learning-Based Optimization for Free-form Pipe Routing in Aeroengine with Dynamic Design Environment
Caicheng Wang
Zili Wang
Shuyou Zhang
Yongzhe Xiang
Zhiyu Li
Jianrong Tan
AI4CE
56
0
0
20 Mar 2025