1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,128 papers shown
Title
State Estimation Using Particle Filtering in Adaptive Machine Learning Methods: Integrating Q-Learning and NEAT Algorithms with Noisy Radar Measurements
Wonjin Song
Feng Bao
64
0
0
10 Apr 2025
Bridging Deep Reinforcement Learning and Motion Planning for Model-Free Navigation in Cluttered Environments
Licheng Luo
Mingyu Cai
101
0
0
09 Apr 2025
Neuron-level Balance between Stability and Plasticity in Deep Reinforcement Learning
Jiahua Lan
Sen Zhang
Haixia Pan
Ruijun Liu
Li Shen
Dacheng Tao
CLL
91
0
0
09 Apr 2025
Deep Neural Koopman Operator-based Economic Model Predictive Control of Shipboard Carbon Capture System
Minghao Han
Xunyuan Yin
108
0
0
09 Apr 2025
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning
Chenjie Hao
Weyl Lu
Yifan Xu
Yubei Chen
48
0
0
09 Apr 2025
xMTF: A Formula-Free Model for Reinforcement-Learning-Based Multi-Task Fusion in Recommender Systems
Yang Cao
Changhao Zhang
Xiaoshuang Chen
Kaiqiao Zhan
Ben Wang
69
1
0
08 Apr 2025
Robo-taxi Fleet Coordination at Scale via Reinforcement Learning
Luigi Tresca
Carolin Schmidt
James Harrison
Filipe Rodrigues
G. Zardini
Daniele Gammelli
Marco Pavone
99
3
0
08 Apr 2025
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems
Chengzhi Lin
Annan Xie
Shuchang Liu
Wuhong Wang
Chuyuan Wang
Yongqi Liu
OffRL
61
0
0
08 Apr 2025
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL
LRM
144
0
0
08 Apr 2025
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models
J.S. van Hulst
W.P.M.H. Heemels
D.J. Antunes
OffRL
56
0
0
08 Apr 2025
An Information-Geometric Approach to Artificial Curiosity
Alexander Nedergaard
Pablo A. Morales
135
0
0
08 Apr 2025
A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks
L. Felizardo
Edoardo Fadda
Paolo Brandimarte
E. Del-Moral-Hernandez
Mariá Cristina Vasconcelos Nascimento
OffRL
81
0
0
07 Apr 2025
A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization
Wenyuan Xu
Xiaochen Zuo
Chao Xin
Yu Yue
Lin Yan
Yonghui Wu
OffRL
73
7
0
07 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
170
2
0
06 Apr 2025
Economic Battery Storage Dispatch with Deep Reinforcement Learning from Rule-Based Demonstrations
Manuel Sage
Martin Staniszewski
Yaoyao Fiona Zhao
98
2
0
06 Apr 2025
MInCo: Mitigating Information Conflicts in Distracted Visual Model-based Reinforcement Learning
Shiguang Sun
Hanbo Zhang
Zeyang Liu
Xinrui Yang
Lipeng Wan
Bing Yan
Xingyu Chen
232
0
0
05 Apr 2025
A General Peg-in-Hole Assembly Policy Based on Domain Randomized Reinforcement Learning
Xinyu Liu
Aljaz Kramberger
Leon Bodenhagen
57
0
0
05 Apr 2025
Optimistic Learning for Communication Networks
George Iosifidis
N. Mhaisen
D. Leith
OffRL
93
0
0
04 Apr 2025
MORAL: A Multimodal Reinforcement Learning Framework for Decision Making in Autonomous Laboratories
Natalie Tirabassi
Sathish A. P. Kumar
S. Jha
Arvind Ramanathan
LM&Ro
OffRL
85
0
0
04 Apr 2025
MAD: A Magnitude And Direction Policy Parametrization for Stability Constrained Reinforcement Learning
Luca Furieri
Sucheth Shenoy
Danilo Saccani
Andrea Martin
Giancarlo Ferrari-Trecate
54
0
0
03 Apr 2025
Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
Llewyn Salt
Marcus Gallagher
66
1
0
02 Apr 2025
Beyond Non-Expert Demonstrations: Outcome-Driven Action Constraint for Offline Reinforcement Learning
Ke Jiang
Wen Jiang
You Li
Xiaoyang Tan
OffRL
91
0
0
02 Apr 2025
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei
Bo Dai
Alekh Agarwal
Mohammad Ghavamzadeh
Csaba Szepesvári
Dale Schuurmans
143
4
0
02 Apr 2025
Inverse RL Scene Dynamics Learning for Nonlinear Predictive Control in Autonomous Vehicles
Sorin Grigorescu
Mihai V. Zaha
AI4CE
127
0
0
02 Apr 2025
MPCritic: A plug-and-play MPC architecture for reinforcement learning
Nathan P. Lawrence
Thomas Banker
Ali Mesbah
107
0
0
01 Apr 2025
A Reactive Framework for Whole-Body Motion Planning of Mobile Manipulators Combining Reinforcement Learning and SDF-Constrained Quadratic Programmi
Chenyu Zhang
Shiying Sun
Kuan Liu
Chuanbao Zhou
Xiaoguang Zhao
M. Tan
Yuanmin Huang
93
0
0
31 Mar 2025
A Survey of Reinforcement Learning-Based Motion Planning for Autonomous Driving: Lessons Learned from a Driving Task Perspective
Zhuoren Li
Guizhe Jin
Ran Yu
Zhiwen Chen
Nan I. Li
...
Lu Xiong
Bo Leng
Jia Hu
Ilya Kolmanovsky
Dimitar Filev
108
0
0
31 Mar 2025
MAER-Nav: Bidirectional Motion Learning Through Mirror-Augmented Experience Replay for Robot Navigation
Shanze Wang
Mingao Tan
Zhiyong Yang
Biao Huang
Xiaoyu Shen
Hailong Huang
Wei Zhang
59
0
0
31 Mar 2025
RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations
Enrico Marchesini
Benjamin Donnot
Constance Crozier
Ian Dytham
Christian Merz
Lars Schewe
Nico Westerbeck
Cathy Wu
Antoine Marot
P. Donti
OffRL
93
1
0
29 Mar 2025
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning
Abdullah Vanlioglu
144
0
0
28 Mar 2025
On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations
Rajdeep Singh Hundal
Yan Xiao
Xiaochun Cao
Jin Song Dong
Manuel Rigger
139
0
0
28 Mar 2025
Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch
Weizhen Wang
Jianping He
Xiaoming Duan
86
1
0
28 Mar 2025
FLAM: Foundation Model-Based Body Stabilization for Humanoid Locomotion and Manipulation
Xianqi Zhang
Hongliang Wei
Wenrui Wang
Xingtao Wang
Xiaopeng Fan
Debin Zhao
82
1
0
28 Mar 2025
Bresa: Bio-inspired Reflexive Safe Reinforcement Learning for Contact-Rich Robotic Tasks
Heng Zhang
Gokhan Solak
Arash Ajoudani
70
2
0
27 Mar 2025
Efficient Learning for Entropy-Regularized Markov Decision Processes via Multilevel Monte Carlo
Matthieu Meunier
C. Reisinger
Yufei Zhang
136
0
0
27 Mar 2025
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation
Hongye Cao
Fan Feng
Jing Huo
Shangdong Yang
Meng Fang
Tianpei Yang
Yang Gao
AAML
OffRL
115
0
0
26 Mar 2025
Offline Reinforcement Learning with Discrete Diffusion Skills
Ruixi Qiao
Jie Cheng
Xingyuan Dai
Yonglin Tian
Yisheng Lv
OffRL
104
0
0
26 Mar 2025
AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports
Xinsong Zhang
Qian Zhang
Longfei Han
Qiang Qu
Xiaoming Chen
VGen
103
1
0
26 Mar 2025
Risk-Aware Reinforcement Learning for Autonomous Driving: Improving Safety When Driving through Intersection
Bo Leng
Ran Yu
Wei Han
Lu Xiong
Zhuoren Li
Hailong Huang
125
1
0
25 Mar 2025
NeoRL-2: Near Real-World Benchmarks for Offline Reinforcement Learning with Extended Realistic Scenarios
Songyi Gao
Zuolin Tu
Rong-Jun Qin
Yi-Hao Sun
Xiong-Hui Chen
Yang Yu
OffRL
79
0
0
25 Mar 2025
Continual Reinforcement Learning for HVAC Systems Control: Integrating Hypernetworks and Transfer Learning
Gautham Udayakumar Bekal
Ahmed Ghareeb
Ashish Pujari
AI4CE
82
0
0
24 Mar 2025
Evolutionary Policy Optimization
Jianren Wang
Yifan Su
Abhinav Gupta
Deepak Pathak
86
0
0
24 Mar 2025
Sample-Efficient Reinforcement Learning of Koopman eNMPC
Daniel Mayfrank
M. Velioglu
Alexander Mitsos
Manuel Dahmen
OffRL
91
0
0
24 Mar 2025
Bootstrapped Model Predictive Control
Yuhang Wang
Hanwei Guo
Sizhe Wang
Long Qian
Xuguang Lan
122
1
0
24 Mar 2025
Predicting Multitasking in Manual and Automated Driving with Optimal Supervisory Control
Jussi Jokinen
Patrick Ebel
Tuomo Kujala
89
0
0
23 Mar 2025
KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies
Shih-Min Yang
Martin Magnusson
J. A. Stork
Todor Stoyanov
82
0
0
23 Mar 2025
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning
Yexin Li
Pring Wong
Hanfang Zhang
Shuo Chen
Siyuan Qi
OffRL
87
1
0
23 Mar 2025
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
121
0
0
22 Mar 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
457
0
0
21 Mar 2025
Self-Learning-Based Optimization for Free-form Pipe Routing in Aeroengine with Dynamic Design Environment
Caicheng Wang
Zili Wang
Shuyou Zhang
Yongzhe Xiang
Zhiyu Li
Jianrong Tan
AI4CE
56
0
0
20 Mar 2025