ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,128 papers shown
Planning under Uncertainty to Goal Distributions
Adam Conkey
Tucker Hermans
84
3
0
01 Jul 2025
Robust Dynamic Material Handling via Adaptive Constrained Evolutionary Reinforcement Learning
Chengpeng Hu
Ziming Wang
Bo Yuan
Jialin Liu
Chengqi Zhang
Xin Yao
27
0
0
20 Jun 2025
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning
Guozheng Ma
Lu Li
Zilin Wang
Li Shen
Pierre-Luc Bacon
Dacheng Tao
OffRL
27
0
0
20 Jun 2025
Robust Reinforcement Learning for Discrete Compositional Generation via General Soft Operators
Marco Jiralerspong
E. Derman
Danilo Vucetic
Nikolay Malkin
Bilun Sun
Tianyu Zhang
Pierre-Luc Bacon
Gauthier Gidel
OffRL
28
0
0
20 Jun 2025
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation
Kosuke Nakanishi
Akihiro Kubo
Yuji Yasui
Shin Ishii
AAML, OffRL
31
0
0
20 Jun 2025
DRARL: Disengagement-Reason-Augmented Reinforcement Learning for Efficient Improvement of Autonomous Driving Policy
Weitao Zhou
Bo Zhang
Zhong Cao
X. Li
Qian Cheng
Chunyang Liu
Y. Zhang
Diange Yang
27
0
0
20 Jun 2025
Data-Driven Policy Mapping for Safe RL-based Energy Management Systems
Theo Zangato
A. Osmani
Pegah Alizadeh
20
0
0
19 Jun 2025
BIDA: A Bi-level Interaction Decision-making Algorithm for Autonomous Vehicles in Dynamic Traffic Scenarios
Liyang Yu
Tianyi Wang
Junfeng Jiao
Fengwu Shan
Hongqing Chu
B. Gao
15
0
0
19 Jun 2025
Distribution Parameter Actor-Critic: Shifting the Agent-Environment Boundary for Diverse Action Spaces
Jiamin He
A. Rupam Mahmood
Martha White
24
0
0
19 Jun 2025
GoalLadder: Incremental Goal Discovery with Vision-Language Models
Alexey Zakharov
Shimon Whiteson
24
0
0
19 Jun 2025
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Roger Creus Castanyer
J. Obando-Ceron
Lu Li
Pierre-Luc Bacon
Glen Berseth
Aaron Courville
Pablo Samuel Castro
36
0
0
18 Jun 2025
Learning Task-Agnostic Skill Bases to Uncover Motor Primitives in Animal Behaviors
Jiyi Wang
Jingyang Ke
Bo Dai
Anqi Wu
17
0
0
18 Jun 2025
CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization
Ranting Hu
OffRL
38
0
0
18 Jun 2025
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
Andrew Wagenmaker
Mitsuhiko Nakamoto
Yunchu Zhang
S. Park
Waleed Yagoub
Anusha Nagabandi
Abhishek Gupta
Sergey Levine
OffRL
39
0
0
18 Jun 2025
IntelliLung: Advancing Safe Mechanical Ventilation using Offline RL with Hybrid Actions and Clinically Aligned Rewards
Muhammad Hamza Yousuf
Jason Li
S. Vahdati
Raphael Theilen
Jakob Wittenstein
Jens Lehmann
OffRL
22
0
0
17 Jun 2025
Reasoning with Exploration: An Entropy Perspective
Daixuan Cheng
Shaohan Huang
Xuekai Zhu
Bo Dai
Wayne Xin Zhao
Zhenliang Zhang
Furu Wei
LRM
38
0
0
17 Jun 2025
Learning Swing-up Maneuvers for a Suspended Aerial Manipulation Platform in a Hierarchical Control Framework
Hemjyoti Das
Minh Nhat Vu
Christian Ott
23
0
0
16 Jun 2025
Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy
Amornyos Horprasert
Esa Apriaskar
Xingyu Liu
Lanlan Su
Lyudmila S. Mihaylova
32
0
0
16 Jun 2025
Scaling Algorithm Distillation for Continuous Control with Mamba
Samuel Beaussant
Mehdi Mounsif
30
0
0
16 Jun 2025
A Novel ViDAR Device With Visual Inertial Encoder Odometry and Reinforcement Learning-Based Active SLAM Method
Zhanhua Xin
Zhihao Wang
Shenghao Zhang
Wanchao Chi
Yan Meng
Shihan Kong
Yan Xiong
Chong Zhang
Yuzhen Liu
Junzhi Yu
22
0
0
16 Jun 2025
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Tung M. Luu
Younghwan Lee
Donghoon Lee
Sunho Kim
Min Jun Kim
Chang D. Yoo
ALM, VLM
25
0
0
15 Jun 2025
Flow-Based Policy for Online Reinforcement Learning
Lei Lv
Y. Li
Yu-Juan Luo
F. Sun
Tao Kong
Jiafeng Xu
Xiao Ma
28
0
0
15 Jun 2025
CIRO7.2: A Material Network with Circularity of -7.2 and Reinforcement-Learning-Controlled Robotic Disassembler
Federico Zocco
Monica Malvezzi
17
0
0
13 Jun 2025
Palpation Alters Auditory Pain Expressions with Gender-Specific Variations in Robopatients
Chapa Sirithunge
Yue Xie
Saitarun Nadipineni
Fumiya Iida
Thilina Dulantha Lalitharatne
87
0
0
13 Jun 2025
DoublyAware: Dual Planning and Policy Awareness for Temporal Difference Learning in Humanoid Locomotion
Khang Nguyen
An T. Le
Jan Peters
Minh Nhat Vu
25
0
0
12 Jun 2025
Wasserstein Barycenter Soft Actor-Critic
Zahra Shahrooei
Ali Baheri
OffRL
61
0
0
11 Jun 2025
Bipedal Balance Control with Whole-body Musculoskeletal Standing and Falling Simulations
Chengtian Ma
Yunyue Wei
Chenhui Zuo
Chen Zhang
Yanan Sui
70
0
0
11 Jun 2025
On a few pitfalls in KL divergence gradient estimation for RL
Yunhao Tang
Rémi Munos
64
0
0
11 Jun 2025
Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design
Andreas Schlaginhaufen
Reda Ouhamma
Maryam Kamgarpour
76
0
0
11 Jun 2025
Time-Aware World Model for Adaptive Prediction and Control
Anh N. Nhu
Sanghyun Son
Ming-Chyuan Lin
AI4TS, TTA
38
0
0
10 Jun 2025
Dynamical System Optimization
Emo Todorov
31
0
0
10 Jun 2025
Re4MPC: Reactive Nonlinear MPC for Multi-model Motion Planning via Deep Reinforcement Learning
Neset Unver Akmandor
Sarvesh Prajapati
Mark Zolotas
T. Padır
24
0
0
10 Jun 2025
Intention-Conditioned Flow Occupancy Models
Chongyi Zheng
S. Park
Sergey Levine
Benjamin Eysenbach
AI4TS, OffRL, AI4CE
48
0
0
10 Jun 2025
Your Agent Can Defend Itself against Backdoor Attacks
Li Changjiang
Liang Jiacheng
Cao Bochuan
Chen Jinghui
Wang Ting
AAML, LLMAG
53
0
0
10 Jun 2025
Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood
Qingmao Yao
Zhichao Lei
Tianyuan Chen
Ziyue Yuan
Xuefan Chen
Jianxiang Liu
Faguo Wu
Xiao Zhang
OffRL
33
1
0
10 Jun 2025
Deep Reinforcement Learning-Based Motion Planning and PDE Control for Flexible Manipulators
Amir Hossein Barjini
Seyed Adel Alizadeh Kolagar
Sadeq Yaqubi
Jouni Mattila
20
0
0
10 Jun 2025
MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning
Yihong Guo
Yu Yang
Pan Xu
Anqi Liu
OffRL
47
0
0
10 Jun 2025
Graph-Assisted Stitching for Offline Hierarchical Reinforcement Learning
Seungho Baek
Taegeon Park
Jongchan Park
Seungjun Oh
Yusung Kim
OffRL
33
0
0
09 Jun 2025
Reliable Critics: Monotonic Improvement and Convergence Guarantees for Reinforcement Learning
Eshwar S. R.
Gugan Thoppe
Aditya Gopalan
Gal Dalal
20
0
0
08 Jun 2025
Learning What Matters Now: A Dual-Critic Context-Aware RL Framework for Priority-Driven Information Gain
Dimitris Panagopoulos
Adolfo Perrusquía
Weisi Guo
26
0
0
07 Jun 2025
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
Motoki Omura
Kazuki Ota
Takayuki Osa
Yusuke Mukuta
Tatsuya Harada
OffRL
48
0
0
06 Jun 2025
Self driving algorithm for an active four wheel drive racecar
Gergely Bari
Laszlo Palkovics
72
0
0
06 Jun 2025
AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification
Geonwoo Cho
Jaemoon Lee
Jaegyun Im
Subi Lee
Jihwan Lee
Sundong Kim
40
0
0
06 Jun 2025
Self-Predictive Dynamics for Generalization of Vision-based Reinforcement Learning
Kyungsoo Kim
Jeongsoo Ha
Yusung Kim
BDL
47
7
0
05 Jun 2025
AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization
Saeed Hedayatian
Stefanos Nikolaidis
19
0
0
05 Jun 2025
When Maximum Entropy Misleads Policy Optimization
Ruipeng Zhang
Ya-Chien Chang
Sicun Gao
48
0
0
05 Jun 2025
Verification-Guided Falsification for Safe RL via Explainable Abstraction and Risk-Aware Exploration
Tuan Le
Risal Shahriar Shefin
Debashis Gupta
Thai Le
Sarra Alqahtani
OffRL
99
0
0
04 Jun 2025
Horizon Reduction Makes RL Scalable
Seohong Park
Kevin Frans
Deepinder Mann
Benjamin Eysenbach
Aviral Kumar
Sergey Levine
OffRL
96
0
0
04 Jun 2025
FLIP: Flowability-Informed Powder Weighing
Nikola Radulov
Alex Wright
Thomas Little
Andrew I. Cooper
Gabriella Pizzuto
96
0
0
04 Jun 2025
Latent Guided Sampling for Combinatorial Optimization
Sobihan Surendran
Adeline Fermanian
Sylvain Le Corff
BDL, OffRL
99
0
0
04 Jun 2025