ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
UDUC: An Uncertainty-driven Approach for Learning-based Robust Control
Yuan Zhang
Jasper Hoffmann
Joschka Boedecker
85
0
0
04 May 2024
Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline
Wenjia Meng
Qian Zheng
Long Yang
Yilong Yin
Gang Pan
OffRL
93
0
0
04 May 2024
CTD4 -- A Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple Critics
David Valencia
Henry Williams
Trevor Gee
Bruce A. MacDonald
Minas V. Liarokapis
OffRL
176
2
0
04 May 2024
Towards Improving Learning from Demonstration Algorithms via MCMC Methods
Carl Qi
Edward Sun
Harry Zhang
OffRL
107
0
0
03 May 2024
Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach
Anton Plaksin
Vitaly Kalev
77
1
0
03 May 2024
Intelligent Switching for Reset-Free RL
Darshan Patil
Janarthanan Rajendran
Glen Berseth
Sarath Chandar
100
0
0
02 May 2024
CityLearn v2: Energy-flexible, resilient, occupant-centric, and carbon-aware management of grid-interactive communities
Kingsley Nweye
Kathryn Kaspar
Giacomo Buscemi
Tiago Fonseca
G. Pinto
...
Luis Lino Ferreira
Tianzhen Hong
Mohamed Ouf
Alfonso Capozzoli
Zoltán Nagy
76
11
0
02 May 2024
Goal-conditioned reinforcement learning for ultrasound navigation guidance
A. Amadou
Vivek Singh
Florin-Cristian Ghesu
Young-Ho Kim
Laura Stanciulescu
Harshitha P. Sai
Puneet Sharma
Alistair Young
Ronak Rajani
K. Rhode
68
4
0
02 May 2024
Towards Interpretable Reinforcement Learning with Constrained Normalizing Flow Policies
Finn Rietz
Erik Schaffernicht
Stefan Heinrich
J. A. Stork
AI4CE
76
0
0
02 May 2024
Leveraging Procedural Generation for Learning Autonomous Peg-in-Hole Assembly in Space
Andrej Orsula
Matthieu Geist
Miguel Olivares-Mendez
Carol Martinez
35
1
0
02 May 2024
LOQA: Learning with Opponent Q-Learning Awareness
Milad Aghajohari
Juan Agustin Duque
Tim Cooijmans
Rameswar Panda
74
4
0
02 May 2024
S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic
Safa Messaoud
Billel Mokeddem
Zhenghai Xue
Linsey Pang
Bo An
Haipeng Chen
Sanjay Chawla
119
5
0
02 May 2024
Self-Play Preference Optimization for Language Model Alignment
Yue Wu
Zhiqing Sun
Huizhuo Yuan
Kaixuan Ji
Yiming Yang
Quanquan Gu
149
145
0
01 May 2024
Learning Tactile Insertion in the Real World
Daniel Palenicek
Theo Gruner
Tim Schneider
Alina Böhm
Janis Lenz
Inga Pfenning
Eric Krämer
Jan Peters
102
2
0
01 May 2024
Employing Federated Learning for Training Autonomous HVAC Systems
Fredrik Hagström
Vikas Garg
Fabricio Oliveira
AI4CE
167
0
0
01 May 2024
FOTS: A Fast Optical Tactile Simulator for Sim2Real Learning of Tactile-motor Robot Manipulation Skills
Yongqiang Zhao
Kun Qian
Boyi Duan
Shan Luo
101
10
0
30 Apr 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
138
2
0
30 Apr 2024
Reinforcement Learning Driven Cooperative Ball Balance in Rigidly Coupled Drones
Shraddha Barawkar
Nikhil Chopra
35
0
0
29 Apr 2024
MRIC: Model-Based Reinforcement-Imitation Learning with Mixture-of-Codebooks for Autonomous Driving Simulation
Baotian He
Yibing Li
126
1
0
29 Apr 2024
From Persona to Personalization: A Survey on Role-Playing Language Agents
Jiangjie Chen
Xintao Wang
Rui Xu
Siyu Yuan
Yikai Zhang
...
Caiyu Hu
Siye Wu
Scott Ren
Ziquan Fu
Yanghua Xiao
145
98
0
28 Apr 2024
Knowledge Transfer for Cross-Domain Reinforcement Learning: A Systematic Review
Sergio A. Serrano
J. Martínez-Carranza
L. Sucar
97
1
0
26 Apr 2024
Generalize by Touching: Tactile Ensemble Skill Transfer for Robotic Furniture Assembly
Hao-ming Lin
Radu Corcodel
Ding Zhao
116
7
0
26 Apr 2024
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Stephen Zhao
Rob Brekelmans
Alireza Makhzani
Roger C. Grosse
95
41
0
26 Apr 2024
Part-Guided 3D RL for Sim2Real Articulated Object Manipulation
Pengwei Xie
Rui Chen
Siang Chen
Yuzhe Qin
Fanbo Xiang
Tianyu Sun
Jing Xu
Guijin Wang
Haoran Su
105
12
0
26 Apr 2024
IDIL: Imitation Learning of Intent-Driven Expert Behavior
Sangwon Seo
Vaibhav Unhelkar
58
3
0
25 Apr 2024
DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks
Tongzhou Mu
Minghua Liu
Hao Su
OffRL
99
4
0
25 Apr 2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
Zhaolin Gao
Jonathan D. Chang
Wenhao Zhan
Owen Oertell
Gokul Swamy
Kianté Brantley
Thorsten Joachims
J. Andrew Bagnell
Jason D. Lee
Wen Sun
OffRL
87
41
0
25 Apr 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
105
0
0
25 Apr 2024
RUMOR: Reinforcement learning for Understanding a Model of the Real World for Navigation in Dynamic Environments
Diego Martínez Baselga
L. Riazuelo
Luis Montano
169
1
0
25 Apr 2024
AFU: Actor-Free critic Updates in off-policy RL for continuous control
Nicolas Perrin-Gilbert
OffRL
108
0
0
24 Apr 2024
GRSN: Gated Recurrent Spiking Neurons for POMDPs and MARL
Lang Qin
Ziming Wang
Runhao Jiang
Rui Yan
Huajin Tang
75
1
0
24 Apr 2024
MultiSTOP: Solving Functional Equations with Reinforcement Learning
Alessandro Trenta
Davide Bacciu
Andrea Cossu
Pietro Ferrero
122
0
0
23 Apr 2024
Evolutionary Reinforcement Learning via Cooperative Coevolution
Chengpeng Hu
Jialin Liu
Xinghu Yao
136
0
0
23 Apr 2024
Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang
Davin Tjia
Jacob Berg
Dima Damen
Pulkit Agrawal
Abhishek Gupta
OffRL
76
5
0
23 Apr 2024
Beyond the Edge: An Advanced Exploration of Reinforcement Learning for Mobile Edge Computing, its Applications, and Future Research Trajectories
Ning Yang
Shuo Chen
Haijun Zhang
Randall Berry
OffRL
106
9
0
22 Apr 2024
Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras
Mhairi Dunion
Stefano V. Albrecht
99
5
0
22 Apr 2024
Explicit Lipschitz Value Estimation Enhances Policy Robustness Against Perturbation
Xulin Chen
Ruipeng Liu
Garret E. Katz
76
0
0
22 Apr 2024
Adaptive Social Force Window Planner with Reinforcement Learning
Mauro Martini
Noé Pérez-Higueras
Andrea Ostuni
Marcello Chiaberge
F. Caballero
L. Merino
83
3
0
21 Apr 2024
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
Utsav Singh
Wesley A Suttle
Brian M Sadler
Vinay P. Namboodiri
Amrit Singh Bedi
75
5
0
20 Apr 2024
Decentralized Coordination of Distributed Energy Resources through Local Energy Markets and Deep Reinforcement Learning
Daniel May
Matthew E. Taylor
Petr Musílek
49
1
0
19 Apr 2024
Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
Qiang He
Dinesh Manocha
Meng Fang
S. Maghsudi
93
3
0
19 Apr 2024
Learning to Cut via Hierarchical Sequence/Set Model for Efficient Mixed-Integer Programming
Jie Wang
Zhihai Wang
Xijun Li
Yufei Kuang
Zhihao Shi
Fangzhou Zhu
Mingxuan Yuan
Jianguo Zeng
Yongdong Zhang
Feng Wu
83
8
0
19 Apr 2024
TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents
Chen Gong
Kecen Li
Jin Yao
Tianhao Wang
OnRL
75
1
0
18 Apr 2024
ASID: Active Exploration for System Identification in Robotic Manipulation
Marius Memmel
Andrew Wagenmaker
Chuning Zhu
Patrick Yin
Dieter Fox
Abhishek Gupta
142
15
0
18 Apr 2024
S4TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles
Xiao Wang
Ke Tang
Xingyuan Dai
Jintao Xu
Quancheng Du
Rui Ai
Yuxiao Wang
Weihao Gu
99
3
0
18 Apr 2024
Actor-Critic Reinforcement Learning with Phased Actor
Ruofan Wu
Junmin Zhong
Jennie Si
43
0
0
18 Apr 2024
Function Approximation for Reinforcement Learning Controller for Energy from Spread Waves
Soumyendu Sarkar
Vineet Gundecha
Sahand Ghorbanpour
Alexander Shmakov
Ashwin Ramesh Babu
Avisek Naug
Alexandre Frederic Julien Pichard
Mathieu Cocho
71
8
0
17 Apr 2024
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Shusheng Xu
Wei Fu
Jiaxuan Gao
Wenjie Ye
Weiling Liu
Zhiyu Mei
Guangju Wang
Chao Yu
Yi Wu
174
165
0
16 Apr 2024
Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay
Jinmei Liu
Wenbin Li
Xiangyu Yue
Shilin Zhang
Chunlin Chen
Zhi Wang
OffRL
DiffM
88
6
0
16 Apr 2024
Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms
Zehao Zhou
OffRL
38
0
0
16 Apr 2024