ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Sustainable Diffusion-based Incentive Mechanism for Generative AI-driven Digital Twins in Industrial Cyber-Physical Systems
Jinbo Wen
Jiawen Kang
Dusit Niyato
Yang Zhang
Shiwen Mao
41
7
0
02 Aug 2024
A Survey on Self-play Methods in Reinforcement Learning
Chao Yu
Zelai Xu
Chengdong Ma
Chao Yu
Weijuan Tu
...
Deheng Ye
Wenbo Ding
Yaodong Yang
Yu Wang
SyDa SSL OnRL
189
9
0
02 Aug 2024
MuJoCo MPC for Humanoid Control: Evaluation on HumanoidBench
Moritz Meser
Aditya Bhatt
Boris Belousov
Jan Peters
96
2
0
01 Aug 2024
Discretizing Continuous Action Space with Unimodal Probability Distributions for On-Policy Reinforcement Learning
Yuanyang Zhu
Zhi Wang
Yuanheng Zhu
Chunlin Chen
Dongbin Zhao
138
0
0
01 Aug 2024
A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence
Mingyang Liu
Gabriele Farina
Asuman Ozdaglar
83
3
0
01 Aug 2024
On the Perturbed States for Transformed Input-robust Reinforcement Learning
Tung M. Luu
Haeyong Kang
Matthew Groh
Thanh Nguyen
Chang D. Yoo
OOD AAML OffRL
71
0
0
31 Jul 2024
Learning Stable Robot Grasping with Transformer-based Tactile Control Policies
En Yen Puang
Zechen Li
Chee Meng Chew
Shan Luo
Yan Wu
59
1
0
30 Jul 2024
How to Choose a Reinforcement-Learning Algorithm
Fabian Bongratz
Vladimir Golkov
Lukas Mautner
Luca Della Libera
Frederik Heetmeyer
Felix Czaja
Julian Rodemann
Daniel Cremers
80
1
0
30 Jul 2024
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla
Ananye Agarwal
Deepak Pathak
OffRL OnRL
88
5
0
29 Jul 2024
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning
Liyuan Mao
Haoran Xu
Weinan Zhang
Xianyuan Zhan
Amy Zhang
OffRL
116
19
0
29 Jul 2024
NAVIX: Scaling MiniGrid Environments with JAX
Eduardo Pignatelli
Jarek Liesen
R. T. Lange
Chris Xiaoxuan Lu
Pablo Samuel Castro
Laura Toni
147
4
0
28 Jul 2024
Reinforcement Learning for Sustainable Energy: A Survey
Koen Ponse
Felix Kleuker
Márton Fejér
Álvaro Serra-Gómez
Aske Plaat
Thomas M. Moerland
OffRL AI4CE
103
2
0
26 Jul 2024
PianoMime: Learning a Generalist, Dexterous Piano Player from Internet Demonstrations
Cheng Qian
Julen Urain
Kevin Zakka
Jan Peters
64
5
0
25 Jul 2024
Path Following and Stabilisation of a Bicycle Model using a Reinforcement Learning Approach
Sebastian Weyrer
Peter Manzl
A. L. Schwab
Johannes Gerstmayr
38
0
0
24 Jul 2024
Gymnasium: A Standard Interface for Reinforcement Learning Environments
Mark Towers
Ariel Kwiatkowski
Jordan Terry
John U. Balis
Gianluca De Cola
...
Andrea Pierré
Sander Schulhoff
Jun Jet Tai
Hannah Tan
Omar G. Younis
AuLLM OffRL
102
216
0
24 Jul 2024
Functional Acceleration for Policy Mirror Descent
Veronica Chelu
Doina Precup
115
0
0
23 Jul 2024
WayEx: Waypoint Exploration using a Single Demonstration
Mara Levy
Nirat Saini
Abhinav Shrivastava
90
1
0
22 Jul 2024
Exterior Penalty Policy Optimization with Penalty Metric Network under Constraints
Shiqing Gao
Jiaxin Ding
Luoyi Fu
Xinbing Wang
Cheng Zhou
63
0
0
22 Jul 2024
Temporal Abstraction in Reinforcement Learning with Offline Data
Ranga Shaarad Ayyagari
Anurita Ghosh
Ambedkar Dukkipati
OffRL
55
0
0
21 Jul 2024
Rocket Landing Control with Random Annealing Jump Start Reinforcement Learning
Yuxuan Jiang
Yujie Yang
Zhiqian Lan
Guojian Zhan
Shengbo Eben Li
Qi Sun
Jian Ma
Tianwen Yu
Changwu Zhang
58
1
0
21 Jul 2024
Explainable Post hoc Portfolio Management Financial Policy of a Deep Reinforcement Learning agent
Alejandra de la Rica Escudero
E.C. Garrido-Merchán
Maria Coronado Vaca
AIFin
97
3
0
19 Jul 2024
Model-based Policy Optimization using Symbolic World Model
Andrey Gorodetskiy
Konstantin Mironov
Aleksandr I. Panov
78
0
0
18 Jul 2024
LIMT: Language-Informed Multi-Task Visual World Models
Elie Aljalbout
Nikolaos Sotirakis
Patrick van der Smagt
Maximilian Karl
Nutan Chen
125
5
0
18 Jul 2024
DeepClair: Utilizing Market Forecasts for Effective Portfolio Selection
Donghee Choi
Jinkyu Kim
Mogan Gim
Jinho Lee
Jaewoo Kang
92
0
0
18 Jul 2024
Deterministic Trajectory Optimization through Probabilistic Optimal Control
Mohammad Mahmoudi Filabadi
Tom Lefebvre
Guillaume Crevecoeur
29
0
0
18 Jul 2024
PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods
WooJae Jeon
KanJun Lee
Jeewoo Lee
OffRL
52
0
0
18 Jul 2024
On Causally Disentangled State Representation Learning for Reinforcement Learning based Recommender Systems
Siyu Wang
Xiaocong Chen
Lina Yao
CML
71
0
0
18 Jul 2024
ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems
Yi Zhang
Ruihong Qiu
Jiajun Liu
Sen Wang
OffRL
110
1
0
18 Jul 2024
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali
Zhang-Wei Hong
Ayush Sekhari
Alexander Rakhlin
Pulkit Agrawal
268
3
0
18 Jul 2024
Estimating Reaction Barriers with Deep Reinforcement Learning
Adittya Pal
65
0
0
17 Jul 2024
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Xu-Hui Liu
Tian-Shuo Liu
Shengyi Jiang
Ruifeng Chen
Zhilong Zhang
Xinwei Chen
Yang Yu
OffRLOnRL
90
3
0
17 Jul 2024
Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam
Saurabh Kumar
Ramki Gummadi
Benjamin Van Roy
71
1
0
16 Jul 2024
Exciting Action: Investigating Efficient Exploration for Learning Musculoskeletal Humanoid Locomotion
Henri-Jacques Geiss
Firas Al-Hafez
Andre Seyfarth
Jan Peters
Davide Tateo
66
2
0
16 Jul 2024
RobotKeyframing: Learning Locomotion with High-Level Objectives via Mixture of Dense and Sparse Rewards
Fatemeh Zargarbashi
Jin Cheng
Dongho Kang
Robert Sumner
Stelian Coros
181
9
0
16 Jul 2024
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion
Yongyuan Liang
Tingqiang Xu
Kaizhe Hu
Guangqi Jiang
Furong Huang
Huazhe Xu
VGen LM&Ro DiffM
109
4
0
15 Jul 2024
Ontology-driven Reinforcement Learning for Personalized Student Support
Ryan Hare
Ying Tang
49
1
0
14 Jul 2024
Preserving the Privacy of Reward Functions in MDPs through Deception
Shashank Reddy Chirra
Pradeep Varakantham
P. Paruchuri
75
0
0
13 Jul 2024
A Benchmark Environment for Offline Reinforcement Learning in Racing Games
Girolamo Macaluso
Alessandro Sestini
Andrew D. Bagdanov
OffRL
76
1
0
12 Jul 2024
HACMan++: Spatially-Grounded Motion Primitives for Manipulation
Bowen Jiang
Yilin Wu
Wenxuan Zhou
Chris Paxton
David Held
77
2
0
11 Jul 2024
TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations
Junik Bae
Kwanyoung Park
Youngwoon Lee
92
3
0
11 Jul 2024
Gradient Boosting Reinforcement Learning
Benjamin Fuhrer
Chen Tessler
Gal Dalal
OffRL AI4CE
188
3
0
11 Jul 2024
RoboMorph: Evolving Robot Morphology using Large Language Models
Kevin Qiu
Krzysztof Ciebiera
Marek Cygan
Łukasz Kuciński
LM&Ro
165
1
0
11 Jul 2024
Intercepting Unauthorized Aerial Robots in Controlled Airspace Using Reinforcement Learning
Francisco Giral
Ignacio Gómez
S. L. Clainche
72
0
0
09 Jul 2024
Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning
Augustine N. Mavor-Parker
Matthew J. Sargent
Caswell Barry
Lewis D. Griffin
Clare Lyle
118
2
0
09 Jul 2024
Can Learned Optimization Make Reinforcement Learning Less Difficult?
Alexander David Goldie
Chris Xiaoxuan Lu
Matthew Jackson
Shimon Whiteson
Jakob N. Foerster
145
5
0
09 Jul 2024
Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation
Detian Chu
Linyuan Bai
Jianuo Huang
Zhenlong Fang
Peng Zhang
Wei Kang
Haifeng Lin
155
3
0
08 Jul 2024
Generalizing soft actor-critic algorithms to discrete action spaces
Le Zhang
Yong Gu
Xin Zhao
Yanshuo Zhang
Shu Zhao
Yifei Jin
Xinxin Wu
95
0
0
08 Jul 2024
A Novel Bifurcation Method for Observation Perturbation Attacks on Reinforcement Learning Agents: Load Altering Attacks on a Cyber Physical Power System
Kiernan Broda-Milian
Ranwa Al-Mallah
H. Dagdougui
AAML
70
0
0
06 Jul 2024
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao
Yucheng Xin
Silang Wu
Longxiang He
Zichen Yan
Junbo Tan
Xueqian Wang
OffRL
148
1
0
06 Jul 2024
Augmented Bayesian Policy Search
Mahdi Kallel
Debabrota Basu
R. Akrour
Carlo D'Eramo
89
3
0
05 Jul 2024