ResearchTrend.AI

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,044 papers shown
Title
REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability
Shuang Ao
Simon Khan
Haris Aziz
Flora D. Salim
53
0
0
20 Jun 2024
Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing
Xinbo Zhao
Yingxue Zhang
Xin Zhang
Yu Yang
Yiqun Xie
Yanhua Li
Jun Luo
OffRL
45
2
0
20 Jun 2024
Equivariant Offline Reinforcement Learning
Arsh Tangri
Ondrej Biza
Dian Wang
David Klee
Owen Howell
Robert Platt
OffRL
65
3
0
20 Jun 2024
A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles
Jiaqi Liu
Shiyu Fang
Xuekai Liu
Lulu Guo
Peng Hang
Jian Sun
46
3
0
20 Jun 2024
SRL-VIC: A Variable Stiffness-Based Safe Reinforcement Learning for Contact-Rich Robotic Tasks
Heng Zhang
Gokhan Solak
G. J. G. Lahr
Arash Ajoudani
28
10
0
19 Jun 2024
Improving GFlowNets with Monte Carlo Tree Search
Nikita Morozov
D. Tiapkin
S. Samsonov
Alexey Naumov
Dmitry Vetrov
75
1
0
19 Jun 2024
Efficient Offline Reinforcement Learning: The Critic is Critical
Adam Jelley
Trevor A. McInroe
Sam Devlin
Amos Storkey
OffRL
57
1
0
19 Jun 2024
Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning
Harry Robertshaw
Lennart Karstensen
Benjamin Jackson
Alejandro Granados
Thomas C Booth
44
7
0
18 Jun 2024
Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents
Menglong Zhang
Fuyuan Qian
Quanying Liu
61
1
0
18 Jun 2024
BadSampler: Harnessing the Power of Catastrophic Forgetting to Poison Byzantine-robust Federated Learning
Yi Liu
Cong Wang
Lizhen Qu
AAML
71
3
0
18 Jun 2024
An Imitative Reinforcement Learning Framework for Autonomous Dogfight
Siyuan Li
Rongchang Zuo
Peng Liu
Yingnan Zhao
51
1
0
17 Jun 2024
Exploration by Learning Diverse Skills through Successor State Measures
Paul-Antoine Le Tolguenec
Yann Besse
Florent Teichteil-Königsbuch
Dennis G. Wilson
Emmanuel Rachelson
47
0
0
14 Jun 2024
Bridging the Communication Gap: Artificial Agents Learning Sign Language through Imitation
Federico Tavella
Aphrodite Galata
Angelo Cangelosi
39
1
0
14 Jun 2024
Deep Bayesian Active Learning for Preference Modeling in Large Language Models
Luckeciano C. Melo
P. Tigas
Alessandro Abate
Yarin Gal
61
9
0
14 Jun 2024
Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model
Siemen Herremans
Ali Anwar
Siegfried Mercelis
52
2
0
14 Jun 2024
I Know How: Combining Prior Policies to Solve New Tasks
Malio Li
Elia Piccoli
Vincenzo Lomonaco
Davide Bacciu
CLL
41
0
0
14 Jun 2024
DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning
Zeyu Gao
Yao Mu
Jinye Qu
Mengkang Hu
Lingyue Guo
Ping Luo
Shanghang Zhang
Yanfeng Lu
77
10
0
14 Jun 2024
AutomaChef: A Physics-informed Demonstration-guided Learning Framework for Granular Material Manipulation
Minglun Wei
Xintong Yang
Yu-Kun Lai
S. A. Tafrishi
Ze Ji
AI4CE
44
0
0
13 Jun 2024
DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning
Xuemin Hu
Shen Li
Yingfen Xu
Bo Tang
Long Chen
53
0
0
13 Jun 2024
CUER: Corrected Uniform Experience Replay for Off-Policy Continuous Deep Reinforcement Learning Algorithms
Arda Sarp Yenicesu
Furkan B. Mutlu
Suleyman S. Kozat
Ozgur S. Oguz
27
1
0
13 Jun 2024
A Dual Approach to Imitation Learning from Observations with Offline Datasets
Harshit S. Sikchi
Caleb Chuck
Amy Zhang
S. Niekum
OffRL
62
4
0
13 Jun 2024
BaSeNet: A Learning-based Mobile Manipulator Base Pose Sequence Planning for Pickup Tasks
Lakshadeep Naik
Sinan Kalkan
S. Sørensen
Mikkel B. Kjærgaard
Norbert Kruger
40
1
0
12 Jun 2024
Optimizing Deep Reinforcement Learning for Adaptive Robotic Arm Control
Jonaid Shianifar
Michael Schukat
Karl Mason
33
3
0
12 Jun 2024
Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei
Aidan Scannell
Joni Pajarinen
OffRL
69
1
0
12 Jun 2024
The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
Giseung Park
Woohyeon Byeon
Seongmin Kim
Elad Havakuk
Amir Leshem
Youngchul Sung
34
2
0
12 Jun 2024
Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction
Raja Farrukh Ali
Stephanie Milani
John Woods
Emmanuel Adenij
Ayesha Farooq
Clayton Mansel
Jeffrey Burns
William Hsu
38
0
0
11 Jun 2024
CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning
Zeyuan Liu
Kai Yang
Xiu Li
OffRL
64
0
0
11 Jun 2024
Hybrid Reinforcement Learning from Offline Observation Alone
Yuda Song
J. Andrew Bagnell
Aarti Singh
OffRL
92
2
0
11 Jun 2024
Semantic-Aware Spectrum Sharing in Internet of Vehicles Based on Deep Reinforcement Learning
Wenjun Zhang
Qiong Wu
Pingyi Fan
Nan Cheng
Wen Chen
Jiangzhou Wang
Khaled B. Letaief
49
22
0
11 Jun 2024
Optimal Gait Control for a Tendon-driven Soft Quadruped Robot by Model-based Reinforcement Learning
Xuezhi Niu
Kaige Tan
Lei Feng
33
0
0
11 Jun 2024
Learning Continually by Spectral Regularization
Alex Lewandowski
Saurabh Kumar
Dale Schuurmans
András Gyorgy
Marlos C. Machado
CLL
66
6
0
10 Jun 2024
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
Michelle Pan
Mariah L. Schrum
Vivek Myers
Erdem Bıyık
Anca Dragan
31
0
0
10 Jun 2024
Adaptive Opponent Policy Detection in Multi-Agent MDPs: Real-Time Strategy Switch Identification Using Running Error Estimation
Mohidul Haque Mridul
Mohammad Foysal Khan
Redwan Ahmed Rizvee
Md. Mosaddek Khan
AAML
26
0
0
10 Jun 2024
Towards Real-World Efficiency: Domain Randomization in Reinforcement Learning for Pre-Capture of Free-Floating Moving Targets by Autonomous Robots
Bahador Beigomi
Zheng H. Zhu
40
0
0
10 Jun 2024
Decoupling regularization from the action space
Sobhan Mohammadpour
Emma Frejinger
Pierre-Luc Bacon
43
0
0
10 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
Decebal Constantin Mocanu
Matthew E. Taylor
73
0
0
10 Jun 2024
ICU-Sepsis: A Benchmark MDP Built from Real Medical Data
Kartik Choudhary
Dhawal Gupta
Philip S. Thomas
OOD
VLM
33
0
0
09 Jun 2024
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Utsav Singh
Pramit Bhattacharyya
Vinay P. Namboodiri
LM&Ro
52
1
0
09 Jun 2024
Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach
Yongju Tong
Junlong Chen
Minrui Xu
Jiawen Kang
Zehui Xiong
Dusit Niyato
Chau Yuen
Zhu Han
37
3
0
08 Jun 2024
Reinforcement Learning for Intensity Control: An Application to Choice-Based Network Revenue Management
Huiling Meng
Ningyuan Chen
Xuefeng Gao
70
1
0
08 Jun 2024
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Arvi Jonnarth
Ola Johansson
Michael Felsberg
OffRL
74
1
0
07 Jun 2024
Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning
Xuehui Yu
Mhairi Dunion
Xin Li
Stefano V. Albrecht
70
2
0
07 Jun 2024
Optimization of geological carbon storage operations with multimodal latent dynamic model and deep reinforcement learning
Zhongzheng Wang
Yuntian Chen
Guodong Chen
Dongxiao Zhang
AI4CE
42
0
0
07 Jun 2024
Strategically Conservative Q-Learning
Yutaka Shimizu
Joey Hong
Sergey Levine
Masayoshi Tomizuka
OffRL
OnRL
54
0
0
06 Jun 2024
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
Qianlan Yang
Yu-Xiong Wang
OnRL
50
1
0
06 Jun 2024
Simulating, Fast and Slow: Learning Policies for Black-Box Optimization
F. V. Massoli
Tim Bakker
Thomas M. Hehn
Tribhuvanesh Orekondy
Arash Behboodi
78
0
0
06 Jun 2024
Redundancy-aware Action Spaces for Robot Learning
Pietro Mazzaglia
Nicholas Backshall
Xiao Ma
Stephen James
45
2
0
06 Jun 2024
Bootstrapping Expectiles in Reinforcement Learning
Pierre Clavier
Emmanuel Rachelson
E. L. Pennec
Matthieu Geist
OffRL
57
0
0
06 Jun 2024
AC4MPC: Actor-Critic Reinforcement Learning for Nonlinear Model Predictive Control
Rudolf Reiter
Andrea Ghezzi
Katrin Baumgärtner
Jasper Hoffmann
Robert D. McAllister
Moritz Diehl
39
7
0
06 Jun 2024
Exploring Pessimism and Optimism Dynamics in Deep Reinforcement Learning
Bahareh Tasdighi
Nicklas Werge
Yi-Shan Wu
M. Kandemir
18
0
0
06 Jun 2024