ResearchTrend.AI

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,128 papers shown
Title
Independent Component Alignment for Multi-Task Learning
Dmitry Senushkin
Nikolay Patakin
Arseny Kuznetsov
Anton Konushin
CVBM
102
46
0
30 May 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Zhihan Liu
Miao Lu
Wei Xiong
Han Zhong
Haotian Hu
Shenao Zhang
Sirui Zheng
Zhuoran Yang
Zhaoran Wang
OffRL
124
22
0
29 May 2023
RLAD: Reinforcement Learning from Pixels for Autonomous Driving in Urban Environments
Daniel Coelho
Miguel Oliveira
Vítor M. F. Santos
50
4
0
29 May 2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
Haoran He
Chenjia Bai
Kang Xu
Zhuoran Yang
Weinan Zhang
Dong Wang
Bingyan Zhao
Xuelong Li
DiffM OffRL
101
98
0
29 May 2023
Continual Task Allocation in Meta-Policy Network via Sparse Prompting
Yijun Yang
Tianyi Zhou
Jing Jiang
Guodong Long
Yuhui Shi
CLL OffRL
95
9
0
29 May 2023
Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse
Jiafei Lyu
Le Wan
Zongqing Lu
Xiu Li
OffRL
68
9
0
29 May 2023
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach
Yudi Zhang
Yali Du
Erdun Gao
Ziyan Wang
Jun Wang
Meng Fang
Mykola Pechenizkiy
CML
109
18
0
28 May 2023
Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms
Peiyao Xiao
Hao Ban
Kaiyi Ji
130
21
0
28 May 2023
Cross-Domain Policy Adaptation via Value-Guided Data Filtering
Kang Xu
Chenjia Bai
Xiaoteng Ma
Dong Wang
Bingyan Zhao
Zhen Wang
Xuelong Li
Wei Li
98
18
0
28 May 2023
On the Value of Myopic Behavior in Policy Reuse
Kang Xu
Chenjia Bai
Shuang Qiu
Haoran He
Bin Zhao
Zhen Wang
Wei Li
Xuelong Li
89
1
0
28 May 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
Xiao Hu
Jianxiong Li
Xianyuan Zhan
Qing-Shan Jia
Ya Zhang
104
9
0
27 May 2023
Self-Supervised Reinforcement Learning that Transfers using Random Features
Boyuan Chen
Chuning Zhu
Pulkit Agrawal
Jianchao Tan
Abhishek Gupta
OffRL SSL
88
9
0
26 May 2023
A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem
Paul Barde
Jakob N. Foerster
Derek Nowrouzezahrai
Amy Zhang
OffRL
73
12
0
26 May 2023
Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets
Dinghuai Zhang
H. Dai
Nikolay Malkin
Aaron Courville
Yoshua Bengio
L. Pan
141
37
0
26 May 2023
Adaptive PD Control using Deep Reinforcement Learning for Local-Remote Teleoperation with Stochastic Time Delays
Lucy McCutcheon
Saber Fallah
54
0
0
26 May 2023
Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human Feedback
Tom Bewley
J. Lawry
Arthur G. Richards
80
1
0
26 May 2023
Future-conditioned Unsupervised Pretraining for Decision Transformer
Zhihui Xie
Zichuan Lin
Deheng Ye
Qiang Fu
Wei Yang
Shuai Li
OffRL OnRL
92
23
0
26 May 2023
Physics-Regulated Deep Reinforcement Learning: Invariant Embeddings
H. Cao
Y. Mao
L. Sha
Marco Caccamo
PINN AI4CE
76
6
0
26 May 2023
Distributional Reinforcement Learning with Dual Expectile-Quantile Regression
Sami Jullien
Romain Deffayet
J. Renders
Paul T. Groth
Maarten de Rijke
OOD
109
1
0
26 May 2023
Counterfactual Explainer Framework for Deep Reinforcement Learning Models Using Policy Distillation
Amir Samadi
K. Koufos
Kurt Debattista
M. Dianati
OffRL
75
3
0
25 May 2023
Reward-Machine-Guided, Self-Paced Reinforcement Learning
Cevahir Köprülü
Ufuk Topcu
86
3
0
25 May 2023
Coherent Soft Imitation Learning
Joe Watson
Sandy H. Huang
Nicholas Heess
93
12
0
25 May 2023
Sample Efficient Reinforcement Learning in Mixed Systems through Augmented Samples and Its Applications to Queueing Networks
Honghao Wei
Xin Liu
Weina Wang
Lei Ying
72
10
0
25 May 2023
Imitating Task and Motion Planning with Visuomotor Transformers
Murtaza Dalal
Ajay Mandlekar
Caelan Reed Garrett
Ankur Handa
Ruslan Salakhutdinov
Dieter Fox
163
57
0
25 May 2023
Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning
Guozheng Ma
Linrui Zhang
Haoyu Wang
Lu Li
Zilin Wang
Zhen Wang
Li Shen
Xueqian Wang
Dacheng Tao
102
13
0
25 May 2023
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya Zhang
OffRL OnRL
99
19
0
25 May 2023
Inverse Preference Learning: Preference-based RL without a Reward Function
Joey Hejna
Dorsa Sadigh
OffRL
106
56
0
24 May 2023
Successor-Predecessor Intrinsic Exploration
Changmin Yu
Neil Burgess
M. Sahani
S. Gershman
63
4
0
24 May 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
90
4
0
24 May 2023
Neural Lyapunov and Optimal Control
Daniel Layeghi
Steve Tonneau
M. Mistry
65
0
0
24 May 2023
ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry
Chris Beeler
Sriram Ganapathi Subramanian
Kyle Sprague
Nouha Chatti
C. Bellinger
...
Amanuel Dawit
Zihan Yang
Xinkai Li
Mark Crowley
Isaac Tamblyn
OffRL
77
6
0
23 May 2023
Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep Reinforcement Learning
Oswin So
Chuchu Fan
46
24
0
23 May 2023
Conditional Mutual Information for Disentangled Representations in Reinforcement Learning
Mhairi Dunion
Trevor A. McInroe
K. Luck
Josiah P. Hanna
Stefano V. Albrecht
OOD DRL
74
19
0
23 May 2023
RLBoost: Boosting Supervised Models using Deep Reinforcement Learning
Eloy Anguiano Batanero
Ángela Fernández Pascual
Á. Jiménez
OffRL
34
0
0
23 May 2023
Trend-Based SAC Beam Control Method with Zero-Shot in Superconducting Linear Accelerator
Xiaolong Chen
X. Qi
Chun-Wei Su
Yuan He
Zhi-jun Wang
...
Weilong Chen
Shuhui Liu
Xiaoying Zhao
Duanyang Jia
Man Yi
13
0
0
23 May 2023
Constrained Reinforcement Learning for Dynamic Material Handling
Chengpeng Hu
Ziming Wang
Jialin Liu
J. Wen
Bifei Mao
Xinghu Yao
117
1
0
23 May 2023
OER: Offline Experience Replay for Continual Offline Reinforcement Learning
Sibo Gai
Donglin Wang
Li He
CLL OffRL
109
3
0
23 May 2023
Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning
Sumeet Batra
Bryon Tjanaka
Matthew C. Fontaine
Aleksei Petrenko
Stefanos Nikolaidis
Gaurav Sukhatme
OffRL
103
17
0
23 May 2023
Towards Efficient Multi-Agent Learning Systems
Kailash Gogineni
Peng Wei
Tian-Shing Lan
Guru Venkataramani
85
6
0
22 May 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
Toshinori Kitamura
Tadashi Kozuno
Yunhao Tang
Nino Vieillard
Michal Valko
...
Olivier Pietquin
Matthieu Geist
Csaba Szepesvári
Wataru Kumagai
Yutaka Matsuo
OffRL
90
3
0
22 May 2023
Policy Representation via Diffusion Probability Model for Reinforcement Learning
Long Yang
Zhixiong Huang
Fenghao Lei
Yucun Zhong
Yiming Yang
Cong Fang
Shiting Wen
Binbin Zhou
Zhouchen Lin
DiffM
114
53
0
22 May 2023
Road Planning for Slums via Deep Reinforcement Learning
Y. Zheng
Hongyuan Su
Jingtao Ding
Depeng Jin
Yong Li
80
14
0
22 May 2023
Testing of Deep Reinforcement Learning Agents with Surrogate Models
Matteo Biagiola
Paolo Tonella
97
21
0
22 May 2023
TOM: Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching
Yecheng Jason Ma
K. Sivakumar
Jason Yan
Osbert Bastani
Dinesh Jayaraman
OffRL MU
82
6
0
22 May 2023
Regularization of Soft Actor-Critic Algorithms with Automatic Temperature Adjustment
Ben You
38
0
0
19 May 2023
Learning Diverse Risk Preferences in Population-based Self-play
Y. Jiang
Qihan Liu
Xiaoteng Ma
Chenghao Li
Yiqin Yang
Jun Yang
Bin Liang
Qianchuan Zhao
137
6
0
19 May 2023
Counterfactual Fairness Filter for Fair-Delay Multi-Robot Navigation
Hikaru Asano
Ryo Yonetani
Mai Nishimura
Tadashi Kozuno
81
0
0
19 May 2023
Deep Metric Tensor Regularized Policy Gradient
Gang Chen
Victoria Huang
78
0
0
18 May 2023
Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
Saptarshi Nath
Christos Peridis
Eseoghene Ben-Iwhiwhu
Xinran Liu
Shirin Dora
Cong Liu
Soheil Kolouri
Andrea Soltoggio
CLL
76
10
0
18 May 2023
Deep Reinforcement Learning-Based Control for Stomach Coverage Scanning of Wireless Capsule Endoscopy
Yameng Zhang
Long Bai
Li Liu
Hongliang Ren
Max Q.-H. Meng
68
10
0
18 May 2023