Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Residual Policy Learning for Powertrain Control
Lindsey Kerbel
B. Ayalew
Andrej Ivanco
K. Loiselle
43
4
0
15 Dec 2022
Cross-Domain Transfer via Semantic Skill Imitation
Karl Pertsch
Ruta Desai
Vikash Kumar
Franziska Meier
Joseph J. Lim
Dhruv Batra
Akshara Rai
LM&Ro
73
19
0
14 Dec 2022
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems
Tobias Enders
James Harrison
Marco Pavone
Maximilian Schiffer
75
25
0
14 Dec 2022
Reinforcement Learning in System Identification
J. Antonio
Martin H Oscar Fernández
Sergio Pérez
Anas Belfadil
C. Ibáñez-Llano
Freddy José Perozo
Javier Valle
Javier Arechalde Pelaz
54
0
0
14 Dec 2022
Efficient Exploration in Resource-Restricted Reinforcement Learning
Zhihai Wang
Taoxing Pan
Qi Zhou
Jie Wang
OffRL
62
12
0
14 Dec 2022
Learning Robotic Navigation from Experience: Principles, Methods, and Recent Results
Sergey Levine
Dhruv Shah
SSL
95
23
0
13 Dec 2022
Model-Free Approach to Fair Solar PV Curtailment Using Reinforcement Learning
Zhuo Wei
F. D. Nijs
Jinhao Li
Hao Wang
33
9
0
13 Dec 2022
Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks
Altun Rzayev
Vahid Tavakol Aghaei
OffRL
76
2
0
11 Dec 2022
On the Sensitivity of Reward Inference to Misspecified Human Models
Joey Hong
Kush S. Bhatia
Anca Dragan
66
26
0
09 Dec 2022
Reinforcement Learning for Predicting Traffic Accidents
I. Cho
Praveenbalaji Rajendran
Taeyoung Kim
Dongsoo Har
43
6
0
09 Dec 2022
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
88
7
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
77
1
0
07 Dec 2022
Few-Shot Preference Learning for Human-in-the-Loop RL
Joey Hejna
Dorsa Sadigh
OffRL
117
101
0
06 Dec 2022
Dynamic Decision Frequency with Continuous Options
Amir-Hossein Karimi
Jun Jin
Jun Luo
A. R. Mahmood
Martin Jägersand
Samuele Tosatto
104
10
0
06 Dec 2022
ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
Kai Hsu
D. Nguyen
J. F. Fisac
87
29
0
06 Dec 2022
State Space Closure: Revisiting Endless Online Level Generation via Reinforcement Learning
Ziqi Wang
Tianye Shu
Jialin Liu
OffRL
61
1
0
06 Dec 2022
Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots
Shamil Mamedov
Rudolf Reiter
Seyed Mahdi Basiri Azad
Joschka Boedecker
Moritz Diehl
Jan Swevers
82
2
0
06 Dec 2022
PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement
Wanqi Xue
Qingpeng Cai
Zhenghai Xue
Shuo Sun
Shuchang Liu
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
62
28
0
06 Dec 2022
Cooperative control of environmental extremes by artificial intelligent agents
Martí Sánchez-Fibla
Clément Moulin-Frier
Ricard Solé
AI4CE
64
2
0
05 Dec 2022
Differentiated Federated Reinforcement Learning Based Traffic Offloading on Space-Air-Ground Integrated Networks
Yeguang Qin
Yilin Yang
Fengxiao Tang
Xin Yao
Mingde Zhao
Nei Kato
69
6
0
05 Dec 2022
A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat
Jiajun Chai
Wenzhang Chen
Yuanheng Zhu
Zonggui Yao
Dongbin Zhao
BDL
63
38
0
05 Dec 2022
Resilience Evaluation of Entropy Regularized Logistic Networks with Probabilistic Cost
Koshi Oishi
Yota Hashizume
Tomohiko Jimbo
Hirotaka Kaji
Kenji Kashima
53
2
0
05 Dec 2022
Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo
Jingyue Gao
Zheng Wu
Chengming Shi
Jianyu Chen
OffRL
90
5
0
03 Dec 2022
A Bayesian Framework for Digital Twin-Based Control, Monitoring, and Data Collection in Wireless Systems
Clement Ruah
Osvaldo Simeone
Bashir M. Al-Hashimi
220
29
0
02 Dec 2022
Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud Computing through Reinforcement Learning
Kaustubh Sridhar
Vikram Singh
Balakrishnan Narayanaswamy
Abishek Sankararaman
68
1
0
02 Dec 2022
Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning
Jacob Adamczyk
A. Arriojas
Stas Tiomkin
R. Kulkarni
83
11
0
02 Dec 2022
STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning
Nikhil Kumar Singh
Indranil Saha
55
6
0
02 Dec 2022
Launchpad: Learning to Schedule Using Offline and Online RL Methods
V. Venkataswamy
J. E. Grigsby
A. Grimshaw
Yanjun Qi
OffRL, OnRL
75
1
0
01 Dec 2022
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
Marc Rigter
Bruno Lacerda
Nick Hawes
OffRL
98
7
0
30 Nov 2022
Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions
Yikun Cheng
Pan Zhao
N. Hovakimyan
OffRL
61
12
0
30 Nov 2022
Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration
Srivatsan Krishnan
Natasha Jaques
Shayegan Omidshafiei
Dan Zhang
Izzeddin Gur
Vijay Janapa Reddi
Aleksandra Faust
84
2
0
29 Nov 2022
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison
H. Flynn
David Reeb
M. Kandemir
Jan Peters
OffRL
85
7
0
29 Nov 2022
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning
Guoxi Zhang
H. Kashima
OffRL
88
2
0
29 Nov 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu Wang
William Yang Wang
OffRL
112
17
0
29 Nov 2022
The Effectiveness of World Models for Continual Reinforcement Learning
Samuel Kessler
M. Ostaszewski
Michal Bortkiewicz
M. Żarski
Maciej Wołczyk
Jack Parker-Holder
Stephen J. Roberts
Piotr Miłoś
KELM, OffRL, CLL
92
8
0
29 Nov 2022
CLAS: Coordinating Multi-Robot Manipulation with Central Latent Action Spaces
Elie Aljalbout
Maximilian Karl
Patrick van der Smagt
75
5
0
28 Nov 2022
Is Conditional Generative Modeling all you need for Decision-Making?
Anurag Ajay
Yilun Du
Abhi Gupta
J. Tenenbaum
Tommi Jaakkola
Pulkit Agrawal
DiffM
170
409
0
28 Nov 2022
Tackling Visual Control via Multi-View Exploration Maximization
Mingqi Yuan
Xin Jin
Bo Li
Wenjun Zeng
67
1
0
28 Nov 2022
Continuous Episodic Control
Zhao Yang
Thomas M. Moerland
Mike Preuss
Aske Plaat
OffRL
75
3
0
28 Nov 2022
Reinforcement Learning from Simulation to Real World Autonomous Driving using Digital Twin
Kevin Voogd
Jean Pierre Allamaa
Javier Alonso-Mora
Tong Duy Son
64
14
0
27 Nov 2022
Domain Generalization for Robust Model-Based Offline Reinforcement Learning
Alan Clark
Shoaib Ahmed Siddiqui
Robert Kirk
Usman Anwar
Stephen Chung
David M. Krueger
OOD, OffRL
72
0
0
27 Nov 2022
BEAR: Physics-Principled Building Environment for Control and Reinforcement Learning
Chi Zhang
Yu Shi
Yize Chen
25
7
0
27 Nov 2022
RL-Based Guidance in Outpatient Hysteroscopy Training: A Feasibility Study
V. Poliakov
K. Niu
E. V. Poorten
Dzmitry Tsetserukou
OnRL
38
0
0
26 Nov 2022
Transfer RL via the Undo Maps Formalism
Abhi Gupta
Theodore H. Moskovitz
David Alvarez-Melis
Aldo Pacchiano
OffRL
69
0
0
26 Nov 2022
Choreographer: Learning and Adapting Skills in Imagination
Pietro Mazzaglia
Tim Verbelen
Bart Dhoedt
Alexandre Lacoste
Sai Rajeswar
128
25
0
23 Nov 2022
Actively Learning Costly Reward Functions for Reinforcement Learning
André Eberhard
Houssam Metni
G. Fahland
A. Stroh
Pascal Friederich
OffRL
113
0
0
23 Nov 2022
Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Tingting Zhao
Ying Wang
Weidong Sun
Yarui Chen
Gang Niu
Masashi Sugiyama
70
1
0
23 Nov 2022
Reinforcement learning for traffic signal control in hybrid action space
Haoqing Luo
Sheng Jin
82
7
0
23 Nov 2022
Efficient Exploration using Model-Based Quality-Diversity with Gradients
Bryan Lim
Manon Flageat
Antoine Cully
66
4
0
22 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
94
14
0
21 Nov 2022