1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,130 papers shown
Title
Residual Policy Learning for Powertrain Control
Lindsey Kerbel
B. Ayalew
Andrej Ivanco
K. Loiselle
43
4
0
15 Dec 2022
Cross-Domain Transfer via Semantic Skill Imitation
Karl Pertsch
Ruta Desai
Vikash Kumar
Franziska Meier
Joseph J. Lim
Dhruv Batra
Akshara Rai
LM&Ro
73
19
0
14 Dec 2022
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems
Tobias Enders
James Harrison
Marco Pavone
Maximilian Schiffer
75
25
0
14 Dec 2022
Reinforcement Learning in System Identification
J. Antonio Martin H.
Oscar Fernández
Sergio Pérez
Anas Belfadil
C. Ibáñez-Llano
Freddy José Perozo
Javier Valle
Javier Arechalde Pelaz
54
0
0
14 Dec 2022
Efficient Exploration in Resource-Restricted Reinforcement Learning
Zhihai Wang
Taoxing Pan
Qi Zhou
Jie Wang
OffRL
62
12
0
14 Dec 2022
Learning Robotic Navigation from Experience: Principles, Methods, and Recent Results
Sergey Levine
Dhruv Shah
SSL
95
23
0
13 Dec 2022
Model-Free Approach to Fair Solar PV Curtailment Using Reinforcement Learning
Zhuo Wei
F. D. Nijs
Jinhao Li
Hao Wang
33
9
0
13 Dec 2022
Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks
Altun Rzayev
Vahid Tavakol Aghaei
OffRL
76
2
0
11 Dec 2022
On the Sensitivity of Reward Inference to Misspecified Human Models
Joey Hong
Kush S. Bhatia
Anca Dragan
66
26
0
09 Dec 2022
Reinforcement Learning for Predicting Traffic Accidents
I. Cho
Praveenbalaji Rajendran
Taeyoung Kim
Dongsoo Har
43
6
0
09 Dec 2022
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
88
7
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
77
1
0
07 Dec 2022
Few-Shot Preference Learning for Human-in-the-Loop RL
Joey Hejna
Dorsa Sadigh
OffRL
117
101
0
06 Dec 2022
Dynamic Decision Frequency with Continuous Options
Amir-Hossein Karimi
Jun Jin
Jun Luo
A. R. Mahmood
Martin Jägersand
Samuele Tosatto
104
10
0
06 Dec 2022
ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
Kai Hsu
D. Nguyen
J. F. Fisac
87
29
0
06 Dec 2022
State Space Closure: Revisiting Endless Online Level Generation via Reinforcement Learning
Ziqi Wang
Tianye Shu
Jialin Liu
OffRL
61
1
0
06 Dec 2022
Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots
Shamil Mamedov
Rudolf Reiter
Seyed Mahdi Basiri Azad
Joschka Boedecker
Moritz Diehl
Jan Swevers
82
2
0
06 Dec 2022
PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement
Wanqi Xue
Qingpeng Cai
Zhenghai Xue
Shuo Sun
Shuchang Liu
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
62
28
0
06 Dec 2022
Cooperative control of environmental extremes by artificial intelligent agents
Martí Sánchez-Fibla
Clément Moulin-Frier
Ricard Solé
AI4CE
64
2
0
05 Dec 2022
Differentiated Federated Reinforcement Learning Based Traffic Offloading on Space-Air-Ground Integrated Networks
Yeguang Qin
Yilin Yang
Fengxiao Tang
Xin Yao
Mingde Zhao
Nei Kato
69
6
0
05 Dec 2022
A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat
Jiajun Chai
Wenzhang Chen
Yuanheng Zhu
Zonggui Yao
Dongbin Zhao
BDL
63
38
0
05 Dec 2022
Resilience Evaluation of Entropy Regularized Logistic Networks with Probabilistic Cost
Koshi Oishi
Yota Hashizume
Tomohiko Jimbo
Hirotaka Kaji
Kenji Kashima
53
2
0
05 Dec 2022
Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo
Jingyue Gao
Zheng Wu
Chengming Shi
Jianyu Chen
OffRL
90
5
0
03 Dec 2022
A Bayesian Framework for Digital Twin-Based Control, Monitoring, and Data Collection in Wireless Systems
Clement Ruah
Osvaldo Simeone
Bashir M. Al-Hashimi
220
29
0
02 Dec 2022
Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud Computing through Reinforcement Learning
Kaustubh Sridhar
Vikram Singh
Balakrishnan Narayanaswamy
Abishek Sankararaman
68
1
0
02 Dec 2022
Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning
Jacob Adamczyk
A. Arriojas
Stas Tiomkin
R. Kulkarni
83
11
0
02 Dec 2022
STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning
Nikhil Kumar Singh
Indranil Saha
55
6
0
02 Dec 2022
Launchpad: Learning to Schedule Using Offline and Online RL Methods
V. Venkataswamy
J. E. Grigsby
A. Grimshaw
Yanjun Qi
OffRL
OnRL
75
1
0
01 Dec 2022
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
Marc Rigter
Bruno Lacerda
Nick Hawes
OffRL
98
7
0
30 Nov 2022
Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions
Yikun Cheng
Pan Zhao
N. Hovakimyan
OffRL
61
12
0
30 Nov 2022
Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration
Srivatsan Krishnan
Natasha Jaques
Shayegan Omidshafiei
Dan Zhang
Izzeddin Gur
Vijay Janapa Reddi
Aleksandra Faust
84
2
0
29 Nov 2022
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison
H. Flynn
David Reeb
M. Kandemir
Jan Peters
OffRL
85
7
0
29 Nov 2022
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning
Guoxi Zhang
H. Kashima
OffRL
88
2
0
29 Nov 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu Wang
William Yang Wang
OffRL
112
17
0
29 Nov 2022
The Effectiveness of World Models for Continual Reinforcement Learning
Samuel Kessler
M. Ostaszewski
Michal Bortkiewicz
M. Żarski
Maciej Wołczyk
Jack Parker-Holder
Stephen J. Roberts
Piotr Miłoś
KELM
OffRL
CLL
92
8
0
29 Nov 2022
CLAS: Coordinating Multi-Robot Manipulation with Central Latent Action Spaces
Elie Aljalbout
Maximilian Karl
Patrick van der Smagt
75
5
0
28 Nov 2022
Is Conditional Generative Modeling all you need for Decision-Making?
Anurag Ajay
Yilun Du
Abhi Gupta
J. Tenenbaum
Tommi Jaakkola
Pulkit Agrawal
DiffM
170
409
0
28 Nov 2022
Tackling Visual Control via Multi-View Exploration Maximization
Mingqi Yuan
Xin Jin
Bo Li
Wenjun Zeng
67
1
0
28 Nov 2022
Continuous Episodic Control
Zhao Yang
Thomas M. Moerland
Mike Preuss
Aske Plaat
OffRL
75
3
0
28 Nov 2022
Reinforcement Learning from Simulation to Real World Autonomous Driving using Digital Twin
Kevin Voogd
Jean Pierre Allamaa
Javier Alonso-Mora
Tong Duy Son
64
14
0
27 Nov 2022
Domain Generalization for Robust Model-Based Offline Reinforcement Learning
Alan Clark
Shoaib Ahmed Siddiqui
Robert Kirk
Usman Anwar
Stephen Chung
David M. Krueger
OOD
OffRL
72
0
0
27 Nov 2022
BEAR: Physics-Principled Building Environment for Control and Reinforcement Learning
Chi Zhang
Yu Shi
Yize Chen
25
7
0
27 Nov 2022
RL-Based Guidance in Outpatient Hysteroscopy Training: A Feasibility Study
V. Poliakov
K. Niu
E. V. Poorten
Dzmitry Tsetserukou
OnRL
38
0
0
26 Nov 2022
Transfer RL via the Undo Maps Formalism
Abhi Gupta
Theodore H. Moskovitz
David Alvarez-Melis
Aldo Pacchiano
OffRL
69
0
0
26 Nov 2022
Choreographer: Learning and Adapting Skills in Imagination
Pietro Mazzaglia
Tim Verbelen
Bart Dhoedt
Alexandre Lacoste
Sai Rajeswar
128
25
0
23 Nov 2022
Actively Learning Costly Reward Functions for Reinforcement Learning
André Eberhard
Houssam Metni
G. Fahland
A. Stroh
Pascal Friederich
OffRL
113
0
0
23 Nov 2022
Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Tingting Zhao
Ying Wang
Weidong Sun
Yarui Chen
Gang Niu
Masashi Sugiyama
70
1
0
23 Nov 2022
Reinforcement learning for traffic signal control in hybrid action space
Haoqing Luo
Sheng Jin
82
7
0
23 Nov 2022
Efficient Exploration using Model-Based Quality-Diversity with Gradients
Bryan Lim
Manon Flageat
Antoine Cully
66
4
0
22 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
94
14
0
21 Nov 2022