1801.01290
Cited By
v1
v2 (latest)
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,128 papers shown
Title
SACHA: Soft Actor-Critic with Heuristic-Based Attention for Partially Observable Multi-Agent Path Finding
Qiushi Lin
Hang Ma
122
19
0
05 Jul 2023
FOCUS: Object-Centric World Models for Robotics Manipulation
Stefano Ferraro
Pietro Mazzaglia
Tim Verbelen
Bart Dhoedt
OCL
LM&Ro
95
13
0
05 Jul 2023
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning
Outongyi Lv
Bingxin Zhou
OffRL
117
0
0
05 Jul 2023
First-Explore, then Exploit: Meta-Learning Intelligent Exploration
Ben Norman
Jeff Clune
58
0
0
05 Jul 2023
Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing
Qiulei Wang
Lei Yan
Gang Hu
Wenli Chen
Jean Rabault
B. R. Noack
AI4CE
59
31
0
05 Jul 2023
Is Risk-Sensitive Reinforcement Learning Properly Resolved?
Ruiwen Zhou
Minghuan Liu
Kan Ren
Xufang Luo
Weinan Zhang
Dongsheng Li
54
3
0
02 Jul 2023
Shared Growth of Graph Neural Networks via Prompted Free-direction Knowledge Distillation
Kaituo Feng
Yikun Miao
Changsheng Li
Ye Yuan
Guoren Wang
121
0
0
02 Jul 2023
RObotic MAnipulation Network (ROMAN) – Hybrid Hierarchical Learning for Solving Complex Sequential Tasks
Eleftherios Triantafyllidis
Fernando Acero
Zhaocheng Liu
Zhibin Li
100
0
0
30 Jun 2023
Resetting the Optimizer in Deep RL: An Empirical Study
Kavosh Asadi
Rasool Fakoor
Shoham Sabach
ODL
75
26
0
30 Jun 2023
Thompson sampling for improved exploration in GFlowNets
Jarrid Rector-Brooks
Kanika Madan
Moksh Jain
Maksym Korablyov
Cheng-Hao Liu
Sarath Chandar
Nikolay Malkin
Yoshua Bengio
93
30
0
30 Jun 2023
Human-like Decision-making at Unsignalized Intersection using Social Value Orientation
Yan Tong
Licheng Wen
Pinlong Cai
Daocheng Fu
Song Mao
Yikang Li
102
2
0
30 Jun 2023
Probabilistic Constraint for Safety-Critical Reinforcement Learning
Weiqin Chen
D. Subramanian
Santiago Paternain
87
15
0
29 Jun 2023
Identifying Important Sensory Feedback for Learning Locomotion Skills
Wanming Yu
Chuanyu Yang
C. McGreavy
Eleftherios Triantafyllidis
Guillaume Bellegarda
M. Shafiee
A. Ijspeert
Zhibin Li
85
16
0
29 Jun 2023
Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
Arvi Jonnarth
Jie Zhao
Michael Felsberg
110
8
0
29 Jun 2023
Eigensubspace of Temporal-Difference Dynamics and How It Improves Value Approximation in Reinforcement Learning
Qiang He
Dinesh Manocha
Meng Fang
S. Maghsudi
76
5
0
29 Jun 2023
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms
Anthony G. Francis
Claudia Pérez-D'Arpino
Chengshu Li
Fei Xia
Alexandre Alahi
...
Xuesu Xiao
Peng Xu
Naoki Yokoyama
Alexander Toshev
Roberto Martin-Martin
122
78
0
29 Jun 2023
SARC: Soft Actor Retrospective Critic
Sukriti Verma
Ayush Chopra
J. Subramanian
Mausoom Sarkar
Nikaash Puri
Piyush B. Gupta
Balaji Krishnamurthy
48
0
0
28 Jun 2023
MRHER: Model-based Relay Hindsight Experience Replay for Sequential Object Manipulation Tasks with Sparse Rewards
Yuming Huang
Bin Ren
Ziming Xu
Lianghong Wu
OffRL
70
0
0
28 Jun 2023
RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$
Abhinav Bhatia
Samer B. Nashed
S. Zilberstein
OffRL
107
0
0
28 Jun 2023
Diversity is Strength: Mastering Football Full Game with Interactive Reinforcement Learning of Multiple AIs
Chenglu Sun
Shuo Shen
Sijia Xu
Weidong Zhang
52
1
0
28 Jun 2023
What Went Wrong? Closing the Sim-to-Real Gap via Differentiable Causal Discovery
Peide Huang
Xilun Zhang
Ziang Cao
Shiqi Liu
Mengdi Xu
Wenhao Ding
Jonathan M Francis
Bingqing Chen
Ding Zhao
121
25
0
28 Jun 2023
Automatic Truss Design with Reinforcement Learning
Weihua Du
Jinglun Zhao
Chao Yu
Xingcheng Yao
Zimeng Song
Siyang Wu
Ruifeng Luo
Zhiyuan Liu
Xianzhong Zhao
Yi Wu
OffRL
3DV
31
1
0
27 Jun 2023
Learning non-Markovian Decision-Making from State-only Sequences
Aoyang Qin
Feng Gao
Qing Li
Song-Chun Zhu
Sirui Xie
75
9
0
27 Jun 2023
Learning to Modulate pre-trained Models in RL
Thomas Schmied
M. Hofmarcher
Fabian Paischer
Razvan Pascanu
Sepp Hochreiter
CLL
OffRL
109
18
0
26 Jun 2023
Maximum State Entropy Exploration using Predecessor and Successor Representations
A. Jain
Lucas Lehnert
Irina Rish
Glen Berseth
91
16
0
26 Jun 2023
TVDO: Tchebycheff Value-Decomposition Optimization for Multi-Agent Reinforcement Learning
Xiao Hu
P. Guo
Chuanwei Zhou
Tong Zhang
Zhen Cui
60
1
0
24 Jun 2023
Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery
Xiao Zhang
Hai Zhang
Hongtu Zhou
Chang Huang
Di Zhang
Chen Ye
Junqiao Zhao
OffRL
87
5
0
24 Jun 2023
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation
Massimiliano Patacchiola
Mingfei Sun
Katja Hofmann
Richard Turner
OffRL
86
1
0
23 Jun 2023
Sum-Rate Maximization of RSMA-based Aerial Communications with Energy Harvesting: A Reinforcement Learning Approach
Jaehyup Seong
Mesut Toka
W. Shin
13
5
0
22 Jun 2023
MP3: Movement Primitive-Based (Re-)Planning Policy
Fabian Otto
Hongyi Zhou
Onur Celik
Ge Li
Rudolf Lioutikov
Gerhard Neumann
86
5
0
22 Jun 2023
SoftGPT: Learn Goal-oriented Soft Object Manipulation Skills by Generative Pre-trained Heterogeneous Graph Transformer
Junjia Liu
Zhihao Li
Wanyu Lin
Sylvain Calinon
Kay Chen Tan
Fei Chen
94
9
0
22 Jun 2023
One Policy to Dress Them All: Learning to Dress People with Diverse Poses and Garments
Yufei Wang
Zhanyi Sun
Zackory M. Erickson
David Held
91
26
0
21 Jun 2023
Optimistic Active Exploration of Dynamical Systems
Bhavya Sukhija
Lenart Treven
Cansu Sancaktar
Sebastian Blaes
Stelian Coros
Andreas Krause
124
18
0
21 Jun 2023
AdCraft: An Advanced Reinforcement Learning Benchmark Environment for Search Engine Marketing Optimization
Maziar Gomrokchi
Owen Levin
Jeffrey Roach
Jonah White
OffRL
90
1
0
21 Jun 2023
Efficient Dynamics Modeling in Interactive Environments with Koopman Theory
Arnab Kumar Mondal
Siba Smarak Panigrahi
Sai Rajeswar
K. Siddiqi
Siamak Ravanbakhsh
99
3
0
20 Jun 2023
Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback
Hang Wang
Sen Lin
Junshan Zhang
77
19
0
20 Jun 2023
Learning to Generate Better Than Your LLM
Jonathan D. Chang
Kianté Brantley
Rajkumar Ramamurthy
Dipendra Kumar Misra
Wen Sun
80
49
0
20 Jun 2023
Evolutionary Strategy Guided Reinforcement Learning via MultiBuffer Communication
Adam Callaghan
Karl Mason
Patrick Mannion
71
2
0
20 Jun 2023
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
72
2
0
20 Jun 2023
Multi-user Reset Controller for Redirected Walking Using Reinforcement Learning
Ho Jung Lee
Sang-Bin Jeon
Yong-Hun Cho
In-Kwon Lee
16
2
0
20 Jun 2023
Autonomous Driving with Deep Reinforcement Learning in CARLA Simulation
Jumman Hossain
59
7
0
20 Jun 2023
AdaStop: adaptive statistical testing for sound comparisons of Deep RL agents
Timothée Mathieu
R. D. Vecchia
Alena Shilova
M. Centa
Hector Kohler
Odalric-Ambrym Maillard
Philippe Preux
51
0
0
19 Jun 2023
Collaborative Optimization of Multi-microgrids System with Shared Energy Storage Based on Multi-agent Stochastic Game and Reinforcement Learning
Yijia Wang
Yangliu Cui
Yang Li
Yang Xu
42
29
0
19 Jun 2023
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning
Hojoon Lee
Hanseul Cho
Hyunseung Kim
Daehoon Gwak
Joonkee Kim
Jaegul Choo
Se-Young Yun
Chulhee Yun
OffRL
157
30
0
19 Jun 2023
Variational Sequential Optimal Experimental Design using Reinforcement Learning
Wanggang Shen
Jiayuan Dong
Xun Huan
66
3
0
17 Jun 2023
FP-IRL: Fokker-Planck-based Inverse Reinforcement Learning -- A Physics-Constrained Approach to Markov Decision Processes
Chengyang Huang
Siddharth Srivastava
Xun Huan
K. Garikipati
32
0
0
17 Jun 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Nishil Patel
Sebastian Lee
Stefano Sarao Mannelli
Sebastian Goldt
Andrew Saxe
OffRL
141
4
0
17 Jun 2023
Active Policy Improvement from Multiple Black-box Oracles
Xuefeng Liu
Takuma Yoneda
Chaoqi Wang
Matthew R. Walter
Yuxin Chen
128
10
0
17 Jun 2023
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
Yunfan Li
Yiran Wang
Y. Cheng
Lin F. Yang
OffRL
104
4
0
15 Jun 2023
Residual Q-Learning: Offline and Online Policy Customization without Value
Chenran Li
Chen Tang
Haruki Nishimura
Jean Mercat
Masayoshi Tomizuka
Wei Zhan
OffRL
102
7
0
15 Jun 2023