1801.01290
Cited By
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,130 papers shown
Title
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
78
6
0
28 Feb 2024
Imitation-regularized Optimal Transport on Networks: Provable Robustness and Application to Logistics Planning
Koshi Oishi
Yota Hashizume
Tomohiko Jimbo
Hirotaka Kaji
Kenji Kashima
OOD
95
2
0
28 Feb 2024
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Michael T. Matthews
Michael Beukman
Benjamin Ellis
Mikayel Samvelyan
Matthew Jackson
Samuel Coward
Jakob Foerster
OffRL
114
31
0
26 Feb 2024
Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)
Qifeng Li
Xiaosong Jia
Shaobo Wang
Junchi Yan
126
34
0
26 Feb 2024
Language-guided Skill Learning with Temporal Variational Inference
Haotian Fu
Pratyusha Sharma
Elias Stengel-Eskin
George Konidaris
Nicolas Le Roux
Marc-Alexandre Côté
Xingdi Yuan
113
8
0
26 Feb 2024
Harnessing the Synergy between Pushing, Grasping, and Throwing to Enhance Object Manipulation in Cluttered Scenarios
Hamidreza Kasaei
Mohammadreza Kasaei
81
1
0
25 Feb 2024
Discretionary Lane-Change Decision and Control via Parameterized Soft Actor-Critic for Hybrid Action Space
Yuan Lin
Xiao Liu
Zishun Zheng
60
5
0
24 Feb 2024
A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning
Shuyu Yin
Qixuan Zhou
Fei Wen
Tao Luo
83
0
0
24 Feb 2024
Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement
Ruiqi Zhang
Yuexiang Zhai
Andrea Zanette
111
0
0
24 Feb 2024
Fair Resource Allocation in Multi-Task Learning
Hao Ban
Kaiyi Ji
80
14
0
23 Feb 2024
Reinforcement Learning with Elastic Time Steps
Dong Wang
Giovanni Beltrame
102
2
0
22 Feb 2024
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Tianying Ji
Yongyuan Liang
Yan Zeng
Yu-Juan Luo
Guowei Xu
Jiawei Guo
Ruijie Zheng
Furong Huang
Gang Hua
Huazhe Xu
CML
125
12
0
22 Feb 2024
Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models
Jinyi Liu
Yifu Yuan
Jianye Hao
Fei Ni
Lingzhi Fu
Yibin Chen
Yan Zheng
LM&Ro
410
6
0
22 Feb 2024
BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
Catherine Weaver
Chen Tang
Ce Hao
Kenta Kawamoto
Masayoshi Tomizuka
Wei Zhan
OffRL
84
0
0
22 Feb 2024
Learning control strategy in soft robotics through a set of configuration spaces
Etienne Ménager
Christian Duriez
89
0
0
21 Feb 2024
The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning
Anya Sims
Cong Lu
Yee Whye Teh
OffRL
98
4
0
19 Feb 2024
In value-based deep reinforcement learning, a pruned network is a good network
J. Obando-Ceron
Rameswar Panda
Pablo Samuel Castro
OffRL
132
26
0
19 Feb 2024
Revisiting Data Augmentation in Deep Reinforcement Learning
Jianshu Hu
Yunpeng Jiang
Paul Weng
OffRL
94
6
0
19 Feb 2024
All Language Models Large and Small
Zhixun Chen
Yali Du
D. Mguni
57
0
0
19 Feb 2024
Multi Task Inverse Reinforcement Learning for Common Sense Reward
Neta Glazer
Aviv Navon
Aviv Shamsian
Ethan Fetaya
85
0
0
17 Feb 2024
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Xinyu Zhang
Wenjie Qiu
Yi-Chen Li
Lei Yuan
Chengxing Jia
Zongzhang Zhang
Yang Yu
OffRL
121
1
0
17 Feb 2024
Policy Learning for Off-Dynamics RL with Deficient Support
Linh Le Pham Van
Hung The Tran
Sunil R. Gupta
84
2
0
16 Feb 2024
Revisiting Experience Replayable Conditions
Taisuke Kobayashi
104
3
0
15 Feb 2024
Discrete Probabilistic Inference as Control in Multi-path Environments
T. Deleu
Padideh Nouri
Nikolay Malkin
Doina Precup
Yoshua Bengio
183
31
0
15 Feb 2024
Risk-Sensitive Soft Actor-Critic for Robust Deep Reinforcement Learning under Distribution Shifts
Tobias Enders
James Harrison
Maximilian Schiffer
OOD
99
5
0
15 Feb 2024
Dataset Clustering for Improved Offline Policy Learning
Qiang Wang
Yixin Deng
Francisco Roldan Sanchez
Keru Wang
Kevin McGuinness
Noel E. O'Connor
Stephen J. Redmond
OffRL
89
2
0
14 Feb 2024
Entropy-regularized Point-based Value Iteration
Harrison Delecki
Marcell Vazquez-Chanlatte
Esen Yel
K. H. Wray
Tomer Arnon
Stefan J. Witwicki
Mykel J. Kochenderfer
OOD
92
0
0
14 Feb 2024
Single-Reset Divide & Conquer Imitation Learning
Alexandre Chenu
Olivier Serris
Olivier Sigaud
Nicolas Perrin-Gilbert
69
0
0
14 Feb 2024
Hybrid Inverse Reinforcement Learning
Juntao Ren
Gokul Swamy
Zhiwei Steven Wu
J. Andrew Bagnell
Sanjiban Choudhury
88
20
0
13 Feb 2024
Evaluation of a Smart Mobile Robotic System for Industrial Plant Inspection and Supervision
Georg K.J. Fischer
M. Bergau
D. A. Gómez-Rosal
Andreas Wachaja
Johannes Grater
...
Nikhil Gosala
Niklas Wetzel
Daniel Buscher
Abhinav Valada
Wolfram Burgard
58
3
0
12 Feb 2024
SPO: Sequential Monte Carlo Policy Optimisation
Clément Bonnet
Edan Toledo
Donal Byrne
Paul Duckworth
Alexandre Laterre
80
1
0
12 Feb 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang
Owen Oertell
Alekh Agarwal
Nathan Kallus
Wen Sun
OffRL
128
12
0
11 Feb 2024
Deceptive Path Planning via Reinforcement Learning with Graph Neural Networks
Michael Y. Fatemi
Wesley A Suttle
Brian M Sadler
OffRL
57
4
0
09 Feb 2024
Hierarchical Transformers are Efficient Meta-Reinforcement Learners
Gresa Shala
André Biedenkapp
Josif Grabocka
OffRL
114
4
0
09 Feb 2024
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
Muning Wen
Junwei Liao
Cheng Deng
Jun Wang
Weinan Zhang
Ying Wen
92
3
0
09 Feb 2024
Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains
Feiyang Wu
Xavier Nal
Ye Zhao
Anqi Wu
Zhaoyuan Gu
84
0
0
09 Feb 2024
Scaling Artificial Intelligence for Digital Wargaming in Support of Decision-Making
Scotty Black
Christian J. Darken
32
2
0
08 Feb 2024
Limitations of Agents Simulated by Predictive Models
Raymond Douglas
Jacek Karwowski
Chan Bae
Andis Draguns
Victoria Krakovna
46
0
0
08 Feb 2024
DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning
Weikang Wan
Ziyu Wang
Yufei Wang
Zackory M. Erickson
David Held
143
4
0
08 Feb 2024
Three Pathways to Neurosymbolic Reinforcement Learning with Interpretable Model and Policy Networks
Peter Graf
Patrick Emami
61
2
0
07 Feb 2024
Do Transformer World Models Give Better Policy Gradients?
Michel Ma
Tianwei Ni
Clement Gehring
P. D'Oro
Pierre-Luc Bacon
86
4
0
07 Feb 2024
Analyzing Adversarial Inputs in Deep Reinforcement Learning
Davide Corsi
Guy Amir
Guy Katz
Alessandro Farinelli
AAML
65
7
0
07 Feb 2024
QGFN: Controllable Greediness with Action Values
Elaine Lau
Stephen Zhewen Lu
Ling Pan
Doina Precup
Emmanuel Bengio
178
14
0
07 Feb 2024
Learning Diverse Policies with Soft Self-Generated Guidance
Guojian Wang
Faguo Wu
Xiao Zhang
Jianxiang Liu
OffRL
65
4
0
07 Feb 2024
Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning
Ruoqing Zhang
Ziwei Luo
Jens Sjölund
Thomas B. Schön
Per Mattsson
113
13
0
06 Feb 2024
Reinforcement Learning from Bagged Reward
Yuting Tang
Xin-Qiang Cai
Yao-Xiang Ding
Qiyu Wu
Guoqing Liu
Masashi Sugiyama
OffRL
88
0
0
06 Feb 2024
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Yufei Wang
Zhanyi Sun
Jesse Zhang
Zhou Xian
Erdem Biyik
David Held
Zackory M. Erickson
VLM
124
59
0
06 Feb 2024
Transductive Reward Inference on Graph
B. Qu
Xiaofeng Cao
Qing Guo
Yi Chang
Ivor W. Tsang
Chengqi Zhang
OffRL
114
0
0
06 Feb 2024
A Multi-step Loss Function for Robust Learning of the Dynamics in Model-based Reinforcement Learning
Abdelhakim Benechehab
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
Balázs Kégl
NoLa
61
1
0
05 Feb 2024
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Qingyuan Wu
S. Zhan
Yixuan Wang
Yuhui Wang
Chung-Wei Lin
Chen Lv
Qi Zhu
Jürgen Schmidhuber
Chao Huang
OffRL
93
2
0
05 Feb 2024