1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,130 papers shown
Title
The State of Robot Motion Generation
Kostas E. Bekris
Joe H. Doerr
Patrick Meng
Sumanth Tangirala
3DV
96
3
0
16 Oct 2024
When to Trust Your Data: Enhancing Dyna-Style Model-Based Reinforcement Learning With Data Filter
Yansong Li
Zeyu Dong
Ertai Luo
Yu Wu
Shuo Wu
Shuo Han
48
2
0
16 Oct 2024
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Ayush Jain
Norio Kosaka
Xinhu Li
Kyung-Min Kim
Erdem Bıyık
Joseph J. Lim
OffRL
49
0
0
15 Oct 2024
Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach
Emmanuel Gnabeyeu
Omar Karkar
Imad Idboufous
51
0
0
15 Oct 2024
Robust Manipulation Primitive Learning via Domain Contraction
Teng Xue
Amirreza Razmjoo
Suhan Shetty
Sylvain Calinon
95
3
0
15 Oct 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Zhi Wang
Li Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
106
9
0
15 Oct 2024
Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning
Jiaheng Hu
Zizhao Wang
Peter Stone
Roberto Martín-Martín
86
2
0
15 Oct 2024
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Jiayu Chen
Wentse Chen
Jeff Schneider
OffRL
109
4
0
15 Oct 2024
Traversability-Aware Legged Navigation by Learning from Real-World Visual Data
Hongbo Zhang
Zhongyu Li
Xuanqi Zeng
Laura Smith
Kyle Stachowicz
...
Zhitao Song
Weipeng Xia
Sergey Levine
Koushil Sreenath
Yun-Hui Liu
84
3
0
14 Oct 2024
Continual Deep Reinforcement Learning to Prevent Catastrophic Forgetting in Jamming Mitigation
Kemal Davaslioglu
Sastry Kompella
T. Erpek
Y. Sagduyu
50
1
0
14 Oct 2024
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
Hung Le
Kien Do
D. Nguyen
Sunil Gupta
Svetha Venkatesh
76
0
0
14 Oct 2024
Large Language Model Evaluation via Matrix Nuclear-Norm
Yongbin Li
Tingyu Xia
Yi-Ju Chang
Yuan Wu
63
0
0
14 Oct 2024
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
Hojoon Lee
Dongyoon Hwang
Donghu Kim
Hyunseung Kim
Jun Jet Tai
K. Subramanian
Peter R. Wurman
Jaegul Choo
Peter Stone
Takuma Seno
OffRL
194
17
0
13 Oct 2024
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
Ge Li
Dong Tian
Hongyi Zhou
Xinkai Jiang
Rudolf Lioutikov
Gerhard Neumann
OffRL
540
4
0
12 Oct 2024
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models
Jacob Levy
T. Westenbroek
David Fridovich-Keil
102
8
0
11 Oct 2024
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment
C. Voelcker
Marcel Hussing
Eric Eaton
OffRL
93
3
0
11 Oct 2024
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
C. Voelcker
Marcel Hussing
Eric Eaton
Amir-massoud Farahmand
Igor Gilitschenski
139
5
0
11 Oct 2024
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Quanquan Gu
Lin Yan
OffRL
LRM
158
13
0
11 Oct 2024
Zero-Shot Offline Imitation Learning via Optimal Transport
Thomas Rupf
Marco Bagatella
Nico Gürtler
Jonas Frey
Georg Martius
OffRL
453
0
0
11 Oct 2024
FRASA: An End-to-End Reinforcement Learning Agent for Fall Recovery and Stand Up of Humanoid Robots
Clément Gaspard
Marc Duclusaud
G. Passault
Mélodie Daniel
Olivier Ly
95
4
0
11 Oct 2024
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
Wenlong Wang
Ivana Dusparic
Yucheng Shi
Ke Zhang
Vinny Cahill
Mamba
471
1
0
11 Oct 2024
Efficient Reinforcement Learning with Large Language Model Priors
Xue Yan
Yan Song
Xidong Feng
Mengyue Yang
Haifeng Zhang
Haitham Bou Ammar
Jun Wang
OffRL
62
7
0
10 Oct 2024
The Power of Input: Benchmarking Zero-Shot Sim-To-Real Transfer of Reinforcement Learning Control Policies for Quadrotor Control
Alberto Dionigi
Gabriele Costante
Giuseppe Loianno
121
2
0
10 Oct 2024
Stop-N-Go: Search-based Conflict Resolution for Motion Planning of Multiple Robotic Manipulators
Gidon Han
Jeongwoo Park
Changjoo Nam
38
2
0
10 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
255
0
0
10 Oct 2024
Neuroplastic Expansion in Deep Reinforcement Learning
Jiashun Liu
J. Obando-Ceron
Rameswar Panda
L. Pan
129
6
0
10 Oct 2024
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation
Sumeet Batra
Gaurav Sukhatme
OffRL
DRL
85
2
0
09 Oct 2024
Fostering Intrinsic Motivation in Reinforcement Learning with Pretrained Foundation Models
Alain Andres
Javier Del Ser
OffRL
60
0
0
09 Oct 2024
Safe Reinforcement Learning Filter for Multicopter Collision-Free Tracking under disturbances
Qihan Qi
Xinsong Yang
Gang Xia
68
1
0
09 Oct 2024
Effective Exploration Based on the Structural Information Principles
Xianghua Zeng
Hao Peng
Angsheng Li
71
2
0
09 Oct 2024
Solving Multi-Goal Robotic Tasks with Decision Transformer
Paul Gajewski
Dominik Zurek
Marcin Pietroń
Kamil Faber
OffRL
66
1
0
08 Oct 2024
Learning in complex action spaces without policy gradients
Arash Tavakoli
Sina Ghiassian
Nemanja Rakićević
OffRL
74
0
0
08 Oct 2024
Diffusion Imitation from Observation
Bo-Ruei Huang
Chun-Kai Yang
Chun-Mao Lai
Dai-Jie Wu
Shao-Hua Sun
86
4
0
07 Oct 2024
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Jasmine Bayrooti
Carl Henrik Ek
Amanda Prorok
201
0
0
07 Oct 2024
ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control
Ehsan Futuhi
Shayan Karimi
Chao Gao
Martin Müller
111
1
0
07 Oct 2024
Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress
Christopher Agia
Rohan Sinha
Jingyun Yang
Zi-ang Cao
Rika Antonova
Marco Pavone
Jeannette Bohg
94
9
0
06 Oct 2024
Bisimulation metric for Model Predictive Control
Yutaka Shimizu
Masayoshi Tomizuka
108
0
0
06 Oct 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao
Wenhao Zhan
Jonathan D. Chang
Gokul Swamy
Kianté Brantley
Jason D. Lee
Wen Sun
OffRL
151
7
0
06 Oct 2024
Model-Based Reward Shaping for Adversarial Inverse Reinforcement Learning in Stochastic Environments
S. Zhan
Qingyuan Wu
Philip Wang
Yixuan Wang
Ruochen Jiao
Chao Huang
Qi Zhu
112
1
0
04 Oct 2024
GAP-RL: Grasps As Points for RL Towards Dynamic Object Grasping
Pengwei Xie
Siang Chen
Qianrun Chen
Wei Tang
Dingchang Hu
Yixiang Dai
Rui Chen
Guijin Wang
65
1
0
04 Oct 2024
Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization
Tung M. Luu
Thanh Nguyen
Tee Joshua Tian Jin
Sungwoon Kim
Chang D. Yoo
AAML
83
0
0
04 Oct 2024
Multilingual Topic Classification in X: Dataset and Analysis
Dimosthenis Antypas
Asahi Ushio
Francesco Barbieri
Jose Camacho-Collados
64
5
0
04 Oct 2024
Hybrid Classical/RL Local Planner for Ground Robot Navigation
Vishnu D. Sharma
Jeongran Lee
M. Andrews
I. Hadžić
49
0
0
04 Oct 2024
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
Junpeng Yue
Xinru Xu
Börje F. Karlsson
Zongqing Lu
118
1
0
04 Oct 2024
Social coordination perpetuates stereotypic expectations and behaviors across generations in deep multi-agent reinforcement learning
Rebekah A. Gelpí
Yikai Tang
Ethan C. Jackson
William A. Cunningham
56
0
0
02 Oct 2024
Sampling from Energy-based Policies using Diffusion
V. Jain
Tara Akhound-Sadegh
Siamak Ravanbakhsh
DiffM
158
2
0
02 Oct 2024
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models
Can Demircan
Tankred Saanum
Akshay K. Jagadish
Marcel Binz
Eric Schulz
64
4
0
02 Oct 2024
Dual Approximation Policy Optimization
Zhihan Xiong
Maryam Fazel
Lin Xiao
75
1
0
02 Oct 2024
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Ghada Sokar
J. Obando-Ceron
Rameswar Panda
Hugo Larochelle
Pablo Samuel Castro
MoE
338
7
0
02 Oct 2024
Stabilizing the Kumaraswamy Distribution
Max Wasserman
Gonzalo Mateos
BDL
121
0
0
01 Oct 2024