Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1801.01290
Cited By
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 1,645 papers shown
Title
ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks
Arth Shukla
Stone Tao
Hao Su
98
6
0
09 Dec 2024
Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations
Cevahir Köprülü
Po-han Li
Tianyu Qiu
Ruihan Zhao
T. Westenbroek
David Fridovich-Keil
Sandeep Chinchali
Ufuk Topcu
OffRL
97
0
0
02 Dec 2024
Object-centric proto-symbolic behavioural reasoning from pixels
R. S. V. Bergen
Justus F. Hübotter
Pablo Lanillos
LM&Ro
OCL
98
0
0
26 Nov 2024
Broad Critic Deep Actor Reinforcement Learning for Continuous Control
Shiron Thalagala
Pak Kin Wong
Xiaozheng Wang
Tianang Sun
OffRL
81
0
0
24 Nov 2024
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
Zhiwei Jia
Yuesong Nan
Huixi Zhao
Gengdai Liu
EGVM
94
0
0
22 Nov 2024
Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation
Huy Le
Miroslav Gabriel
Tai Hoang
Gerhard Neumann
Ngo Anh Vien
116
1
0
22 Nov 2024
OCMDP: Observation-Constrained Markov Decision Process
Taiyi Wang
Jianheng Liu
Bryan Lee
Zhihao Wu
Yu Wu
36
1
0
11 Nov 2024
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
A. Jain
Harley Wiltzer
Jesse Farebrother
Irina Rish
Glen Berseth
Sanjiban Choudhury
60
1
0
11 Nov 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
Yunhao Luo
Yilun Du
LM&Ro
VGen
90
1
0
11 Nov 2024
State Chrono Representation for Enhancing Generalization in Reinforcement Learning
Jianda Chen
Wen Zheng Terence Ng
Zichen Chen
Sinno Jialin Pan
Tianwei Zhang
OffRL
42
0
0
09 Nov 2024
Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey
Zhihong Liu
Xin Xu
Peng Qiao
Dongsheng Li
OffRL
34
2
0
08 Nov 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
65
3
0
07 Nov 2024
Environment as Policy: Learning to Race in Unseen Tracks
Hongze Wang
Jiaxu Xing
Nico Messikommer
Davide Scaramuzza
34
1
0
29 Oct 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
OffRL
44
0
0
27 Oct 2024
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
67
10
0
26 Oct 2024
Prioritized Generative Replay
Renhao Wang
Kevin Frans
Pieter Abbeel
Sergey Levine
Alexei A. Efros
OnRL
DiffM
119
2
0
23 Oct 2024
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
64
0
0
23 Oct 2024
Uncovering RL Integration in SSL Loss: Objective-Specific Implications for Data-Efficient RL
Ömer Veysel Çağatan
Barış Akgün
OffRL
44
0
0
22 Oct 2024
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
Mingzhi Wang
Chengdong Ma
Qizhi Chen
Linjian Meng
Yang Han
Jiancong Xiao
Zhaowei Zhang
Jing Huo
Weijie Su
Yaodong Yang
37
5
0
22 Oct 2024
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Timofei Gritsaev
Nikita Morozov
S. Samsonov
D. Tiapkin
26
0
0
20 Oct 2024
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL
OnRL
67
0
0
19 Oct 2024
Reward-free World Models for Online Imitation Learning
Shangzhe Li
Zhiao Huang
H. Su
OffRL
67
1
0
17 Oct 2024
Diffusing States and Matching Scores: A New Framework for Imitation Learning
Runzhe Wu
Yiding Chen
Gokul Swamy
Kianté Brantley
Wen Sun
DiffM
53
3
0
17 Oct 2024
Reinforcement Learning with Euclidean Data Augmentation for State-Based Continuous Control
Jinzhu Luo
Dingyang Chen
Qi Zhang
OffRL
28
0
0
16 Oct 2024
The State of Robot Motion Generation
Kostas E. Bekris
Joe H. Doerr
Patrick Meng
Sumanth Tangirala
3DV
41
2
0
16 Oct 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Zhi Wang
Li Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
53
6
0
15 Oct 2024
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Jiayu Chen
Wentse Chen
Jeff Schneider
OffRL
46
2
0
15 Oct 2024
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
Ge Li
Dong Tian
Hongyi Zhou
Xinkai Jiang
Rudolf Lioutikov
Gerhard Neumann
OffRL
285
3
0
12 Oct 2024
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
C. Voelcker
Marcel Hussing
Eric Eaton
Amir-massoud Farahmand
Igor Gilitschenski
46
2
0
11 Oct 2024
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Q. Gu
Lin Yan
Quanquan Gu
Lin Yan
OffRL
LRM
73
10
0
11 Oct 2024
FRASA: An End-to-End Reinforcement Learning Agent for Fall Recovery and Stand Up of Humanoid Robots
Clément Gaspard
Marc Duclusaud
G. Passault
Mélodie Daniel
Olivier Ly
38
3
0
11 Oct 2024
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
Wenlong Wang
Ivana Dusparic
Yucheng Shi
Ke Zhang
Vinny Cahill
Mamba
248
0
0
11 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
90
0
0
10 Oct 2024
Neuroplastic Expansion in Deep Reinforcement Learning
Jiashun Liu
J. Obando-Ceron
Rameswar Panda
L. Pan
47
3
0
10 Oct 2024
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation
Sumeet Batra
Gaurav Sukhatme
OffRL
DRL
41
2
0
09 Oct 2024
ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control
Ehsan Futuhi
Shayan Karimi
Chao Gao
Martin Müller
48
1
0
07 Oct 2024
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Jasmine Bayrooti
Carl Henrik Ek
Amanda Prorok
50
0
0
07 Oct 2024
Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress
Christopher Agia
Rohan Sinha
Jingyun Yang
Zi-ang Cao
Rika Antonova
Marco Pavone
Jeannette Bohg
49
7
0
06 Oct 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao
Wenhao Zhan
Jonathan D. Chang
Gokul Swamy
Kianté Brantley
Jason D. Lee
Wen Sun
OffRL
81
3
0
06 Oct 2024
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
Junpeng Yue
Xinru Xu
Börje F. Karlsson
Zongqing Lu
46
0
0
04 Oct 2024
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Ghada Sokar
J. Obando-Ceron
Rameswar Panda
Hugo Larochelle
Pablo Samuel Castro
MoE
174
2
0
02 Oct 2024
Stabilizing the Kumaraswamy Distribution
Max Wasserman
Gonzalo Mateos
BDL
52
0
0
01 Oct 2024
Enabling Multi-Robot Collaboration from Single-Human Guidance
Zhengran Ji
Lingyu Zhang
Paul Sajda
Boyuan Chen
42
1
0
30 Sep 2024
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner
Chenyou Fan
Chenjia Bai
Zhao Shan
Haoran He
Yang Zhang
Zhen Wang
40
3
0
30 Sep 2024
LiRA: Light-Robust Adversary for Model-based Reinforcement Learning in Real World
Taisuke Kobayashi
71
2
0
29 Sep 2024
CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models
Kanghyun Ryu
Qiayuan Liao
Zhongyu Li
Koushil Sreenath
Negar Mehr
Negar Mehr
LM&Ro
210
3
0
27 Sep 2024
Behavior evolution-inspired approach to walking gait reinforcement training for quadruped robots
Yu Wang
Wenchuan Jia
Yi Sun
Dong He
37
0
0
25 Sep 2024
The Roles of Generative Artificial Intelligence in Internet of Electric Vehicles
Hanwen Zhang
Dusit Niyato
Wei Zhang
Changyuan Zhao
Hongyang Du
Abbas Jamalipour
Sumei Sun
Yiyang Pei
AI4CE
53
2
0
24 Sep 2024
Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
Min Tan
Yushun Tao
Boyun Zheng
GaoSheng Xie
Lijuan Feng
Zeyang Xia
Jing Xiong
32
0
0
24 Sep 2024
Context-Based Meta Reinforcement Learning for Robust and Adaptable Peg-in-Hole Assembly Tasks
Ahmed Shokry
Walid Gomaa
Tobias Zaenker
Murad Dawood
Shady A. Maged
Mohammed I. Awad
Maren Bennewitz
Maren Bennewitz
OffRL
46
0
0
24 Sep 2024
Previous
1
2
3
4
5
...
31
32
33
Next