1801.01290
Cited By
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,130 papers shown
Title
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
OffRL
91
0
0
03 Jun 2024
Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets
Haoran He
C. Chang
Huazhe Xu
Ling Pan
217
7
0
03 Jun 2024
Shared-unique Features and Task-aware Prioritized Sampling on Multi-task Reinforcement Learning
Po-Shao Lin
Jia-Fong Yeh
Yi-Ting Chen
Winston H. Hsu
87
0
0
02 Jun 2024
Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient
Zechu Li
Rickmer Krohn
Tao Chen
Anurag Ajay
Pulkit Agrawal
Georgia Chalvatzaki
DiffM
139
18
0
02 Jun 2024
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Yuwei Fu
Haichao Zhang
Di Wu
Wei Xu
Benoit Boulet
VLM
125
15
0
02 Jun 2024
Improving GFlowNets for Text-to-Image Diffusion Alignment
Dinghuai Zhang
Yizhe Zhang
Jiatao Gu
Ruixiang Zhang
J. Susskind
Navdeep Jaitly
Shuangfei Zhai
EGVM
144
10
0
02 Jun 2024
Exploring the limits of Hierarchical World Models in Reinforcement Learning
Robin Schiewer
Anand Subramoney
Laurenz Wiskott
93
1
0
01 Jun 2024
Do's and Don'ts: Learning Desirable Skills with Instruction Videos
Hyunseung Kim
ByungKun Lee
Hojoon Lee
Dongyoon Hwang
Donghu Kim
Jaegul Choo
151
1
0
01 Jun 2024
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment
Yueqin Yin
Zhendong Wang
Yujia Xie
Weizhu Chen
Mingyuan Zhou
101
4
0
31 May 2024
HOPE: A Reinforcement Learning-based Hybrid Policy Path Planner for Diverse Parking Scenarios
Mingyang Jiang
Yueyuan Li
Songan Zhang
Siyuan Chen
Chunxiang Wang
Ming Yang
157
5
0
31 May 2024
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning
Davide Corsi
Davide Camponogara
Alessandro Farinelli
OffRL
77
2
0
30 May 2024
Video-Language Critic: Transferable Reward Functions for Language-Conditioned Robotics
Minttu Alakuijala
Reginald McLean
Isaac Woungang
Nariman Farsad
Samuel Kaski
Pekka Marttinen
Kai Yuan
LM&Ro
79
1
0
30 May 2024
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu
Yang Li
Yixing Lan
Hao Gao
Wei Pan
Xin Xu
OffRL
118
8
0
30 May 2024
Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models
Zeyu Fang
Tian Lan
OffRL
120
2
0
30 May 2024
May the Dance be with You: Dance Generation Framework for Non-Humanoids
Hyemin Ahn
DiffM
VGen
101
1
0
30 May 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
144
2
0
30 May 2024
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
114
3
0
29 May 2024
Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Bernd Frauenknecht
Artur Eisele
Devdutt Subhasish
Friedrich Solowjow
Sebastian Trimpe
116
5
0
29 May 2024
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
Dohyeong Kim
Taehyun Cho
Seung Han
Hojun Chung
Kyungjae Lee
Songhwai Oh
82
1
0
29 May 2024
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai
Rui Zhao
Hongming Zhang
Sijia Cui
Ying Wen
Yaodong Yang
Bo Xu
Lei Han
OffRL
95
8
0
29 May 2024
RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning
Mingqi Yuan
Roger Creus Castanyer
Bo Li
Xin Jin
Glen Berseth
Wenjun Zeng
180
0
0
29 May 2024
Model-Based Diffusion for Trajectory Optimization
Chaoyi Pan
Zeji Yi
Guanya Shi
Guannan Qu
101
13
0
28 May 2024
DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime
Zhiyao Luo
Mingcheng Zhu
Fenglin Liu
Jiali Li
Yangchen Pan
Jiandong Zhou
Tingting Zhu
OffRL
75
3
0
28 May 2024
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
OnRL
119
3
0
28 May 2024
AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization
Longxiang He
Li Shen
Junbo Tan
Xueqian Wang
113
4
0
28 May 2024
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Shengchao Hu
Ziqing Fan
Li Shen
Ya Zhang
Yanfeng Wang
Dacheng Tao
OffRL
97
11
0
28 May 2024
Mollification Effects of Policy Gradient Methods
Tao Wang
Sylvia Herbert
Sicun Gao
104
1
0
28 May 2024
A Pontryagin Perspective on Reinforcement Learning
Onno Eberhard
Claire Vernade
Michael Muehlebach
140
3
0
28 May 2024
No $D_{\text{train}}$: Model-Agnostic Counterfactual Explanations Using Reinforcement Learning
Xiangyu Sun
Raquel Aoki
Kevin H. Wilson
81
1
0
28 May 2024
A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning
Abdulaziz Almuzairee
Nicklas Hansen
Henrik I. Christensen
81
7
0
27 May 2024
Rethinking Transformers in Solving POMDPs
Chenhao Lu
Ruizhe Shi
Yuyao Liu
Kaizhe Hu
Simon S. Du
Huazhe Xu
AI4CE
136
3
0
27 May 2024
Position: Foundation Agents as the Paradigm Shift for Decision Making
Xiaoqian Liu
Xingzhou Lou
Jianbin Jiao
Junge Zhang
OffRL
LLMAG
105
7
0
27 May 2024
Knowing What Not to Do: Leverage Language Model Insights for Action Space Pruning in Multi-agent Reinforcement Learning
Zhihao Liu
Xianliang Yang
Zichuan Liu
Yifan Xia
Wei Jiang
Yuanyu Zhang
Lijuan Li
Guoliang Fan
Lei Song
Bian Jiang
LLMAG
84
3
0
27 May 2024
Oracle-Efficient Reinforcement Learning for Max Value Ensembles
Marcel Hussing
Michael Kearns
Aaron Roth
S. B. Sengupta
Jessica Sorrell
73
0
0
27 May 2024
Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales
Ju-Seung Byun
Andrew Perrault
57
1
0
27 May 2024
Amortized Active Causal Induction with Deep Reinforcement Learning
Yashas Annadani
P. Tigas
Stefan Bauer
Adam Foster
88
1
0
26 May 2024
Provably Efficient Off-Policy Adversarial Imitation Learning with Convergence Guarantees
Yilei Chen
Vittorio Giammarino
James Queeney
I. Paschalidis
70
0
0
26 May 2024
Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement Learning
Aneesh Muppidi
Zhiyu Zhang
Heng Yang
78
6
0
26 May 2024
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Jiancong Xiao
Ziniu Li
Xingyu Xie
E. Getzen
Cong Fang
Qi Long
Weijie J. Su
108
23
0
26 May 2024
RoboArm-NMP: a Learning Environment for Neural Motion Planning
Tom Jurgenson
Matan Sudry
Gal Avineri
Aviv Tamar
72
0
0
25 May 2024
Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions
Harry Zhang
79
0
0
25 May 2024
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization
Shutong Ding
Ke Hu
Zhenhao Zhang
Kan Ren
Weinan Zhang
Jingyi Yu
Jingya Wang
Ye-ling Shi
115
21
0
25 May 2024
Pausing Policy Learning in Non-stationary Reinforcement Learning
Hyunin Lee
Ming Jin
Javad Lavaei
Somayeh Sojoudi
OffRL
117
2
0
25 May 2024
Constrained Ensemble Exploration for Unsupervised Skill Discovery
Chenjia Bai
Rushuai Yang
Qiaosheng Zhang
Kang Xu
Yi Chen
Ting Xiao
Xuelong Li
OffRL
139
4
0
25 May 2024
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate
Fan Luo
Zuolin Tu
Zefang Huang
Yang Yu
OffRL
86
1
0
24 May 2024
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Sheng Yue
Jiani Liu
Xingyuan Hua
Ju Ren
Sen Lin
Junshan Zhang
Yaoxue Zhang
OffRL
83
4
0
24 May 2024
Momentum-Based Federated Reinforcement Learning with Interaction and Communication Efficiency
Sheng Yue
Xingyuan Hua
Lili Chen
Ju Ren
49
1
0
24 May 2024
Diffusion Actor-Critic with Entropy Regulator
Yinuo Wang
Likun Wang
Yuxuan Jiang
Wenjun Zou
Tong Liu
...
Wenxuan Wang
Liming Xiao
Jiang Wu
Jingliang Duan
Shengbo Eben Li
DiffM
142
17
0
24 May 2024
Model-free reinforcement learning with noisy actions for automated experimental control in optics
Lea Richtmann
Viktoria-S. Schmiesing
Dennis Wilken
Jan Heine
Aaron Tranter
Avishek Anand
Tobias J. Osborne
M. Heurs
104
2
0
24 May 2024
MuDreamer: Learning Predictive World Models without Reconstruction
Maxime Burchi
Radu Timofte
75
4
0
23 May 2024