1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,130 papers shown
Title
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
Shenao Zhang
Boyi Liu
Zhaoran Wang
Tuo Zhao
68
2
0
30 Oct 2023
Refining Diffusion Planner for Reliable Behavior Synthesis by Automatic Detection of Infeasible Plans
Kyowoon Lee
Seongun Kim
Jaesik Choi
DiffM
86
11
0
30 Oct 2023
Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills
Seongun Kim
Kyowoon Lee
Jaesik Choi
SSL
DRL
89
10
0
30 Oct 2023
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning
Zhaoyi Zhou
Chuning Zhu
Runlong Zhou
Qiwen Cui
Abhishek Gupta
S. S. Du
OffRL
82
9
0
30 Oct 2023
Diversify & Conquer: Outcome-directed Curriculum RL via Out-of-Distribution Disagreement
Daesol Cho
Seungjae Lee
H. J. Kim
OODD
102
2
0
30 Oct 2023
Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback
Jingliang Duan
Jie Li
Xuyang Chen
Kai Zhao
Shengbo Eben Li
Lin Zhao
53
5
0
29 Oct 2023
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Nikki Lijing Kuang
Ming Yin
Mengdi Wang
Yu Wang
Yian Ma
102
6
0
29 Oct 2023
Robot Control based on Motor Primitives -- A Comparison of Two Approaches
Moses C. Nah
Johannes Lachner
Neville Hogan
33
3
0
28 Oct 2023
Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Daniel Jarrett
Alihan Huyuk
M. Schaar
AI4CE
92
28
0
28 Oct 2023
State-Action Similarity-Based Representations for Off-Policy Evaluation
Brahma S. Pavse
Josiah P. Hanna
OffRL
78
4
0
27 Oct 2023
Improving Intrinsic Exploration by Creating Stationary Objectives
Roger Creus Castanyer
Javier Civera
Taihú Pire
OffRL
120
4
0
27 Oct 2023
Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning
Shenzhi Wang
Qisen Yang
Jiawei Gao
Matthieu Lin
Hao Chen
Liwei Wu
Ning Jia
Shiji Song
Gao Huang
OffRL
105
15
0
27 Oct 2023
Learning Extrinsic Dexterity with Parameterized Manipulation Primitives
Shih-Min Yang
Martin Magnusson
J. A. Stork
Todor Stoyanov
103
6
0
26 Oct 2023
CQM: Curriculum Reinforcement Learning with a Quantized World Model
Seungjae Lee
Daesol Cho
Jonghae Park
H. J. Kim
87
9
0
26 Oct 2023
MaxEnt Loss: Constrained Maximum Entropy for Calibration under Out-of-Distribution Shift
Dexter Neo
Stefan Winkler
Tsuhan Chen
OODD
69
3
0
26 Oct 2023
Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning
Hongyu Zang
Xin-hui Li
Leiji Zhang
Yang Liu
Baigui Sun
Riashat Islam
Rémi Tachet des Combes
Romain Laroche
OffRL
109
5
0
26 Oct 2023
DSAC-C: Constrained Maximum Entropy for Robust Discrete Soft-Actor Critic
Dexter Neo
Tsuhan Chen
56
1
0
26 Oct 2023
Dynamics Generalisation in Reinforcement Learning via Adaptive Context-Aware Policies
Michael Beukman
Devon Jarvis
Richard Klein
Steven D. James
Benjamin Rosman
112
13
0
25 Oct 2023
Towards Control-Centric Representations in Reinforcement Learning from Images
Chen Liu
Hongyu Zang
Xin Li
Yong Heng
Yifei Wang
Zhen Fang
Yisen Wang
Mingzhong Wang
59
0
0
25 Oct 2023
State Sequences Prediction via Fourier Transform for Representation Learning
Mingxuan Ye
Yufei Kuang
Jie Wang
Rui Yang
Wen-gang Zhou
Houqiang Li
Feng Wu
AI4TS
99
9
0
24 Oct 2023
Good Better Best: Self-Motivated Imitation Learning for noisy Demonstrations
Ye Yuan
Xin Li
Yong Heng
Leiji Zhang
Mingzhong Wang
DiffM
103
2
0
24 Oct 2023
Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models
Iman Nematollahi
Kirill Yankov
Wolfram Burgard
Tim Welschehold
78
0
0
23 Oct 2023
Mind the Model, Not the Agent: The Primacy Bias in Model-based RL
Zhongjian Qiao
Jiafei Lyu
Xiu Li
70
3
0
23 Oct 2023
Learning to bag with a simulation-free reinforcement learning framework for robots
Francisco Munguia-Galeano
Jihong Zhu
Juan David Hernández
Ze Ji
63
0
0
22 Oct 2023
Application of deep and reinforcement learning to boundary control problems
Zenin Easa Panthakkalakath
J. Kardoš
Olaf Schenk
AI4CE
36
0
0
21 Oct 2023
Cold Diffusion on the Replay Buffer: Learning to Plan from Known Good States
Zidan Wang
Takeru Oba
Takuma Yoneda
Rui Shen
Matthew R. Walter
Bradly C. Stadie
DiffM
115
10
0
21 Oct 2023
Contrastive Preference Learning: Learning from Human Feedback without RL
Joey Hejna
Rafael Rafailov
Harshit S. Sikchi
Chelsea Finn
S. Niekum
W. B. Knox
Dorsa Sadigh
OffRL
127
55
0
20 Oct 2023
Reward Shaping for Happier Autonomous Cyber Security Agents
Elizabeth Bates
V. Mavroudis
Chris Hicks
77
15
0
20 Oct 2023
RL-X: A Deep Reinforcement Learning Library (not only) for RoboCup
Nico Bohlinger
Klaus Dorer
81
4
0
20 Oct 2023
PathRL: An End-to-End Path Generation Method for Collision Avoidance via Deep Reinforcement Learning
Wenhao Yu
Jie Peng
Quecheng Qiu
Hanyu Wang
Lu Zhang
Jianmin Ji
75
9
0
20 Oct 2023
Absolute Policy Optimization
Weiye Zhao
Feihan Li
Yifan Sun
Rui Chen
Tianhao Wei
Changliu Liu
148
4
0
20 Oct 2023
Generative Flow Networks as Entropy-Regularized RL
D. Tiapkin
Nikita Morozov
Alexey Naumov
Dmitry Vetrov
114
35
0
19 Oct 2023
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Juan Rocamonde
Victoriano Montesinos
Elvis Nava
Ethan Perez
David Lindner
VLM
108
92
0
19 Oct 2023
Safe RLHF: Safe Reinforcement Learning from Human Feedback
Josef Dai
Xuehai Pan
Ruiyang Sun
Jiaming Ji
Xinbo Xu
Mickel Liu
Yizhou Wang
Yaodong Yang
147
364
0
19 Oct 2023
Using Experience Classification for Training Non-Markovian Tasks
Ruixuan Miao
Xu Lu
Cong Tian
Bin Yu
Zhenhua Duan
OffRL
56
0
0
18 Oct 2023
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression
Adam Block
Dylan J. Foster
Akshay Krishnamurthy
Max Simchowitz
Cyril Zhang
90
7
0
17 Oct 2023
Keep Various Trajectories: Promoting Exploration of Ensemble Policies in Continuous Control
Chao Li
Chen Gong
Qiang He
Xinwen Hou
76
1
0
17 Oct 2023
Sim-to-Real Transfer of Adaptive Control Parameters for AUV Stabilization under Current Disturbance
Thomas Chaffre
J. Wheare
A. Lammas
Paulo E. Santos
G. Chenadec
Karl Sammut
Benoit Clement
62
2
0
17 Oct 2023
Enhanced Transformer Architecture for Natural Language Processing
Woohyeon Moon
Taeyoung Kim
Bumgeun Park
Dongsoo Har
78
0
0
17 Oct 2023
Enhancing Task Performance of Learned Simplified Models via Reinforcement Learning
Hien Bui
Michael Posa
78
1
0
15 Oct 2023
Reduced Policy Optimization for Continuous Control with Hard Constraints
Shutong Ding
Jingya Wang
Yali Du
Ye-ling Shi
65
6
0
14 Oct 2023
SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network
Lei Yao
Yong Zhang
Zilong Yan
Jialu Tian
67
3
0
13 Oct 2023
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction
Seohong Park
Oleh Rybkin
Sergey Levine
OffRL
89
45
0
13 Oct 2023
DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
Fengbo Lan
Shengjie Wang
Yunzhe Zhang
Haotian Xu
Oluwatosin Oseni
Yang Gao
Tao Zhang
87
5
0
13 Oct 2023
Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach
Heasung Kim
S. Ankireddy
OffRL
45
0
0
12 Oct 2023
Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias
Max Sobol Mark
Archit Sharma
Fahim Tajwar
Rafael Rafailov
Sergey Levine
Chelsea Finn
OffRL
OnRL
111
2
0
12 Oct 2023
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
Yazhe Niu
Yuan Pu
Zhenjie Yang
Xueyan Li
Tong Zhou
Jiyuan Ren
Shuai Hu
Hongsheng Li
Yu Liu
141
15
0
12 Oct 2023
Generative Intrinsic Optimization: Intrinsic Control with Model Learning
Jianfei Ma
72
0
0
12 Oct 2023
What Matters to You? Towards Visual Representation Alignment for Robot Learning
Ran Tian
Chenfeng Xu
Masayoshi Tomizuka
Jitendra Malik
Andrea V. Bajcsy
80
10
0
11 Oct 2023
Deep Reinforcement Learning for Autonomous Cyber Operations: A Survey
Gregory Palmer
Chris Parry
Daniel J.B. Harrold
Chris Willis
AI4CE
90
1
0
11 Oct 2023