ResearchTrend.AI
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
arXiv: 1801.01290

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 1,748 papers shown
Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models
Iman Nematollahi
Erick Rosete-Beas
Adrian Rofer
Tim Welschehold
Abhinav Valada
Wolfram Burgard
26
15
0
25 Nov 2021
Real-world challenges for multi-agent reinforcement learning in grid-interactive buildings
Kingsley Nweye
Bo Liu
Peter Stone
Zoltán Nagy
OffRL
AI4CE
42
38
0
25 Nov 2021
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
42
9
0
24 Nov 2021
Learning State Representations via Retracing in Reinforcement Learning
Changmin Yu
Dong Li
Jianye Hao
Jun Wang
Neil Burgess
35
7
0
24 Nov 2021
A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning
Tongzheng Ren
Tianjun Zhang
Csaba Szepesvári
Bo Dai
39
19
0
22 Nov 2021
Real-World Dexterous Object Manipulation based Deep Reinforcement Learning
Qingfeng Yao
Jilong Wang
Shuyu Yang
DRL
20
1
0
22 Nov 2021
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance
Yanqiu Wu
Xinyue Chen
Che Wang
Yiming Zhang
Keith Ross
OffRL
19
9
0
17 Nov 2021
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms
Shengyi Huang
Rousslan Fernand Julien Dossa
Chang Ye
Jeff Braga
OffRL
16
0
0
16 Nov 2021
GRI: General Reinforced Imitation and its Application to Vision-Based Autonomous Driving
Raphael Chekroun
Marin Toromanoff
Sascha Hornauer
Fabien Moutarde
44
60
0
16 Nov 2021
Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics
Ingmar Schubert
Danny Driess
Ozgur S. Oguz
Marc Toussaint
OffRL
24
1
0
15 Nov 2021
Learning Representations for Pixel-based Control: What Matters and Why?
Manan Tomar
Utkarsh Aashu Mishra
Amy Zhang
Matthew E. Taylor
SSL
OffRL
41
25
0
15 Nov 2021
Deep Reinforcement Learning with Shallow Controllers: An Experimental Application to PID Tuning
Nathan P. Lawrence
M. Forbes
Philip D. Loewen
Daniel G. McClement
Johan U. Backstrom
R. Bhushan Gopaluni
OffRL
30
72
0
13 Nov 2021
One model Packs Thousands of Items with Recurrent Conditional Query Learning
Dongda Li
Zhaoquan Gu
Yuexuan Wang
Changwei Ren
F. Lau
32
17
0
12 Nov 2021
Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments
Eivind Bøhn
E. M. Coates
D. Reinhardt
T. Johansen
30
27
0
07 Nov 2021
Imagine Networks
Seokjun Kim
Jaeeun Jang
Hyeoncheol Kim
GAN
AI4CE
18
2
0
04 Nov 2021
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
45
93
0
04 Nov 2021
Towards an Understanding of Default Policies in Multitask Policy Optimization
Theodore H. Moskovitz
Michael Arbel
Jack Parker-Holder
Aldo Pacchiano
30
9
0
04 Nov 2021
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning
Sabela Ramos
Sertan Girgin
Léonard Hussenot
Damien Vincent
Hanna Yakubovich
...
Piotr Stańczyk
Raphaël Marinier
Jeremiah Harmsen
Olivier Pietquin
Nikola Momchev
OffRL
38
24
0
04 Nov 2021
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning
Utkarsh Aashu Mishra
Soumya R. Samineni
Prakhar Goel
Chandravaran Kunjeti
Himanshu Lodha
Aman Singh
Aditya Sagi
S. Bhatnagar
Shishir Kolathaya
32
3
0
04 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
45
10
0
04 Nov 2021
Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies
Tim Seyde
Igor Gilitschenski
Wilko Schwarting
Bartolomeo Stellato
Martin Riedmiller
Markus Wulfmeier
Daniela Rus
43
44
0
03 Nov 2021
Smooth Imitation Learning via Smooth Costs and Smooth Policies
Sapana Chaudhary
Balaraman Ravindran
29
1
0
03 Nov 2021
Curriculum Offline Imitation Learning
Minghuan Liu
Hanye Zhao
Zhengyu Yang
Jian Shen
Weinan Zhang
Li Zhao
Tie-Yan Liu
OffRL
29
1
0
03 Nov 2021
Validate on Sim, Detect on Real -- Model Selection for Domain Randomization
Gal Leibovich
Guy Jacob
Shadi Endrawis
Gal Novik
Aviv Tamar
37
7
0
01 Nov 2021
Context Meta-Reinforcement Learning via Neuromodulation
Eseoghene Ben-Iwhiwhu
Jeffery Dick
Nicholas A. Ketz
Praveen K. Pilly
Andrea Soltoggio
OffRL
75
12
0
30 Oct 2021
Generalized Proximal Policy Optimization with Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
47
47
0
29 Oct 2021
Hindsight Goal Ranking on Replay Buffer for Sparse Reward Environment
Tung M. Luu
Chang D. Yoo
31
8
0
28 Oct 2021
Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching
Pierre-Alexandre Kamienny
Jean Tarbouriech
Sylvain Lamprier
A. Lazaric
Ludovic Denoyer
SSL
54
18
0
27 Oct 2021
Learning Domain Invariant Representations in Goal-conditioned Block MDPs
Beining Han
Chongyi Zheng
Harris Chan
Keiran Paster
Michael Ruogu Zhang
Jimmy Ba
OOD
AI4CE
31
13
0
27 Oct 2021
RoMA: Robust Model Adaptation for Offline Model-based Optimization
Sihyun Yu
SungSoo Ahn
Le Song
Jinwoo Shin
OffRL
48
32
0
27 Oct 2021
Conflict-Averse Gradient Descent for Multi-task Learning
Bo Liu
Xingchao Liu
Xiaojie Jin
Peter Stone
Qiang Liu
52
299
0
26 Oct 2021
Automating Control of Overestimation Bias for Reinforcement Learning
Arsenii Kuznetsov
Alexander Grishin
Artem Tsypin
Arsenii Ashukha
Artur Kadurin
Dmitry Vetrov
OffRL
25
2
0
26 Oct 2021
Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning
Jinxin Liu
Hao Shen
Donglin Wang
Yachen Kang
Qiangxing Tian
37
19
0
25 Oct 2021
Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning
Kibeom Kim
Min Whoo Lee
Yoonsung Kim
Je-hwan Ryu
Minsu Lee
Byoung-Tak Zhang
29
8
0
25 Oct 2021
Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL
Aarush Gupta
30
0
0
23 Oct 2021
Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain
Yachen Kang
Jinxin Liu
Xin Cao
Donglin Wang
26
3
0
21 Oct 2021
Can Q-learning solve Multi Armed Bantids?
R. Vivanti
OffRL
18
0
0
21 Oct 2021
Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information
Jin Li
Xianyuan Zhan
Zixu Xiao
Guyue Zhou
OffRL
OnRL
34
2
0
21 Oct 2021
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Wenzhuo Zhou
Ruoqing Zhu
Annie Qu
45
22
0
20 Oct 2021
Feedback Linearization of Car Dynamics for Racing via Reinforcement Learning
Michael Estrada
Sida Li
Xiangyu Cai
26
3
0
20 Oct 2021
Continuous Control with Action Quantization from Demonstrations
Robert Dadashi
Léonard Hussenot
Damien Vincent
Sertan Girgin
Anton Raichuk
Matthieu Geist
Olivier Pietquin
OffRL
33
23
0
19 Oct 2021
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML
OffRL
27
3
0
19 Oct 2021
Learning Robotic Manipulation Skills Using an Adaptive Force-Impedance Action Space
Maximilian Ulmer
Elie Aljalbout
Sascha Schwarz
Sami Haddadin
23
6
0
19 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
46
2
0
17 Oct 2021
Safe Autonomous Racing via Approximate Reachability on Ego-vision
Bingqing Chen
Jonathan M Francis
Jean Oh
Eric Nyberg
Sylvia Herbert
59
14
0
14 Oct 2021
Offline Reinforcement Learning with Soft Behavior Regularization
Haoran Xu
Xianyuan Zhan
Jianxiong Li
Honglei Yin
OffRL
31
31
0
14 Oct 2021
Maximum Entropy Differential Dynamic Programming
Oswin So
Ziyi Wang
Evangelos A. Theodorou
50
14
0
13 Oct 2021
Twice regularized MDPs and the equivalence between robustness and regularization
E. Derman
Matthieu Geist
Shie Mannor
53
55
0
12 Oct 2021
Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
Laura M. Smith
J. Kew
Xue Bin Peng
Sehoon Ha
Jie Tan
Sergey Levine
49
103
0
11 Oct 2021
Safe Reinforcement Learning Using Robust Control Barrier Functions
Y. Emam
Gennaro Notomista
Paul Glotfelter
Z. Kira
M. Egerstedt
OffRL
35
39
0
11 Oct 2021