Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Title
Model Predictive Control with Self-supervised Representation Learning
Jonas A. Matthies
Muhammad Burhan Hafez
Mostafa Kotb
S. Wermter
SSL
25
0
0
14 Apr 2023
NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning
Weizheng Wang
Ruiqi Wang
Le Mao
Byung-Cheol Min
87
14
0
12 Apr 2023
Habits and goals in synergy: a variational Bayesian framework for behavior
Dongqi Han
Kenji Doya
Dongsheng Li
Jun Tani
BDL
80
215
0
11 Apr 2023
Reinforcement Learning-Based Black-Box Model Inversion Attacks
Gyojin Han
Jaehyun Choi
Haeil Lee
Junmo Kim
MIACV
65
37
0
10 Apr 2023
RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning
Kevin Zakka
Philipp Wu
Laura M. Smith
Nimrod Gileadi
Taylor A. Howell
...
Sumeet Singh
Yuval Tassa
Pete Florence
Andy Zeng
Pieter Abbeel
113
32
0
09 Apr 2023
Stochastic Nonlinear Control via Finite-dimensional Spectral Dynamic Embedding
Zhaolin Ren
Tongzheng Ren
Haitong Ma
Na Li
Bo Dai
107
10
0
08 Apr 2023
Learning Robot Manipulation from Cross-Morphology Demonstration
G. Salhotra
Isabella Liu
Gaurav Sukhatme
LM&Ro
75
9
0
07 Apr 2023
CRISP: Curriculum inducing Primitive Informed Subgoal Prediction
Utsav Singh
Vinay P. Namboodiri
88
3
0
07 Apr 2023
AutoRL Hyperparameter Landscapes
Aditya Mohan
C. Benjamins
Konrad Wienecke
A. Dockhorn
Marius Lindauer
142
8
0
05 Apr 2023
Flipbot: Learning Continuous Paper Flipping via Coarse-to-Fine Exteroceptive-Proprioceptive Exploration
Chao Zhao
Chunli Jiang
Junhao Cai
M. Y. Wang
Hongyu Yu
Qifeng Chen
75
4
0
05 Apr 2023
Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training
Haram Choi
Cheolwoong Na
Jinseop S. Kim
Jihoon Yang
ViT
58
3
0
04 Apr 2023
Quantum Imitation Learning
Zhihao Cheng
Kaining Zhang
Li Shen
Dacheng Tao
67
1
0
04 Apr 2023
Empirical Design in Reinforcement Learning
Andrew Patterson
Samuel Neumann
Martha White
Adam White
122
30
0
03 Apr 2023
Generative Adversarial Neuroevolution for Control Behaviour Imitation
Maximilien Le Clei
Pierre C. Bellec
55
0
0
03 Apr 2023
Neuroevolution of Recurrent Architectures on Control Tasks
Maximilien Le Clei
Pierre C. Bellec
40
4
0
03 Apr 2023
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
Weikang Wan
Haoran Geng
Yun-Hai Liu
Zikang Shan
Yaodong Yang
Li Yi
He Wang
172
101
0
02 Apr 2023
Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring
Runzhe Wan
Yu Liu
James McQueen
Doug Hains
Rui Song
OffRL
65
6
0
02 Apr 2023
On Context Distribution Shift in Task Representation Learning for Offline Meta RL
Chenyang Zhao
Zihao Zhou
Bing-Quan Liu
OffRL
61
4
0
01 Apr 2023
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions
Yicheng Luo
Jackie Kay
Edward Grefenstette
M. Deisenroth
OffRL, OnRL
69
16
0
30 Mar 2023
MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations
Anqi Li
Byron Boots
Ching-An Cheng
OffRL
92
16
0
30 Mar 2023
Dependent Task Offloading in Edge Computing Using GNN and Deep Reinforcement Learning
Zequn Cao
Xiaoheng Deng
36
12
0
30 Mar 2023
Importance Sampling for Stochastic Gradient Descent in Deep Neural Networks
Thibault Lahire
38
2
0
29 Mar 2023
Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations
Li Haofeng
C. Yiwen
Tan Jiayi
Marcelo H. Ang Jr
OffRL
35
2
0
29 Mar 2023
On-line reinforcement learning for optimization of real-life energy trading strategy
Lukasz Lepak
Pawel Wawrzyński
64
0
0
28 Mar 2023
BC-IRL: Learning Generalizable Reward Functions from Demonstrations
Andrew Szot
Amy Zhang
Dhruv Batra
Z. Kira
Franziska Meier
OOD, OffRL
89
9
0
28 Mar 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
101
85
0
28 Mar 2023
A Learning-based Adaptive Compliance Method for Symmetric Bi-manual Manipulation
Yu-wen Cao
Shengjie Wang
Xiang Zheng
Wen-Xuan Ma
Tao Zhang
20
0
0
27 Mar 2023
Bi-Manual Block Assembly via Sim-to-Real Reinforcement Learning
Satoshi Kataoka
Youngseog Chung
Seyed Kamyar Seyed Ghasemipour
Pannag R Sanketi
S. Gu
Igor Mordatch
86
6
0
27 Mar 2023
Balancing policy constraint and ensemble size in uncertainty-based offline reinforcement learning
Alex Beeson
Giovanni Montana
OffRL
70
13
0
26 Mar 2023
Multi-Task Reinforcement Learning in Continuous Control with Successor Feature-Based Concurrent Composition
Y. Liu
Aamir Ahmad
81
4
0
24 Mar 2023
Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
...
Vinicius G. Goecks
Nicholas R. Waytowich
David Watkins
J. Miller
Rohin Shah
59
16
0
23 Mar 2023
Boosting Reinforcement Learning and Planning with Demonstrations: A Survey
Tongzhou Mu
H. Su
OffRL
83
1
0
23 Mar 2023
A Survey of Historical Learning: Learning Models with Learning History
Xiang Li
Ge Wu
Lingfeng Yang
Wenzhe Wang
Renjie Song
Jian Yang
MU, AI4TS
103
2
0
23 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
83
0
0
22 Mar 2023
Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees
Florent Delgrange
Ann Nowé
Guillermo A. Pérez
OffRL
76
4
0
22 Mar 2023
A Hierarchical Hybrid Learning Framework for Multi-agent Trajectory Prediction
Yujun Jiao
Mingze Miao
Zhishuai Yin
Chunyuan Lei
Xu Zhu
Linzhen Nie
Bo Tao
83
5
0
22 Mar 2023
Text2Motion: From Natural Language Instructions to Feasible Plans
Kevin Lin
Christopher Agia
Toki Migimatsu
Marco Pavone
Jeannette Bohg
LM&Ro
171
284
0
21 Mar 2023
SACPlanner: Real-World Collision Avoidance with a Soft Actor Critic Local Planner and Polar State Representations
Khaled Nakhleh
Minahil Raza
Mack Tang
M. Andrews
Rinu Boney
I. Hadžić
Jeongran Lee
Atefeh Mohajeri
Karina Palyutina
68
6
0
21 Mar 2023
A Survey of Demonstration Learning
André Rosa de Sousa Porfírio Correia
Luís A. Alexandre
OffRL
70
20
0
20 Mar 2023
Deceptive Reinforcement Learning in Model-Free Domains
Alan Lewis
Tim Miller
71
5
0
20 Mar 2023
Hybrid Systems Neural Control with Region-of-Attraction Planner
Yue Meng
Chuchu Fan
86
2
0
18 Mar 2023
Towards AI-controlled FES-restoration of movements: Learning cycling stimulation pattern with reinforcement learning
Nat Wannawas
Aldo A. Faisal
40
1
0
17 Mar 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL, OnRL
74
24
0
14 Mar 2023
Reinforcement Learning-based Wavefront Sensorless Adaptive Optics Approaches for Satellite-to-Ground Laser Communication
Payam Parvizi
Runnan Zou
C. Bellinger
R. Cheriton
D. Spinello
45
2
0
13 Mar 2023
Deploying Offline Reinforcement Learning with Human Feedback
Ziniu Li
Kelvin Xu
Liu Liu
Lanqing Li
Deheng Ye
P. Zhao
OffRL
96
2
0
13 Mar 2023
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
Matthieu Geist
Shie Mannor
68
2
0
12 Mar 2023
Synthetic Experience Replay
Cong Lu
Philip J. Ball
Yee Whye Teh
Jack Parker-Holder
OffRL
176
80
0
12 Mar 2023
Continual Visual Reinforcement Learning with A Life-Long World Model
Wendong Zhang
Geng Chen
Siyu Gao
Yunbo Wang
Xiaokang Yang
CLL
97
3
0
12 Mar 2023
Understanding the Synergies between Quality-Diversity and Deep Reinforcement Learning
Bryan Lim
Manon Flageat
Antoine Cully
OnRL
81
7
0
10 Mar 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Mitsuhiko Nakamoto
Yuexiang Zhai
Anika Singh
Max Sobol Mark
Yi-An Ma
Chelsea Finn
Aviral Kumar
Sergey Levine
OffRL, OnRL
193
125
0
09 Mar 2023