Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Title
A Strong Baseline for Batch Imitation Learning
Matthew Smith
Lucas Maystre
Zhenwen Dai
K. Ciosek
OffRL
85
5
0
06 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark Schmidt
Nicolas Le Roux
107
6
0
06 Feb 2023
Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment
Qitong Gao
Stephen L. Schmidt
Afsana Chowdhury
Guangyu Feng
Jennifer J. Peters
Katherine Genty
W. Grill
Dennis A. Turner
Miroslav Pajic
OffRL
81
11
0
05 Feb 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
115
5
0
05 Feb 2023
Developing Driving Strategies Efficiently: A Skill-Based Hierarchical Reinforcement Learning Approach
Yigit Gurses
Kaan Buyukdemirci
Y. Yildiz
73
5
0
04 Feb 2023
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Pouya Hamadanian
Arash Nasr-Esfahany
Malte Schwarzkopf
Siddartha Sen
MohammadIman Alizadeh
CLL, OffRL
164
0
0
04 Feb 2023
Better Training of GFlowNets with Local Credit and Incomplete Trajectories
L. Pan
Nikolay Malkin
Dinghuai Zhang
Yoshua Bengio
108
73
0
03 Feb 2023
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Qing-Shan Jia
Ya Zhang
OffRL
83
20
0
03 Feb 2023
Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased
Chao Yu
Jiaxuan Gao
Weiling Liu
Bo Xu
Hao Tang
Jiaqi Yang
Yu Wang
Yi Wu
109
42
0
03 Feb 2023
MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks
Raffaele Galliera
A. Morelli
Roberto Fronteddu
N. Suri
54
4
0
02 Feb 2023
Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function
Ruijie Zheng
Xiyao Wang
Huazhe Xu
Furong Huang
113
15
0
02 Feb 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization
D. Grytskyy
Jorge Ramírez-Ruiz
R. Moreno-Bote
88
3
0
02 Feb 2023
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
Haichao Zhang
Weiwen Xu
Haonan Yu
CLL, OffRL, OnRL
134
69
0
02 Feb 2023
Distillation Policy Optimization
Jianfei Ma
OffRL
124
1
0
01 Feb 2023
Learning Cut Selection for Mixed-Integer Linear Programming via Hierarchical Sequence Model
Zhihai Wang
Xijun Li
Jie Wang
Yufei Kuang
Mingxuan Yuan
Jianguo Zeng
Yongdong Zhang
Feng Wu
86
42
0
01 Feb 2023
Bridging Physics-Informed Neural Networks with Reinforcement Learning: Hamilton-Jacobi-Bellman Proximal Policy Optimization (HJBPPO)
Amartya Mukherjee
Jun Liu
82
11
0
01 Feb 2023
QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing
Grace Zhang
Ayush Jain
Injune Hwang
Shao-Hua Sun
Joseph J. Lim
75
4
0
01 Feb 2023
Learning Universal Policies via Text-Guided Video Generation
Yilun Du
Mengjiao Yang
Bo Dai
H. Dai
Ofir Nachum
J. Tenenbaum
Dale Schuurmans
Pieter Abbeel
PINN, LM&Ro
161
264
0
31 Jan 2023
Anti-Exploration by Random Network Distillation
Alexander Nikulin
Vladislav Kurenkov
Denis Tarasov
Sergey Kolesnikov
87
31
0
31 Jan 2023
Policy Gradient for Rectangular Robust Markov Decision Processes
Navdeep Kumar
E. Derman
Matthieu Geist
Kfir Y. Levy
Shie Mannor
89
23
0
31 Jan 2023
Learning Vision-based Robotic Manipulation Tasks Sequentially in Offline Reinforcement Learning Settings
Sudhir Pratap Yadav
R. Nagar
S. Shah
OffRL
72
3
0
31 Jan 2023
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs
Guanhui Liu
En-Pei Hu
Pu-Jen Cheng
Hung-yi Lee
Shao-Hua Sun
148
18
0
30 Jan 2023
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining
Deyao Zhu
Yuhui Wang
Jürgen Schmidhuber
Mohamed Elhoseiny
OffRL, OnRL
83
8
0
30 Jan 2023
Planning Multiple Epidemic Interventions with Reinforcement Learning
Anh Mai
Nikunj Gupta
A. Abouzeid
Dennis Shasha
89
4
0
30 Jan 2023
PAC-Bayesian Soft Actor-Critic Learning
Bahareh Tasdighi
Abdullah Akgul
Manuel Haussmann
Kenny Kazimirzak Brink
M. Kandemir
123
4
0
30 Jan 2023
SaFormer: A Conditional Sequence Modeling Approach to Offline Safe Reinforcement Learning
Qin Zhang
Linrui Zhang
Haoran Xu
Li Shen
Bowen Wang
Yongzhe Chang
Xueqian Wang
Bo Yuan
Dacheng Tao
OffRL
72
19
0
28 Jan 2023
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
Jing Zhang
Chi Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
91
10
0
28 Jan 2023
Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning
Alec J. Linot
Kevin Zeng
M. Graham
AI4CE
59
19
0
28 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M Sadler
Alec Koppel
Dinesh Manocha
100
16
0
28 Jan 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
74
8
0
28 Jan 2023
A Memory Efficient Deep Reinforcement Learning Approach For Snake Game Autonomous Agents
Md. Rafat Rahman Tushar
Shahnewaz Siddique
33
6
0
27 Jan 2023
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
Daesol Cho
Seungjae Lee
H. J. Kim
142
15
0
27 Jan 2023
Improving Behavioural Cloning with Positive Unlabeled Learning
Qiang-qiang Wang
Robert McCarthy
David Córdova Bulens
Kevin McGuinness
Noel E. O'Connor
Nico Gürtler
Felix Widmaier
Francisco Roldan Sanchez
S. Redmond
OffRL, OnRL
97
8
0
27 Jan 2023
Theoretical Analysis of Offline Imitation With Supplementary Dataset
Ziniu Li
Tian Xu
Y. Yu
Zhixun Luo
OffRL
64
2
0
27 Jan 2023
Generalized Munchausen Reinforcement Learning using Tsallis KL Divergence
Lingwei Zhu
Zheng Chen
Takamitsu Matsubara
Martha White
107
1
0
27 Jan 2023
Learning to Generate All Feasible Actions
Mirco Theile
Daniele Bernardini
Raphael Trumpp
C. Piazza
Marco Caccamo
Alberto L. Sangiovanni-Vincentelli
66
2
0
26 Jan 2023
Model-based Offline Reinforcement Learning with Local Misspecification
Kefan Dong
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
77
4
0
26 Jan 2023
Which Experiences Are Influential for Your Agent? Policy Iteration with Turn-over Dropout
Takuya Hiraoka
Takashi Onishi
Yoshimasa Tsuruoka
OffRL
69
0
0
26 Jan 2023
Trust Region-Based Safe Distributional Reinforcement Learning for Multiple Constraints
Dohyeong Kim
Kyungjae Lee
Songhwai Oh
70
10
0
26 Jan 2023
SMART: Self-supervised Multi-task pretrAining with contRol Transformers
Yanchao Sun
Shuang Ma
Ratnesh Madaan
Rogerio Bonatti
Furong Huang
Ashish Kapoor
102
42
0
24 Jan 2023
Quasi-optimal Reinforcement Learning with Continuous Actions
Yuhan Li
Wenzhuo Zhou
Ruoqing Zhu
OffRL
83
5
0
21 Jan 2023
Reinforcement learning-based estimation for partial differential equations
S. Mowlavi
M. Benosman
53
4
0
20 Jan 2023
AccDecoder: Accelerated Decoding for Neural-enhanced Video Analytics
Tingting Yuan
Liang Mi
Weijun Wang
Haipeng Dai
Xiaoming Fu
73
16
0
20 Jan 2023
Generative Slate Recommendation with Reinforcement Learning
Romain Deffayet
Thibaut Thonet
Jean-Michel Renders
Maarten de Rijke
88
25
0
20 Jan 2023
Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning
Zifan Wu
Chao Yu
Chong Chen
Jianye Hao
H. Zhuo
73
19
0
20 Jan 2023
Multi-Agent Interplay in a Competitive Survival Environment
Andrea Fanti
74
0
0
19 Jan 2023
Automated deep reinforcement learning for real-time scheduling strategy of multi-energy system integrated with post-carbon and direct-air carbon captured system
Tobi Michael Alabi
Nathan P. Lawrence
Lin Lu
Zaiyue Yang
R. Bhushan Gopaluni
24
29
0
18 Jan 2023
A reinforcement learning path planning approach for range-only underwater target localization with autonomous vehicles
Ivan Masmitja
Mario Martin
K. Katija
S. Gomáriz
J. Navarro
57
6
0
17 Jan 2023
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
77
19
0
16 Jan 2023
Opponent-aware Role-based Learning in Team Competitive Markov Games
Paramita Koley
Aurghya Maiti
Niloy Ganguly
Sourangshu Bhattacharya
77
1
0
14 Jan 2023