Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,128 papers shown
Learning-Driven Exploration for Reinforcement Learning
Muhammad Usama
D. Chang
67
11
0
17 Jun 2019
Deep Reinforcement Learning for Industrial Insertion Tasks with Visual Inputs and Natural Rewards
Gerrit Schoettler
Ashvin Nair
Jianlan Luo
Shikhar Bahl
J. A. Ojea
Eugen Solowjow
Sergey Levine
OffRL
59
193
0
13 Jun 2019
Goal-conditioned Imitation Learning
Yiming Ding
Carlos Florensa
Mariano Phielipp
Pieter Abbeel
97
228
0
13 Jun 2019
Efficient Exploration via State Marginal Matching
Lisa Lee
Benjamin Eysenbach
Emilio Parisotto
Eric Xing
Sergey Levine
Ruslan Salakhutdinov
147
248
0
12 Jun 2019
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past
Che Wang
George Andriopoulos
51
45
0
10 Jun 2019
Transfer Learning by Modeling a Distribution over Policies
Disha Shrivastava
Eeshan Gunesh Dhekane
Riashat Islam
OOD, OffRL
27
0
0
09 Jun 2019
Watch, Try, Learn: Meta-Learning from Demonstrations and Reward
Allan Zhou
Eric Jang
Daniel Kappler
Alexander Herzog
Mohi Khansari
Paul Wohlhart
Yunfei Bai
Mrinal Kalakrishnan
Sergey Levine
Chelsea Finn
89
49
0
07 Jun 2019
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies
Patrick Nadeem Ward
Ariella Smofsky
A. Bose
79
58
0
06 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL, OnRL
156
1,070
0
03 Jun 2019
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks
Brijen Thananjeyan
Ashwin Balakrishna
Ugo Rosolia
Felix Li
R. McAllister
Joseph E. Gonzalez
Sergey Levine
Francesco Borrelli
Ken Goldberg
OffRL
92
4
0
31 May 2019
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning
Shariq Iqbal
Fei Sha
78
49
0
28 May 2019
Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy
Ruihan Yang
Qiwei Ye
Tie-Yan Liu
38
0
0
28 May 2019
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
S. Reddy
Anca Dragan
Sergey Levine
OffRL
69
52
0
27 May 2019
Interactive Differentiable Simulation
Eric Heiden
David Millard
Hejia Zhang
Gaurav Sukhatme
OOD, AI4CE, PINN
94
50
0
26 May 2019
Composing Task-Agnostic Policies with Deep Reinforcement Learning
A. H. Qureshi
Jacob J. Johnson
Yuzhe Qin
Taylor Henderson
Byron Boots
Michael C. Yip
OffRL
76
30
0
25 May 2019
Adaptive Symmetric Reward Noising for Reinforcement Learning
R. Vivanti
Talya D. Sohlberg-Baris
Shlomo Cohen
Orna Cohen
AAML
23
1
0
24 May 2019
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai
Zhuoran Yang
Jason D. Lee
Zhaoran Wang
66
32
0
24 May 2019
Distributional Policy Optimization: An Alternative Approach for Continuous Control
Chen Tessler
Guy Tennenholtz
Shie Mannor
OffRL
51
44
0
23 May 2019
Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies
Domingo Esteban
Leonel Rozo
D. Caldwell
OffRL
45
7
0
23 May 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao
Xudong Sun
Volker Tresp
69
83
0
21 May 2019
Stochastically Dominant Distributional Reinforcement Learning
John D. Martin
Michal Lyskawinski
Xiaohu Li
Brendan Englot
66
24
0
17 May 2019
A Regularized Opponent Model with Maximum Entropy Objective
Zheng Tian
Ying Wen
Zhichen Gong
Faiz Punakkath
Shihao Zou
Jun Wang
52
31
0
17 May 2019
Leveraging exploration in off-policy algorithms via normalizing flows
Bogdan Mazoure
T. Doan
A. Durand
R. Devon Hjelm
Joelle Pineau
OnRL
70
62
0
16 May 2019
Meta reinforcement learning as task inference
Jan Humplik
Alexandre Galashov
Leonard Hasenclever
Pedro A. Ortega
Yee Whye Teh
N. Heess
OffRL
127
128
0
15 May 2019
Learning Novel Policies For Tasks
Yunbo Zhang
Wenhao Yu
Greg Turk
56
34
0
13 May 2019
Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations
Aditya Modi
Debadeepta Dey
Alekh Agarwal
Adith Swaminathan
Besmira Nushi
Sean Andrist
Eric Horvitz
OffRL, LRM
26
2
0
12 May 2019
Generalized Second Order Value Iteration in Markov Decision Processes
Chandramouli Kamanchi
Raghuram Bharadwaj Diddigi
S. Bhatnagar
66
11
0
10 May 2019
Smoothing Policies and Safe Policy Gradients
Matteo Papini
Matteo Pirotta
Marcello Restelli
80
31
0
08 May 2019
Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning
Yuan Lin
J. McPhee
N. L. Azad
AI4CE
69
34
0
07 May 2019
Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
Seungyul Han
Y. Sung
OffRL
68
20
0
07 May 2019
Information asymmetry in KL-regularized RL
Alexandre Galashov
Siddhant M. Jayakumar
Leonard Hasenclever
Dhruva Tirumala
Jonathan Richard Schwarz
Guillaume Desjardins
Wojciech M. Czarnecki
Yee Whye Teh
Razvan Pascanu
N. Heess
OffRL
74
104
0
03 May 2019
Collaborative Evolutionary Reinforcement Learning
Shauharda Khadka
Somdeb Majumdar
Tarek Nassar
Zach Dwiel
E. Tumer
Santiago Miret
Yinyin Liu
Kagan Tumer
68
100
0
02 May 2019
DAC: The Double Actor-Critic Architecture for Learning Options
Shangtong Zhang
Shimon Whiteson
149
73
0
29 Apr 2019
Model-free Deep Reinforcement Learning for Urban Autonomous Driving
Jianyu Chen
Bodi Yuan
Masayoshi Tomizuka
75
268
0
20 Apr 2019
Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning
Yuji Kanagawa
Tomoyuki Kaneko
73
14
0
17 Apr 2019
End-to-End Robotic Reinforcement Learning without Reward Engineering
Avi Singh
Larry Yang
Kristian Hartikainen
Chelsea Finn
Sergey Levine
SSL, OffRL
108
267
0
16 Apr 2019
Learning Probabilistic Multi-Modal Actor Models for Vision-Based Robotic Grasping
Mengyuan Yan
A. Li
Mrinal Kalakrishnan
P. Pastor
57
18
0
15 Apr 2019
A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms
Cédric Colas
Olivier Sigaud
Pierre-Yves Oudeyer
77
64
0
15 Apr 2019
Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL
Yannis Flet-Berliac
Philippe Preux
49
2
0
08 Apr 2019
Guided Meta-Policy Search
Russell Mendonca
Abhishek Gupta
Rosen Kralev
Pieter Abbeel
Sergey Levine
Chelsea Finn
68
57
0
01 Apr 2019
How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies?
Q. Vuong
Sharad Vikram
H. Su
Sicun Gao
Henrik I. Christensen
OOD
84
49
0
28 Mar 2019
Generalized Off-Policy Actor-Critic
Shangtong Zhang
Wendelin Bohmer
Shimon Whiteson
OffRL, CML
151
43
0
27 Mar 2019
AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search
Linnan Wang
Yiyang Zhao
Yuu Jinnai
Yuandong Tian
Rodrigo Fonseca
BDL
96
96
0
26 Mar 2019
Q-Learning for Continuous Actions with Cross-Entropy Guided Policies
Riley Simmons-Edler
Ben Eisner
E. Mitchell
Sebastian Seung
Daniel D. Lee
103
29
0
25 Mar 2019
On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning
Bharat Prakash
Mark Horton
Nicholas R. Waytowich
W. Hairston
Tim Oates
T. Mohsenin
40
19
0
25 Mar 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
Kate Rakelly
Aurick Zhou
Deirdre Quillen
Chelsea Finn
Sergey Levine
OffRL
98
665
0
19 Mar 2019
Truly Proximal Policy Optimization
Yuhui Wang
Hao He
Chao Wen
Xiaoyang Tan
80
126
0
19 Mar 2019
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL
Dhruva Tirumala
Hyeonwoo Noh
Alexandre Galashov
Leonard Hasenclever
Arun Ahuja
Greg Wayne
Razvan Pascanu
Yee Whye Teh
N. Heess
OffRL
72
44
0
18 Mar 2019
Policy Distillation and Value Matching in Multiagent Reinforcement Learning
Samir Wadhwania
Dong-Ki Kim
Shayegan Omidshafiei
Jonathan P. How
36
26
0
15 Mar 2019
Deep Reinforcement Learning with Feedback-based Exploration
Jan Scholten
Daan Wout
C. Celemin
Jens Kober
62
4
0
14 Mar 2019