1801.01290
Cited By
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,044 papers shown
Title
FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control
Longxiang Shi
Shijian Li
LongBing Cao
Long Yang
Gang Zheng
Gang Pan
24
5
0
01 Jul 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
50
338
0
30 Jun 2019
Learning Policies through Quantile Regression
Oliver Richter
Roger Wattenhofer
21
0
0
27 Jun 2019
Uncertainty-aware Model-based Policy Optimization
Tung-Long Vuong
Kenneth Tran
6
11
0
25 Jun 2019
Policy Optimization with Stochastic Mirror Descent
Long Yang
Yu Zhang
Gang Zheng
Qian Zheng
Pengfei Li
Jianhang Huang
Jun Wen
Gang Pan
58
34
0
25 Jun 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
35
109
0
25 Jun 2019
Deep Conservative Policy Iteration
Nino Vieillard
Olivier Pietquin
Matthieu Geist
22
26
0
24 Jun 2019
Ranking Policy Gradient
Kaixiang Lin
Jiayu Zhou
OffRL
27
7
0
24 Jun 2019
Disentangled Skill Embeddings for Reinforcement Learning
Janith C. Petangoda
Sergio Pascual-Diaz
Vincent Adam
Peter Vrancx
Jordi Grau-Moya
DRL
OffRL
29
15
0
21 Jun 2019
Exploring Model-based Planning with Policy Networks
Tingwu Wang
Jimmy Ba
49
148
0
20 Jun 2019
Calibrated Model-Based Deep Reinforcement Learning
Ali Malik
Volodymyr Kuleshov
Jiaming Song
Danny Nemer
Harlan Seymour
Stefano Ermon
25
55
0
19 Jun 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
39
936
0
19 Jun 2019
Reward Prediction Error as an Exploration Objective in Deep RL
Riley Simmons-Edler
Ben Eisner
Daniel Yang
Anthony Bisulco
E. Mitchell
Sebastian Seung
Daniel D. Lee
34
5
0
19 Jun 2019
Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Yiding Jiang
S. Gu
Kevin Patrick Murphy
Chelsea Finn
OffRL
20
223
0
18 Jun 2019
Hierarchical Soft Actor-Critic: Adversarial Exploration via Mutual Information Optimization
Ari Azarafrooz
John Brock
16
3
0
17 Jun 2019
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
24
58
0
17 Jun 2019
Learning-Driven Exploration for Reinforcement Learning
Muhammad Usama
D. Chang
35
10
0
17 Jun 2019
Deep Reinforcement Learning for Industrial Insertion Tasks with Visual Inputs and Natural Rewards
Gerrit Schoettler
Ashvin Nair
Jianlan Luo
Shikhar Bahl
J. A. Ojea
Eugen Solowjow
Sergey Levine
OffRL
26
191
0
13 Jun 2019
Goal-conditioned Imitation Learning
Yiming Ding
Carlos Florensa
Mariano Phielipp
Pieter Abbeel
34
220
0
13 Jun 2019
Efficient Exploration via State Marginal Matching
Lisa Lee
Benjamin Eysenbach
Emilio Parisotto
Eric Xing
Sergey Levine
Ruslan Salakhutdinov
54
242
0
12 Jun 2019
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past
Che Wang
Keith Ross
21
45
0
10 Jun 2019
Transfer Learning by Modeling a Distribution over Policies
Disha Shrivastava
Eeshan Gunesh Dhekane
Riashat Islam
OOD
OffRL
16
0
0
09 Jun 2019
Watch, Try, Learn: Meta-Learning from Demonstrations and Reward
Allan Zhou
Eric Jang
Daniel Kappler
Alexander Herzog
Mohi Khansari
Paul Wohlhart
Yunfei Bai
Mrinal Kalakrishnan
Sergey Levine
Chelsea Finn
37
50
0
07 Jun 2019
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies
Patrick Nadeem Ward
Ariella Smofsky
A. Bose
14
58
0
06 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
47
1,038
0
03 Jun 2019
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks
Brijen Thananjeyan
Ashwin Balakrishna
Ugo Rosolia
Felix Li
R. McAllister
Joseph E. Gonzalez
Sergey Levine
Francesco Borrelli
Ken Goldberg
OffRL
22
4
0
31 May 2019
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning
Shariq Iqbal
Fei Sha
14
49
0
28 May 2019
Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy
Ruihan Yang
Qiwei Ye
Tie-Yan Liu
30
0
0
28 May 2019
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
S. Reddy
Anca Dragan
Sergey Levine
OffRL
20
52
0
27 May 2019
Interactive Differentiable Simulation
Eric Heiden
David Millard
Hejia Zhang
Gaurav Sukhatme
OOD
AI4CE
PINN
8
50
0
26 May 2019
Composing Task-Agnostic Policies with Deep Reinforcement Learning
A. H. Qureshi
Jacob J. Johnson
Yuzhe Qin
Taylor Henderson
Byron Boots
Michael C. Yip
OffRL
22
30
0
25 May 2019
Adaptive Symmetric Reward Noising for Reinforcement Learning
R. Vivanti
Talya D. Sohlberg-Baris
Shlomo Cohen
Orna Cohen
AAML
21
1
0
24 May 2019
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai
Zhuoran Yang
Jason D. Lee
Zhaoran Wang
47
30
0
24 May 2019
Distributional Policy Optimization: An Alternative Approach for Continuous Control
Chen Tessler
Guy Tennenholtz
Shie Mannor
OffRL
18
44
0
23 May 2019
Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies
Domingo Esteban
Leonel Rozo
D. Caldwell
OffRL
17
7
0
23 May 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao
Xudong Sun
Volker Tresp
34
80
0
21 May 2019
Stochastically Dominant Distributional Reinforcement Learning
John D. Martin
Michal Lyskawinski
Xiaohu Li
Brendan Englot
28
24
0
17 May 2019
A Regularized Opponent Model with Maximum Entropy Objective
Zheng Tian
Ying Wen
Zhichen Gong
Faiz Punakkath
Shihao Zou
Jun Wang
30
31
0
17 May 2019
Leveraging exploration in off-policy algorithms via normalizing flows
Bogdan Mazoure
T. Doan
A. Durand
R. Devon Hjelm
Joelle Pineau
OnRL
35
60
0
16 May 2019
Meta reinforcement learning as task inference
Jan Humplik
Alexandre Galashov
Leonard Hasenclever
Pedro A. Ortega
Yee Whye Teh
N. Heess
OffRL
58
127
0
15 May 2019
Learning Novel Policies For Tasks
Yunbo Zhang
Wenhao Yu
Greg Turk
22
33
0
13 May 2019
Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations
Aditya Modi
Debadeepta Dey
Alekh Agarwal
Adith Swaminathan
Besmira Nushi
Sean Andrist
Eric Horvitz
OffRL
LRM
24
1
0
12 May 2019
Generalized Second Order Value Iteration in Markov Decision Processes
Chandramouli Kamanchi
Raghuram Bharadwaj Diddigi
S. Bhatnagar
37
10
0
10 May 2019
Smoothing Policies and Safe Policy Gradients
Matteo Papini
Matteo Pirotta
Marcello Restelli
37
30
0
08 May 2019
Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning
Yuan Lin
J. McPhee
N. L. Azad
AI4CE
33
34
0
07 May 2019
Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
Seungyul Han
Y. Sung
OffRL
22
20
0
07 May 2019
Information asymmetry in KL-regularized RL
Alexandre Galashov
Siddhant M. Jayakumar
Leonard Hasenclever
Dhruva Tirumala
Jonathan Richard Schwarz
Guillaume Desjardins
Wojciech M. Czarnecki
Yee Whye Teh
Razvan Pascanu
N. Heess
OffRL
25
102
0
03 May 2019
Collaborative Evolutionary Reinforcement Learning
Shauharda Khadka
Somdeb Majumdar
Tarek Nassar
Zach Dwiel
E. Tumer
Santiago Miret
Yinyin Liu
Kagan Tumer
29
100
0
02 May 2019
DAC: The Double Actor-Critic Architecture for Learning Options
Shangtong Zhang
Shimon Whiteson
30
72
0
29 Apr 2019
Model-free Deep Reinforcement Learning for Urban Autonomous Driving
Jianyu Chen
Bodi Yuan
Masayoshi Tomizuka
30
263
0
20 Apr 2019