1502.05477
v5 (latest)
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 2,009 papers shown
Title
Data-efficient Hindsight Off-policy Option Learning
Markus Wulfmeier
Dushyant Rao
Roland Hafner
Thomas Lampe
A. Abdolmaleki
...
Michael Neunert
Dhruva Tirumala
Noah Y. Siegel
N. Heess
Martin Riedmiller
OffRL
93
47
0
30 Jul 2020
Natural Gradient Shared Control
Yoojin Oh
Shaowen Wu
Marc Toussaint
Jim Mainprice
71
9
0
30 Jul 2020
Understanding the Stability of Deep Control Policies for Biped Locomotion
Hwangpil Park
R. Yu
Yoonsang Lee
Kyungho Lee
Jehee Lee
52
9
0
30 Jul 2020
Modular Transfer Learning with Transition Mismatch Compensation for Excessive Disturbance Rejection
Tianming Wang
Wenjie Lu
H. Yu
Dikai Liu
87
1
0
29 Jul 2020
An Iterative LQR Controller for Off-Road and On-Road Vehicles using a Neural Network Dynamics Model
Akhil Nagariya
Srikanth Saripalli
87
30
0
28 Jul 2020
Munchausen Reinforcement Learning
Nino Vieillard
Olivier Pietquin
Matthieu Geist
OffRL
69
90
0
28 Jul 2020
Data-efficient visuomotor policy training using reinforcement learning and generative models
Ali Ghadirzadeh
Petra Poklukar
Ville Kyrki
Danica Kragic
Mårten Björkman
OffRL
112
9
0
26 Jul 2020
Maximum Mutation Reinforcement Learning for Scalable Control
Karush Suri
Xiaolong Shi
Konstantinos N. Plataniotis
Y. Lawryshyn
92
4
0
24 Jul 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
124
75
0
24 Jul 2020
Bridging the Imitation Gap by Adaptive Insubordination
Luca Weihs
Unnat Jain
Iou-Jen Liu
Jordi Salvador
Svetlana Lazebnik
Aniruddha Kembhavi
Alex Schwing
91
36
0
23 Jul 2020
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Rahul Jain
94
43
0
23 Jul 2020
Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
125
7
0
22 Jul 2020
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
122
61
0
21 Jul 2020
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
295
122
0
21 Jul 2020
Lagrangian Duality in Reinforcement Learning
Pranay Pasula
OffRL
30
0
0
20 Jul 2020
Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search
Yuan Tian
Qin Wang
Zhiwu Huang
Wen Li
Dengxin Dai
Minghao Yang
Jun Wang
Olga Fink
OffRL
91
61
0
17 Jul 2020
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
Alekh Agarwal
Mikael Henaff
Sham Kakade
Wen Sun
OffRL
94
110
0
16 Jul 2020
Robustifying Reinforcement Learning Agents via Action Space Adversarial Training
Kai Liang Tan
Yasaman Esfandiari
Xian Yeow Lee
Aakanksha
Soumik Sarkar
AAML
135
57
0
14 Jul 2020
Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting
Jorge Armando Mendez Mendez
Boyu Wang
Eric Eaton
CLL
73
38
0
14 Jul 2020
Single-partition adaptive Q-learning
J. Araújo
Mário A. T. Figueiredo
M. Botto
OffRL
65
2
0
14 Jul 2020
An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization
Yimin Huang
Yujun Li
Hanrong Ye
Zhenguo Li
Zhihua Zhang
68
7
0
11 Jul 2020
A Survey on Autonomous Vehicle Control in the Era of Mixed-Autonomy: From Physics-Based to AI-Guided Driving Policy Learning
Xuan Di
Rongye Shi
138
177
0
10 Jul 2020
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee
Michael Laskin
A. Srinivas
Pieter Abbeel
OffRL
119
205
0
09 Jul 2020
Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate
Mirco Mutti
Lorenzo Pratissoli
Marcello Restelli
75
19
0
09 Jul 2020
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods
Adam Stooke
Joshua Achiam
Pieter Abbeel
115
302
0
08 Jul 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
86
140
0
04 Jul 2020
Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient
Yufei Wang
Tianwei Ni
77
21
0
03 Jul 2020
Verifiably Safe Exploration for End-to-End Reinforcement Learning
Nathan Hunt
Nathan Fulton
Sara Magliacane
Nghia Hoang
Subhro Das
Armando Solar-Lezama
OffRL
85
52
0
02 Jul 2020
Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch
Luca Viano
Yu-ting Huang
Parameswaran Kamalaruban
Adrian Weller
Volkan Cevher
130
28
0
02 Jul 2020
Continual Learning: Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes
Timothée Lesort
CLL
85
22
0
01 Jul 2020
Convex Regularization in Monte-Carlo Tree Search
Tuan Dam
Carlo D'Eramo
Jan Peters
Joni Pajarinen
OffRL
81
11
0
01 Jul 2020
Fighting Failures with FIRE: Failure Identification to Reduce Expert Burden in Intervention-Based Learning
Trevor Ablett
Filip Marić
Jonathan Kelly
OffRL
106
6
0
01 Jul 2020
Extracting Latent State Representations with Linear Dynamics from Rich Observations
Abraham Frandsen
Rong Ge
30
2
0
29 Jun 2020
Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning
Lionel Blondé
Pablo Strasser
Alexandros Kalousis
90
22
0
28 Jun 2020
A Unifying Framework for Reinforcement Learning and Planning
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
131
9
0
26 Jun 2020
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
Benjamin Eysenbach
Swapnil Asawa
Shreyas Chaudhari
Sergey Levine
Ruslan Salakhutdinov
108
94
0
24 Jun 2020
Retrospective Loss: Looking Back to Improve Training of Deep Neural Networks
Surgan Jandial
Ayush Chopra
Mausoom Sarkar
Piyush B. Gupta
Balaji Krishnamurthy
V. Balasubramanian
34
4
0
24 Jun 2020
Control-Aware Representations for Model-based Reinforcement Learning
Brandon Cui
Yinlam Chow
Mohammad Ghavamzadeh
BDL
91
13
0
24 Jun 2020
On the Global Optimality of Model-Agnostic Meta-Learning
Lingxiao Wang
Qi Cai
Zhuoran Yang
Zhaoran Wang
76
44
0
23 Jun 2020
Automatic Data Augmentation for Generalization in Deep Reinforcement Learning
Roberta Raileanu
M. Goldstein
Denis Yarats
Ilya Kostrikov
Rob Fergus
OffRL
65
110
0
23 Jun 2020
Graph Neural Networks and Reinforcement Learning for Behavior Generation in Semantic Environments
Patrick Hart
Alois Knoll
GNN
66
38
0
22 Jun 2020
dm_control: Software and Tasks for Continuous Control
Yuval Tassa
S. Tunyasuvunakool
Alistair Muldal
Yotam Doron
Piotr Trochim
...
Steven Bohez
J. Merel
Tom Erez
Timothy Lillicrap
N. Heess
LM&Ro
174
419
0
22 Jun 2020
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
Lingxiao Wang
Zhuoran Yang
Zhaoran Wang
80
27
0
21 Jun 2020
Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning
Aleksei Petrenko
Zhehui Huang
T. Kumar
Gaurav Sukhatme
V. Koltun
113
105
0
21 Jun 2020
Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies
Tsung-Yen Yang
Justinian P. Rosca
Karthik Narasimhan
Peter J. Ramadge
103
19
0
20 Jun 2020
Competitive Policy Optimization
Manish Prajapat
Kamyar Azizzadenesheli
Alexander Liniger
Yisong Yue
Anima Anandkumar
58
15
0
18 Jun 2020
Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework
Amber Srivastava
S. Salapaka
77
11
0
17 Jun 2020
Automatic Curriculum Learning through Value Disagreement
Yunzhi Zhang
Pieter Abbeel
Lerrel Pinto
85
109
0
17 Jun 2020
COLREG-Compliant Collision Avoidance for Unmanned Surface Vehicle using Deep Reinforcement Learning
Eivind Meyer
Amalie Heiberg
Adil Rasheed
Omer San
76
74
0
16 Jun 2020
Model Embedding Model-Based Reinforcement Learning
Xiao Tan
Chao Qu
Junwu Xiong
James Y. Zhang
OffRL
39
0
0
16 Jun 2020