arXiv:1502.05477
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing "Trust Region Policy Optimization"
50 / 3,098 papers shown
Title
CASSL: Curriculum Accelerated Self-Supervised Learning
Adithyavairavan Murali
Lerrel Pinto
Dhiraj Gandhi
Abhinav Gupta
SSL
27
35
0
04 Aug 2017
Meta-SGD: Learning to Learn Quickly for Few-Shot Learning
Zhenguo Li
Fengwei Zhou
Fei Chen
Hang Li
4
1,112
0
31 Jul 2017
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Matej Vecerík
Todd Hester
Jonathan Scholz
Fumin Wang
Olivier Pietquin
Bilal Piot
N. Heess
Thomas Rothörl
Thomas Lampe
Martin Riedmiller
OffRL
21
658
0
27 Jul 2017
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
I. Higgins
Arka Pal
Andrei A. Rusu
Loic Matthey
Christopher P. Burgess
Alexander Pritzel
M. Botvinick
Charles Blundell
Alexander Lerchner
DRL
43
411
0
26 Jul 2017
Mutual Alignment Transfer Learning
Markus Wulfmeier
Ingmar Posner
Pieter Abbeel
16
60
0
25 Jul 2017
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
95
5,557
0
21 Jul 2017
RAIL: Risk-Averse Imitation Learning
Anirban Santara
A. Naik
Balaraman Ravindran
Dipankar Das
Dheevatsa Mudigere
Sasikanth Avancha
Bharat Kaul
30
18
0
20 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
78
18,385
0
20 Jul 2017
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
54
551
0
19 Jul 2017
Reverse Curriculum Generation for Reinforcement Learning
Carlos Florensa
David Held
Markus Wulfmeier
Michael Zhang
Pieter Abbeel
36
436
0
17 Jul 2017
Control of a Quadrotor with Reinforcement Learning
Jemin Hwangbo
Inkyu Sa
Roland Siegwart
Marco Hutter
21
477
0
17 Jul 2017
Efficient Architecture Search by Network Transformation
Han Cai
Tianyao Chen
Weinan Zhang
Yong Yu
Jun Wang
OOD
3DV
34
67
0
16 Jul 2017
ADAPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems
James Harrison
Animesh Garg
Boris Ivanovic
Yuke Zhu
Silvio Savarese
Li Fei-Fei
Marco Pavone
13
25
0
15 Jul 2017
Distral: Robust Multitask Reinforcement Learning
Yee Whye Teh
V. Bapst
Wojciech M. Czarnecki
John Quan
J. Kirkpatrick
R. Hadsell
N. Heess
Razvan Pascanu
44
544
0
13 Jul 2017
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
YuXuan Liu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
53
375
0
11 Jul 2017
A Simple Neural Attentive Meta-Learner
Nikhil Mishra
Mostafa Rohaninejad
Xi Chen
Pieter Abbeel
OOD
26
199
0
11 Jul 2017
Learning Heuristic Search via Imitation
M. Bhardwaj
Sanjiban Choudhury
Sebastian Scherer
23
80
0
10 Jul 2017
Robust Imitation of Diverse Behaviors
Ziyun Wang
J. Merel
Scott E. Reed
Greg Wayne
Nando de Freitas
N. Heess
31
195
0
10 Jul 2017
Emergence of Locomotion Behaviours in Rich Environments
N. Heess
TB Dhruva
S. Sriram
Jay Lemmon
J. Merel
...
Tom Erez
Ziyun Wang
S. M. Ali Eslami
Martin Riedmiller
David Silver
143
928
0
07 Jul 2017
Learning human behaviors from motion capture by adversarial imitation
J. Merel
Yuval Tassa
TB Dhruva
S. Srinivasan
Jay Lemmon
Ziyun Wang
Greg Wayne
N. Heess
GAN
17
201
0
07 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
19
106
0
06 Jul 2017
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Yuandong Tian
Qucheng Gong
Wenling Shang
Yuxin Wu
C. L. Zitnick
OffRL
27
126
0
04 Jul 2017
Teacher-Student Curriculum Learning
Tambet Matiisen
Avital Oliver
Taco S. Cohen
John Schulman
ODL
38
371
0
01 Jul 2017
Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management
Pei-hao Su
Paweł Budzianowski
Stefan Ultes
Milica Gasic
S. Young
OffRL
24
129
0
01 Jul 2017
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
8
887
0
30 Jun 2017
Path Integral Networks: End-to-End Differentiable Optimal Control
Masashi Okada
Luca Rigazio
T. Aoshima
PINN
21
56
0
29 Jun 2017
Energy-Based Sequence GANs for Recommendation and Their Connection to Imitation Learning
Jaeyoon Yoo
Heonseok Ha
Jihun Yi
Jeonghun Ryu
Chanju Kim
Jung-Woo Ha
Young-Han Kim
Sungroh Yoon
GAN
27
14
0
28 Jun 2017
Count-Based Exploration in Feature Space for Reinforcement Learning
Jarryd Martin
S. N. Sasikumar
Tom Everitt
Marcus Hutter
24
122
0
25 Jun 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
19
57
0
15 Jun 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
42
3,130
0
12 Jun 2017
Data-Efficient Policy Evaluation Through Behavior Policy Search
Josiah P. Hanna
Philip S. Thomas
Peter Stone
S. Niekum
OffRL
19
39
0
12 Jun 2017
Unlocking the Potential of Simulators: Design with RL in Mind
Rika Antonova
S. Cruciani
16
2
0
08 Jun 2017
Parameter Space Noise for Exploration
Matthias Plappert
Rein Houthooft
Prafulla Dhariwal
Szymon Sidor
Richard Y. Chen
Xi Chen
Tamim Asfour
Pieter Abbeel
Marcin Andrychowicz
29
593
0
06 Jun 2017
Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics
Tomoki Nishi
Prashant Doshi
Michael R. James
Danil Prokhorov
22
5
0
04 Jun 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
26
164
0
01 Jun 2017
The Atari Grand Challenge Dataset
Vitaly Kurin
Sebastian Nowozin
Katja Hofmann
Lucas Beyer
Bastian Leibe
OffRL
17
43
0
31 May 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
11
1,302
0
30 May 2017
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
Karol Hausman
Yevgen Chebotar
S. Schaal
Gaurav Sukhatme
Joseph J. Lim
GAN
30
147
0
30 May 2017
Fine-grained acceleration control for autonomous intersection management using deep reinforcement learning
H. Mirzaei
T. Givargis
14
8
0
30 May 2017
Learning End-to-end Multimodal Sensor Policies for Autonomous Navigation
Guan-Horng Liu
Avinash Siravuru
Sai P. Selvaraj
Manuela Veloso
George Kantor
13
69
0
30 May 2017
Role Playing Learning for Socially Concomitant Mobile Robot Navigation
Mingming Li
Rui Jiang
S. Ge
Tong-heng Lee
13
41
0
29 May 2017
Diagonal Rescaling For Neural Networks
Jean Lafond
Nicolas Vasilache
Léon Bottou
14
11
0
25 May 2017
Enhanced Experience Replay Generation for Efficient Reinforcement Learning
Vincent Huang
Tobias Ley
Martha Vlachou-Konchylaki
Wenfeng Hu
OnRL
GAN
SyDa
21
9
0
23 May 2017
A unified view of entropy-regularized Markov decision processes
Gergely Neu
Anders Jonsson
Vicenç Gómez
56
254
0
22 May 2017
Guide Actor-Critic for Continuous Control
Voot Tangkaratt
A. Abdolmaleki
Masashi Sugiyama
24
17
0
22 May 2017
Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning
Sahil Sharma
J. Girish Raguvir
S. Ramesh
Balaraman Ravindran
11
6
0
21 May 2017
Learning to Factor Policies and Action-Value Functions: Factored Action Space Representations for Deep Reinforcement learning
Sahil Sharma
A. Suresh
Rahul Ramesh
Balaraman Ravindran
OffRL
17
36
0
20 May 2017
Model-Based Planning with Discrete and Continuous Actions
Mikael Henaff
William F. Whitney
Yann LeCun
30
16
0
19 May 2017
Automatic Goal Generation for Reinforcement Learning Agents
Carlos Florensa
David Held
Xinyang Geng
Pieter Abbeel
78
499
0
17 May 2017
Probabilistically Safe Policy Transfer
David Held
Zoe McCarthy
Michael Zhang
Fred Shentu
Pieter Abbeel
32
19
0
15 May 2017