Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1707.06347
Cited By
v1
v2 (latest)
Proximal Policy Optimization Algorithms
20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Proximal Policy Optimization Algorithms"
17 / 8,517 papers shown
Title
Learning a Structured Neural Network Policy for a Hopping Task
Julian Viereck
Jules Kozolinsky
Alexander Herzog
Ludovic Righetti
85
12
0
29 Sep 2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Aravind Rajeswaran
Vikash Kumar
Abhishek Gupta
Giulia Vezzani
John Schulman
E. Todorov
Sergey Levine
164
1,104
0
28 Sep 2017
Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning
Pinxin Long
Tingxiang Fan
X. Liao
Wenxi Liu
Huatian Zhang
Jia Pan
OOD
95
458
0
28 Sep 2017
Neural Optimizer Search with Reinforcement Learning
Irwan Bello
Barret Zoph
Vijay Vasudevan
Quoc V. Le
ODL
90
386
0
21 Sep 2017
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Peter Henderson
Wei-Di Chang
Pierre-Luc Bacon
David Meger
Joelle Pineau
Doina Precup
GAN
77
73
0
20 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
149
1,964
0
19 Sep 2017
Learning Sampling Distributions for Robot Motion Planning
Brian Ichter
James Harrison
Marco Pavone
76
354
0
16 Sep 2017
TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
Danijar Hafner
James Davidson
Vincent Vanhoucke
OffRL
57
49
0
08 Sep 2017
Deep Learning for Video Game Playing
Niels Justesen
Philip Bontrager
Julian Togelius
S. Risi
VLM
101
208
0
25 Aug 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
143
2,830
0
19 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
127
630
0
17 Aug 2017
A Machine Learning Approach to Routing
Asaf Valadarsky
Michael Schapira
Dafna Shahaf
Aviv Tamar
71
38
0
10 Aug 2017
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
Felix Leibfried
Jordi Grau-Moya
Haitham Bou-Ammar
101
24
0
06 Aug 2017
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
229
5,619
0
21 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
89
107
0
06 Jul 2017
Teacher-Student Curriculum Learning
Tambet Matiisen
Avital Oliver
Taco S. Cohen
John Schulman
ODL
109
382
0
01 Jul 2017
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
287
6,813
0
19 Feb 2015
Previous
1
2
3
...
169
170
171