1707.06347
v1
v2 (latest)
Proximal Policy Optimization Algorithms
20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
Papers citing "Proximal Policy Optimization Algorithms"
26 / 626 papers shown
Title
Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey
Aske Plaat
W. Kosters
Mike Preuss
BDL
OffRL
103
17
0
11 Aug 2020
Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences
Erdem Biyik
Dylan P. Losey
Malayandi Palan
Nicholas C. Landolfi
Gleb Shevchuk
Dorsa Sadigh
73
118
0
24 Jun 2020
DrNAS: Dirichlet Neural Architecture Search
Xiangning Chen
Ruochen Wang
Minhao Cheng
Xiaocheng Tang
Cho-Jui Hsieh
OOD
70
103
0
18 Jun 2020
COLREG-Compliant Collision Avoidance for Unmanned Surface Vehicle using Deep Reinforcement Learning
Eivind Meyer
Amalie Heiberg
Adil Rasheed
Omer San
73
74
0
16 Jun 2020
ShieldNN: A Provably Safe NN Filter for Unsafe NN Controllers
James Ferlez
Mahmoud M. Elnaggar
Yasser Shoukry
C. Fleming
AAML
95
33
0
16 Jun 2020
Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis
Ye Yuan
Kris Kitani
133
77
0
12 Jun 2020
Learning the Travelling Salesperson Problem Requires Rethinking Generalization
Chaitanya K. Joshi
Quentin Cappart
Louis-Martin Rousseau
T. Laurent
204
120
0
12 Jun 2020
AMEIR: Automatic Behavior Modeling, Interaction Exploration and MLP Investigation in the Recommender System
Pengyu Zhao
Kecheng Xiao
Yuanxing Zhang
Kaigui Bian
Wei Yan
100
16
0
10 Jun 2020
Neuroevolution of Self-Interpretable Agents
Yujin Tang
Duong Nguyen
David R Ha
114
113
0
18 Mar 2020
The TrojAI Software Framework: An OpenSource tool for Embedding Trojans into Deep Learning Models
Kiran Karra
C. Ashcraft
Neil Fendley
66
35
0
13 Mar 2020
Dynamic Experience Replay
Jieliang Luo
Hui Li
209
24
0
04 Mar 2020
Learning to reinforcement learn for Neural Architecture Search
J. Gomez
Joaquin Vanschoren
60
8
0
09 Nov 2019
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Martin Weiss
Simon Chamorro
Roger Girgis
Margaux Luck
Samira Ebrahimi Kahou
Joseph Paul Cohen
Derek Nowrouzezahrai
Doina Precup
Florian Golemo
C. Pal
123
11
0
29 Oct 2019
Learning Hierarchical Control for Robust In-Hand Manipulation
Tingguang Li
K. Srinivasan
Max Meng
Wenzhen Yuan
Jeannette Bohg
81
41
0
24 Oct 2019
Collision Avoidance in Pedestrian-Rich Environments with Deep Reinforcement Learning
Michael Everett
Yu Fan Chen
Jonathan P. How
OffRL
107
174
0
24 Oct 2019
Distributed Distributional Deterministic Policy Gradients
Gabriel Barth-Maron
Matthew W. Hoffman
David Budden
Will Dabney
Dan Horgan
TB Dhruva
Alistair Muldal
N. Heess
Timothy Lillicrap
OffRL
98
480
0
23 Apr 2018
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
Zhang-Wei Hong
Tzu-Yun Shann
Shih-Yang Su
Yi-Hsiang Chang
Chun-Yi Lee
84
124
0
13 Feb 2018
Emergence of Locomotion Behaviours in Rich Environments
N. Heess
TB Dhruva
S. Sriram
Jay Lemmon
J. Merel
...
Tom Erez
Ziyun Wang
S. M. Ali Eslami
Martin Riedmiller
David Silver
210
938
0
07 Jul 2017
Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
109
762
0
03 Nov 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
225
5,087
0
05 Jun 2016
Benchmarking Deep Reinforcement Learning for Continuous Control
Yan Duan
Xi Chen
Rein Houthooft
John Schulman
Pieter Abbeel
OffRL
98
1,695
0
22 Apr 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
210
8,881
0
04 Feb 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
135
3,439
0
08 Jun 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
281
6,801
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,364
0
22 Dec 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
120
3,021
0
19 Jul 2012