1707.06347
Cited By
Proximal Policy Optimization Algorithms
20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
Papers citing
"Proximal Policy Optimization Algorithms"
50 / 6,962 papers shown
Title
Efficient Reinforcement Learning for StarCraft by Abstract Forward Models and Transfer Learning
Ruo-Ze Liu
Haifeng Guo
Xiaozhong Ji
Yang Yu
Zhen-Jia Pang
Zitai Xiao
Yuzhou Wu
Tong Lu
OffRL
19
13
0
02 Mar 2019
Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers
Baihan Lin
19
2
0
27 Feb 2019
Neural Packet Classification
Eric Liang
Hang Zhu
Xin Jin
Ion Stoica
OffRL
35
120
0
27 Feb 2019
Design of intentional backdoors in sequential models
Zhaoyuan Yang
N. Iyer
Johan Reimann
Nurali Virani
SILM
AAML
17
38
0
26 Feb 2019
Cooperative Learning of Disjoint Syntax and Semantics
Serhii Havrylov
Germán Kruszewski
Armand Joulin
18
48
0
25 Feb 2019
Investigating Generalisation in Continuous Deep Reinforcement Learning
Chenyang Zhao
Olivier Sigaud
F. Stulp
Timothy M. Hospedales
OffRL
22
48
0
19 Feb 2019
Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning
Andrew Silva
Matthew C. Gombolay
OffRL
27
20
0
15 Feb 2019
Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations
Yuhui Wang
Hao He
Xiaoyang Tan
30
9
0
15 Feb 2019
Learn a Prior for RHEA for Better Online Planning
Xinyao Tong
W. Liu
Bin Li
OffRL
43
0
0
14 Feb 2019
Non-Asymptotic Analysis of Monte Carlo Tree Search
Devavrat Shah
Qiaomin Xie
Zhi Xu
19
9
0
14 Feb 2019
Deep Reinforcement Learning from Policy-Dependent Human Feedback
Dilip Arumugam
Jun Ki Lee
S. Saskin
Michael L. Littman
28
94
0
12 Feb 2019
VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems
T. Dreossi
Daniel J. Fremont
Shromona Ghosh
Edward J. Kim
H. Ravanbakhsh
Marcell Vazquez-Chanlatte
S. Seshia
18
29
0
12 Feb 2019
Artificial Intelligence for Prosthetics - challenge solutions
L. Kidzinski
Carmichael F. Ong
Sharada Mohanty
Jennifer Hicks
Sean F. Carroll
...
E. Tumer
J. Watson
M. Salathé
Sergey Levine
Scott L. Delp
15
40
0
07 Feb 2019
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
Francisco M. Garcia
Philip S. Thomas
24
38
0
03 Feb 2019
Improving Evolutionary Strategies with Generative Neural Networks
Louis Faury
Clément Calauzènes
Olivier Fercoq
Syrine Krichene
27
12
0
31 Jan 2019
Go-Explore: a New Approach for Hard-Exploration Problems
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
AI4TS
24
362
0
30 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
26
118
0
29 Jan 2019
Lyapunov-based Safe Policy Optimization for Continuous Control
Yinlam Chow
Ofir Nachum
Aleksandra Faust
Edgar A. Duénez-Guzmán
Mohammad Ghavamzadeh
33
244
0
28 Jan 2019
Designing a Multi-Objective Reward Function for Creating Teams of Robotic Bodyguards Using Deep Reinforcement Learning
Hassam Sheikh
Ladislau Bölöni
15
3
0
28 Jan 2019
The Assistive Multi-Armed Bandit
Lawrence Chan
Dylan Hadfield-Menell
S. Srinivasa
Anca Dragan
14
36
0
24 Jan 2019
Distillation Strategies for Proximal Policy Optimization
Sam Green
C. Vineyard
Ç. Koç
27
8
0
23 Jan 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Zhijian Zhang
Haozheng Li
Lu Zhang
Tianyin Zheng
Ting Zhang
Xiong Hao
Xiaoxin Chen
Min Chen
Fangxu Xiao
Wei Zhou
14
15
0
23 Jan 2019
Trust Region Value Optimization using Kalman Filtering
Shirli Di-Castro Shashua
Shie Mannor
19
7
0
23 Jan 2019
Neuroflight: Next Generation Flight Control Firmware
W. Koch
R. Mancuso
Azer Bestavros
33
29
0
19 Jan 2019
On-Policy Trust Region Policy Optimisation with Replay Buffers
D. Kangin
N. Pugeault
OffRL
14
3
0
18 Jan 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
8
22
0
15 Jan 2019
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning
Ameer Haj-Ali
Qijing Huang
William S. Moses
J. Xiang
Ion Stoica
Krste Asanović
J. Wawrzynek
29
36
0
15 Jan 2019
Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Xiangxiang Chu
Bo Zhang
Ruijun Xu
Hailong Ma
31
98
0
04 Jan 2019
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang
23
596
0
01 Jan 2019
Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies
Alexander Sax
Bradley Emi
Amir Zamir
Leonidas J. Guibas
Silvio Savarese
Jitendra Malik
SSL
39
16
0
31 Dec 2018
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
54
433
0
26 Dec 2018
VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control
Xingxing Liang
Qi Wang
Yanghe Feng
Zhong Liu
Jincai Huang
29
5
0
24 Dec 2018
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
30
32
0
19 Dec 2018
Hierarchical Macro Strategy Model for MOBA Game AI
Bin Wu
Qiang Fu
Jing Liang
Peng-fei Qu
Xiaoqian Li
Liang Wang
Wei Liu
Wei Yang
Yongsheng Liu
31
63
0
19 Dec 2018
Learning Montezuma's Revenge from a Single Demonstration
Tim Salimans
Richard J. Chen
42
136
0
08 Dec 2018
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Tianyi Chen
Kaipeng Zhang
G. Giannakis
Tamer Basar
OffRL
29
41
0
07 Dec 2018
Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control
Zhuo Xu
Chen Tang
Masayoshi Tomizuka
OffRL
27
35
0
07 Dec 2018
Quantifying Generalization in Reinforcement Learning
K. Cobbe
Oleg Klimov
Christopher Hesse
Taehoon Kim
John Schulman
OffRL
54
659
0
06 Dec 2018
Relative Entropy Regularized Policy Iteration
A. Abdolmaleki
Jost Tobias Springenberg
Jonas Degrave
Steven Bohez
Yuval Tassa
Dan Belov
N. Heess
Martin Riedmiller
27
72
0
05 Dec 2018
Adapting Auxiliary Losses Using Gradient Similarity
Yunshu Du
Wojciech M. Czarnecki
Siddhant M. Jayakumar
Mehrdad Farajtabar
Razvan Pascanu
Balaji Lakshminarayanan
35
156
0
05 Dec 2018
Mitigating Planner Overfitting in Model-Based Reinforcement Learning
Dilip Arumugam
David Abel
Kavosh Asadi
N. Gopalan
Christopher Grimm
Jun Ki Lee
Lucas Lehnert
Michael L. Littman
11
11
0
03 Dec 2018
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
15
58
0
03 Dec 2018
Hardware Conditioned Policies for Multi-Robot Transfer Learning
Tao Chen
Adithyavairavan Murali
Abhinav Gupta
21
102
0
24 Nov 2018
Connecting the Dots Between MLE and RL for Sequence Prediction
Bowen Tan
Zhiting Hu
Zichao Yang
Ruslan Salakhutdinov
Eric Xing
28
24
0
24 Nov 2018
Guiding Policies with Language via Meta-Learning
John D. Co-Reyes
Abhishek Gupta
Suvansh Sanjeev
Nick Altieri
Jacob Andreas
John DeNero
Pieter Abbeel
Sergey Levine
LM&Ro
26
63
0
19 Nov 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
34
397
0
19 Nov 2018
Towards Governing Agent's Efficacy: Action-Conditional β-VAE for Deep Transparent Reinforcement Learning
John Yang
Gyujeong Lee
Minsung Hyun
Simyung Chang
Nojun Kwak
29
3
0
11 Nov 2018
Sample-Efficient Policy Learning based on Completely Behavior Cloning
Qiming Zou
Ling Wang
K. Lu
Yu Li
OffRL
22
0
0
09 Nov 2018
Meta-Learning for Multi-objective Reinforcement Learning
Xi Chen
Ali Ghadirzadeh
Mårten Björkman
Pablo G. Cámara
OffRL
23
54
0
08 Nov 2018
Correlation Filter Selection for Visual Tracking Using Reinforcement Learning
Yanchun Xie
Jimin Xiao
Hassan Jameel Asghar
Jeyarajan Thiyagalingam
Dali Kaafar
18
21
0
08 Nov 2018