Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 8,597 papers shown
Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors
Artem Molchanov
Tao Chen
Wolfgang Hönig
James A. Preiss
Nora Ayanian
Gaurav Sukhatme
159
111
0
11 Mar 2019
Learning to Paint With Model-based Deep Reinforcement Learning
Zhewei Huang
Wen Heng
Shuchang Zhou
GAN
117
156
0
11 Mar 2019
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Denis Steckelmacher
Hélène Plisnier
D. Roijers
A. Nowé
OffRL
67
17
0
11 Mar 2019
Orthogonal Estimation of Wasserstein Distances
Mark Rowland
Jiri Hron
Yunhao Tang
K. Choromanski
Tamás Sarlós
Adrian Weller
91
43
0
09 Mar 2019
Adaptive Power System Emergency Control using Deep Reinforcement Learning
Qiuhua Huang
Renke Huang
Weituo Hao
Jie Tan
Rui Fan
Zhenyu Huang
111
279
0
09 Mar 2019
Pixel-Attentive Policy Gradient for Multi-Fingered Grasping in Cluttered Scenes
Bohan Wu
Iretiayo Akinola
Peter K. Allen
54
34
0
08 Mar 2019
Distributed Policy Learning Based Random Access for Diversified QoS Requirements
Zhiyuan Jiang
Sheng Zhou
Z. Niu
25
13
0
06 Mar 2019
Training in Task Space to Speed Up and Guide Reinforcement Learning
Guillaume Bellegarda
Katie Byl
51
19
0
06 Mar 2019
Using Natural Language for Reward Shaping in Reinforcement Learning
Prasoon Goyal
S. Niekum
Raymond J. Mooney
LM&Ro
106
183
0
05 Mar 2019
Learning Exploration Policies for Navigation
Tao Chen
Saurabh Gupta
Abhinav Gupta
EgoV
83
239
0
05 Mar 2019
Deep Active Localization
S. Gottipati
K. Seo
Dhaivat Bhatt
Vincent Mai
Krishna Murthy Jatavallabhula
Liam Paull
89
38
0
05 Mar 2019
Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future
Nan Rosemary Ke
Amanpreet Singh
Ahmed Touati
Anirudh Goyal
Yoshua Bengio
Devi Parikh
Dhruv Batra
78
48
0
05 Mar 2019
Episodic Learning with Control Lyapunov Functions for Uncertain Robotic Systems
Andrew J. Taylor
Victor D. Dorobantu
Hoang Minh Le
Yisong Yue
Aaron D. Ames
190
79
0
04 Mar 2019
Model Primitive Hierarchical Lifelong Reinforcement Learning
Bohan Wu
Jayesh K. Gupta
Mykel J. Kochenderfer
OffRL
45
10
0
04 Mar 2019
Sim-to-Real Transfer for Biped Locomotion
Wenhao Yu
Visak C. V. Kumar
Greg Turk
Chenxi Liu
60
115
0
04 Mar 2019
Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space
Zhou Fan
Ruilong Su
Weinan Zhang
Yong Yu
112
133
0
04 Mar 2019
NoRML: No-Reward Meta Learning
Yuxiang Yang
Ken Caluwaerts
Atil Iscen
Jie Tan
Chelsea Finn
77
27
0
04 Mar 2019
Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments
Zhizheng Zhang
Jiale Chen
Zhibo Chen
Weiping Li
OffRL
93
61
0
03 Mar 2019
A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning
Xiang Li
Wenhao Yang
Zhihua Zhang
29
2
0
02 Mar 2019
Efficient Reinforcement Learning for StarCraft by Abstract Forward Models and Transfer Learning
Ruo-Ze Liu
Haifeng Guo
Xiaozhong Ji
Yang Yu
Zhen-Jia Pang
Zitai Xiao
Yuzhou Wu
Tong Lu
OffRL
97
13
0
02 Mar 2019
Model-Based Reinforcement Learning for Atari
Lukasz Kaiser
Mohammad Babaeizadeh
Piotr Milos
B. Osinski
R. Campbell
...
Sergey Levine
Afroz Mohiuddin
Ryan Sepassi
George Tucker
Henryk Michalewski
OffRL
176
870
0
01 Mar 2019
Catalyst.RL: A Distributed Framework for Reproducible RL Research
Sergey Kolesnikov
Oleksii Hrinchuk
OffRL
42
8
0
28 Feb 2019
Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers
Baihan Lin
59
2
0
27 Feb 2019
Neural Packet Classification
Eric Liang
Hang Zhu
Xin Jin
Ion Stoica
OffRL
78
122
0
27 Feb 2019
S-TRIGGER: Continual State Representation Learning via Self-Triggered Generative Replay
Hugo Caselles-Dupré
Michael Garcia Ortiz
David Filliat
62
16
0
25 Feb 2019
Cooperative Learning of Disjoint Syntax and Semantics
Serhii Havrylov
Germán Kruszewski
Armand Joulin
75
48
0
25 Feb 2019
Investigating Generalisation in Continuous Deep Reinforcement Learning
Chenyang Zhao
Olivier Sigaud
F. Stulp
Timothy M. Hospedales
OffRL
89
48
0
19 Feb 2019
Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning
Andrew Silva
Matthew C. Gombolay
OffRL
74
20
0
15 Feb 2019
Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations
Yuhui Wang
Hao He
Xiaoyang Tan
50
10
0
15 Feb 2019
Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
Deepak Pathak
Chris Xiaoxuan Lu
Trevor Darrell
Phillip Isola
Alexei A. Efros
53
135
0
14 Feb 2019
Learn a Prior for RHEA for Better Online Planning
Xinyao Tong
W. Liu
Bin Li
OffRL
107
0
0
14 Feb 2019
Non-Asymptotic Analysis of Monte Carlo Tree Search
Devavrat Shah
Qiaomin Xie
Zhi Xu
34
9
0
14 Feb 2019
Deep Reinforcement Learning from Policy-Dependent Human Feedback
Dilip Arumugam
Jun Ki Lee
S. Saskin
Michael L. Littman
76
100
0
12 Feb 2019
Meta-Curvature
Eunbyung Park
Junier B. Oliva
BDL
78
124
0
09 Feb 2019
Compatible Natural Gradient Policy Search
Joni Pajarinen
Hong Linh Thai
R. Akrour
Jan Peters
Gerhard Neumann
69
22
0
07 Feb 2019
Artificial Intelligence for Prosthetics - challenge solutions
L. Kidzinski
Carmichael F. Ong
Sharada Mohanty
Jennifer Hicks
Sean F. Carroll
...
E. Tumer
J. Watson
M. Salathé
Sergey Levine
Scott L. Delp
55
42
0
07 Feb 2019
Separating value functions across time-scales
Joshua Romoff
Peter Henderson
Ahmed Touati
Emma Brunskill
Joelle Pineau
Yann Ollivier
78
25
0
05 Feb 2019
Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning
Arthur Juliani
Ahmed Khalifa
Vincent-Pierre Berges
Jonathan Harper
Ervin Teng
Hunter Henry
A. Crespi
Julian Togelius
Danny Lange
75
144
0
04 Feb 2019
The Natural Language of Actions
Guy Tennenholtz
Shie Mannor
71
60
0
04 Feb 2019
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
Francisco M. Garcia
Philip S. Thomas
102
41
0
03 Feb 2019
Visual Rationalizations in Deep Reinforcement Learning for Atari Games
L. Weitkamp
Elise van der Pol
Zeynep Akata
82
27
0
01 Feb 2019
Policy Consolidation for Continual Reinforcement Learning
Christos Kaplanis
Murray Shanahan
Claudia Clopath
CLL, OffRL
78
51
0
01 Feb 2019
Learning Action Representations for Reinforcement Learning
Yash Chandak
Georgios Theocharous
James E. Kostas
Scott M. Jordan
Philip S. Thomas
73
164
0
01 Feb 2019
Improving Evolutionary Strategies with Generative Neural Networks
Louis Faury
Clément Calauzènes
Olivier Fercoq
Syrine Krichene
65
13
0
31 Jan 2019
Go-Explore: a New Approach for Hard-Exploration Problems
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
AI4TS
110
370
0
30 Jan 2019
InfoBot: Transfer and Exploration via the Information Bottleneck
Anirudh Goyal
Riashat Islam
Daniel Strouse
Zafarali Ahmed
M. Botvinick
Hugo Larochelle
Yoshua Bengio
Sergey Levine
OffRL
125
167
0
30 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
103
124
0
29 Jan 2019
Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks
Dongqi Han
Kenji Doya
Jun Tani
AI4CE
102
20
0
29 Jan 2019
Lyapunov-based Safe Policy Optimization for Continuous Control
Yinlam Chow
Ofir Nachum
Aleksandra Faust
Edgar A. Duéñez-Guzmán
Mohammad Ghavamzadeh
99
247
0
28 Jan 2019
Making Deep Q-learning methods robust to time discretization
Corentin Tallec
Léonard Blier
Yann Ollivier
OOD, OffRL
67
91
0
28 Jan 2019