ResearchTrend.AI
Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 8,517 papers shown
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
84
453
0
28 Feb 2018
Computational Theories of Curiosity-Driven Learning
Pierre-Yves Oudeyer
82
65
0
28 Feb 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
96
127
0
27 Feb 2018
Reinforcement and Imitation Learning for Diverse Visuomotor Skills
Yuke Zhu
Ziyun Wang
J. Merel
Andrei A. Rusu
Tom Erez
...
S. Tunyasuvunakool
János Kramár
R. Hadsell
Nando de Freitas
N. Heess
SSL
120
320
0
26 Feb 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
212
5,233
0
26 Feb 2018
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
Matthias Plappert
Marcin Andrychowicz
Alex Ray
Bob McGrew
Bowen Baker
...
Joshua Tobin
Maciek Chociej
Peter Welinder
Vikash Kumar
Wojciech Zaremba
75
573
0
26 Feb 2018
A DIRT-T Approach to Unsupervised Domain Adaptation
Rui Shu
Hung Bui
Hirokazu Narui
Stefano Ermon
80
628
0
23 Feb 2018
Verifying Controllers Against Adversarial Examples with Bayesian Optimization
Shromona Ghosh
Felix Berkenkamp
G. Ranade
S. Qadeer
Ashish Kapoor
AAML
96
45
0
23 Feb 2018
Structured Control Nets for Deep Reinforcement Learning
Mario Srouji
Jian Zhang
Ruslan Salakhutdinov
75
43
0
22 Feb 2018
Clipped Action Policy Gradient
Yasuhiro Fujita
S. Maeda
OffRL
53
37
0
21 Feb 2018
Learning to Play with Intrinsically-Motivated Self-Aware Agents
Nick Haber
Damian Mrowca
Li Fei-Fei
Daniel L. K. Yamins
LRM
96
120
0
21 Feb 2018
Fourier Policy Gradients
M. Fellows
K. Ciosek
Shimon Whiteson
53
15
0
19 Feb 2018
Learning High-level Representations from Demonstrations
Garrett Andersen
Peter Vrancx
Haitham Bou-Ammar
39
3
0
19 Feb 2018
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Cédric Colas
Olivier Sigaud
Pierre-Yves Oudeyer
130
159
0
14 Feb 2018
Evolved Policy Gradients
Rein Houthooft
Richard Y. Chen
Phillip Isola
Bradly C. Stadie
Filip Wolski
Jonathan Ho
Pieter Abbeel
105
227
0
13 Feb 2018
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
Zhang-Wei Hong
Tzu-Yun Shann
Shih-Yang Su
Yi-Hsiang Chang
Chun-Yi Lee
101
124
0
13 Feb 2018
Hierarchical Learning for Modular Robots
R. Kojcev
Nora Etxezarreta
Alejandro Hernández
Víctor Mayoral
46
4
0
12 Feb 2018
VR-Goggles for Robots: Real-to-sim Domain Adaptation for Visual Control
Jingwei Zhang
L. Tai
Peng Yun
Yufeng Xiong
Ming-Yuan Liu
Joschka Boedecker
Wolfram Burgard
79
122
0
01 Feb 2018
Learning Symmetric and Low-energy Locomotion
Wenhao Yu
Greg Turk
Chenxi Liu
122
186
0
24 Jan 2018
An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients
Jiaming Song
Yuhuai Wu
37
2
0
17 Jan 2018
Model-Based Action Exploration for Learning Dynamic Motion Skills
Glen Berseth
M. van de Panne
43
0
0
11 Jan 2018
Expected Policy Gradients for Reinforcement Learning
K. Ciosek
Shimon Whiteson
116
53
0
10 Jan 2018
Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes
Igor Adamski
R. Adamski
T. Grel
Adam Jedrych
Kamil Kaczmarek
Henryk Michalewski
OffRL
121
37
0
09 Jan 2018
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations
Xingyu Wang
Diego Klabjan
67
40
0
07 Jan 2018
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning
Charles B. Schaff
David Yunis
Ayan Chakrabarti
Matthew R. Walter
73
101
0
04 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
321
8,450
0
04 Jan 2018
SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation
Bo Dai
Albert Eaton Shaw
Lihong Li
Lin Xiao
Niao He
Zhen Liu
Jianshu Chen
Le Song
90
25
0
29 Dec 2017
Boosting the Actor with Dual Critic
Bo Dai
Albert Eaton Shaw
Niao He
Lihong Li
Le Song
70
46
0
29 Dec 2017
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang
Richard Liaw
Philipp Moritz
Robert Nishihara
Roy Fox
Ken Goldberg
Joseph E. Gonzalez
Michael I. Jordan
Ion Stoica
OffRL AI4CE
96
175
0
26 Dec 2017
Safe Policy Improvement with Baseline Bootstrapping
Romain Laroche
P. Trichelair
Rémi Tachet des Combes
OffRL
96
201
0
19 Dec 2017
Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
F. Such
Vashisht Madhavan
Edoardo Conti
Joel Lehman
Kenneth O. Stanley
Jeff Clune
126
697
0
18 Dec 2017
Ray: A Distributed Framework for Emerging AI Applications
Philipp Moritz
Robert Nishihara
Stephanie Wang
Alexey Tumanov
Richard Liaw
...
Melih Elibol
Zongheng Yang
William Paul
Michael I. Jordan
Ion Stoica
GNN
131
1,269
0
16 Dec 2017
Bayesian Policy Gradients via Alpha Divergence Dropout Inference
Peter Henderson
T. Doan
Riashat Islam
David Meger
BDL
85
13
0
06 Dec 2017
A Deeper Look at Experience Replay
Shangtong Zhang
R. Sutton
OffRL VLM
107
275
0
04 Dec 2017
Progressive Neural Architecture Search
Chenxi Liu
Barret Zoph
Maxim Neumann
Jonathon Shlens
Wei Hua
Li Li
Li Fei-Fei
Alan Yuille
Jonathan Huang
Kevin Patrick Murphy
139
1,998
0
02 Dec 2017
Time Limits in Reinforcement Learning
Fabio Pardo
Arash Tavakoli
Vitaly Levdik
Petar Kormushev
CLL
103
161
0
01 Dec 2017
Comparing Deep Reinforcement Learning and Evolutionary Methods in Continuous Control
Shangtong Zhang
Osmar R. Zaiane
66
11
0
30 Nov 2017
Learnings Options End-to-End for Continuous Action Tasks
Martin Klissarov
Pierre-Luc Bacon
J. Harb
Doina Precup
58
55
0
30 Nov 2017
Cascade Attribute Learning Network
Zhuo Xu
Haonan Chang
Masayoshi Tomizuka
45
4
0
24 Nov 2017
Action Branching Architectures for Deep Reinforcement Learning
Arash Tavakoli
Fabio Pardo
Petar Kormushev
71
265
0
24 Nov 2017
Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?
M. Raghu
A. Irpan
Jacob Andreas
Robert D. Kleinberg
Quoc V. Le
Jon M. Kleinberg
105
28
0
07 Nov 2017
Policy Optimization by Genetic Distillation
Tanmay Gangwani
Jian-wei Peng
62
18
0
03 Nov 2017
Transfer Learning to Learn with Multitask Neural Model Search
Catherine Wong
Andrea Gesmundo
44
5
0
30 Oct 2017
Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning
Sergio Valcarcel Macua
Aleksi Tukiainen
D. Hernández
David Baldazo
Enrique Munoz de Cote
S. Zazo
96
29
0
28 Oct 2017
Meta Learning Shared Hierarchies
Kevin Frans
Jonathan Ho
Xi Chen
Pieter Abbeel
John Schulman
73
355
0
26 Oct 2017
Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation
Tianhao Zhang
Zoe McCarthy
Owen Jow
Dennis Lee
Xi Chen
Ken Goldberg
Pieter Abbeel
SSL
142
662
0
12 Oct 2017
Emergent Complexity via Multi-Agent Competition
Trapit Bansal
J. Pachocki
Szymon Sidor
Ilya Sutskever
Igor Mordatch
75
391
0
10 Oct 2017
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments
Maruan Al-Shedivat
Trapit Bansal
Yuri Burda
Ilya Sutskever
Igor Mordatch
Pieter Abbeel
CLL
79
354
0
10 Oct 2017
Recurrent Deterministic Policy Gradient Method for Bipedal Locomotion on Rough Terrain Challenge
Doo Re Song
Chuanyu Yang
C. McGreavy
Zhibin Li
167
30
0
08 Oct 2017
Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning
Xiangxiang Chu
Hangjun Ye
69
56
0
01 Oct 2017