ResearchTrend.AI
Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

44 / 8,594 papers shown
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations
Xingyu Wang
Diego Klabjan
67
40
0
07 Jan 2018
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning
Charles B. Schaff
David Yunis
Ayan Chakrabarti
Matthew R. Walter
73
101
0
04 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
321
8,455
0
04 Jan 2018
SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation
Bo Dai
Albert Eaton Shaw
Lihong Li
Lin Xiao
Niao He
Zhen Liu
Jianshu Chen
Le Song
90
25
0
29 Dec 2017
Boosting the Actor with Dual Critic
Bo Dai
Albert Eaton Shaw
Niao He
Lihong Li
Le Song
70
46
0
29 Dec 2017
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang
Richard Liaw
Philipp Moritz
Robert Nishihara
Roy Fox
Ken Goldberg
Joseph E. Gonzalez
Michael I. Jordan
Ion Stoica
OffRL
AI4CE
96
175
0
26 Dec 2017
Safe Policy Improvement with Baseline Bootstrapping
Romain Laroche
P. Trichelair
Rémi Tachet des Combes
OffRL
96
201
0
19 Dec 2017
Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
F. Such
Vashisht Madhavan
Edoardo Conti
Joel Lehman
Kenneth O. Stanley
Jeff Clune
132
697
0
18 Dec 2017
Ray: A Distributed Framework for Emerging AI Applications
Philipp Moritz
Robert Nishihara
Stephanie Wang
Alexey Tumanov
Richard Liaw
...
Melih Elibol
Zongheng Yang
William Paul
Michael I. Jordan
Ion Stoica
GNN
157
1,269
0
16 Dec 2017
Bayesian Policy Gradients via Alpha Divergence Dropout Inference
Peter Henderson
T. Doan
Riashat Islam
David Meger
BDL
85
13
0
06 Dec 2017
A Deeper Look at Experience Replay
Shangtong Zhang
R. Sutton
OffRL
VLM
116
276
0
04 Dec 2017
Progressive Neural Architecture Search
Chenxi Liu
Barret Zoph
Maxim Neumann
Jonathon Shlens
Wei Hua
Li Li
Li Fei-Fei
Alan Yuille
Jonathan Huang
Kevin Patrick Murphy
150
2,000
0
02 Dec 2017
Time Limits in Reinforcement Learning
Fabio Pardo
Arash Tavakoli
Vitaly Levdik
Petar Kormushev
CLL
103
161
0
01 Dec 2017
Comparing Deep Reinforcement Learning and Evolutionary Methods in Continuous Control
Shangtong Zhang
Osmar R. Zaiane
66
11
0
30 Nov 2017
Learnings Options End-to-End for Continuous Action Tasks
Martin Klissarov
Pierre-Luc Bacon
J. Harb
Doina Precup
58
55
0
30 Nov 2017
Cascade Attribute Learning Network
Zhuo Xu
Haonan Chang
Masayoshi Tomizuka
45
4
0
24 Nov 2017
Action Branching Architectures for Deep Reinforcement Learning
Arash Tavakoli
Fabio Pardo
Petar Kormushev
71
265
0
24 Nov 2017
Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?
M. Raghu
A. Irpan
Jacob Andreas
Robert D. Kleinberg
Quoc V. Le
Jon M. Kleinberg
105
28
0
07 Nov 2017
Policy Optimization by Genetic Distillation
Tanmay Gangwani
Jian-wei Peng
62
18
0
03 Nov 2017
Transfer Learning to Learn with Multitask Neural Model Search
Catherine Wong
Andrea Gesmundo
44
5
0
30 Oct 2017
Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning
Sergio Valcarcel Macua
Aleksi Tukiainen
D. Hernández
David Baldazo
Enrique Munoz de Cote
S. Zazo
100
29
0
28 Oct 2017
Meta Learning Shared Hierarchies
Kevin Frans
Jonathan Ho
Xi Chen
Pieter Abbeel
John Schulman
75
355
0
26 Oct 2017
Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation
Tianhao Zhang
Zoe McCarthy
Owen Jow
Dennis Lee
Xi Chen
Ken Goldberg
Pieter Abbeel
SSL
142
662
0
12 Oct 2017
Emergent Complexity via Multi-Agent Competition
Trapit Bansal
J. Pachocki
Szymon Sidor
Ilya Sutskever
Igor Mordatch
75
391
0
10 Oct 2017
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments
Maruan Al-Shedivat
Trapit Bansal
Yuri Burda
Ilya Sutskever
Igor Mordatch
Pieter Abbeel
CLL
79
354
0
10 Oct 2017
Recurrent Deterministic Policy Gradient Method for Bipedal Locomotion on Rough Terrain Challenge
Doo Re Song
Chuanyu Yang
C. McGreavy
Zhibin Li
167
30
0
08 Oct 2017
Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning
Xiangxiang Chu
Hangjun Ye
69
56
0
01 Oct 2017
Learning a Structured Neural Network Policy for a Hopping Task
Julian Viereck
Jules Kozolinsky
Alexander Herzog
Ludovic Righetti
85
12
0
29 Sep 2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Aravind Rajeswaran
Vikash Kumar
Abhishek Gupta
Giulia Vezzani
John Schulman
E. Todorov
Sergey Levine
164
1,104
0
28 Sep 2017
Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning
Pinxin Long
Tingxiang Fan
X. Liao
Wenxi Liu
Huatian Zhang
Jia Pan
OOD
95
458
0
28 Sep 2017
Neural Optimizer Search with Reinforcement Learning
Irwan Bello
Barret Zoph
Vijay Vasudevan
Quoc V. Le
ODL
90
386
0
21 Sep 2017
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Peter Henderson
Wei-Di Chang
Pierre-Luc Bacon
David Meger
Joelle Pineau
Doina Precup
GAN
77
73
0
20 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
149
1,968
0
19 Sep 2017
Learning Sampling Distributions for Robot Motion Planning
Brian Ichter
James Harrison
Marco Pavone
76
354
0
16 Sep 2017
TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
Danijar Hafner
James Davidson
Vincent Vanhoucke
OffRL
57
49
0
08 Sep 2017
Deep Learning for Video Game Playing
Niels Justesen
Philip Bontrager
Julian Togelius
S. Risi
VLM
101
208
0
25 Aug 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
143
2,830
0
19 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
127
630
0
17 Aug 2017
A Machine Learning Approach to Routing
Asaf Valadarsky
Michael Schapira
Dafna Shahaf
Aviv Tamar
71
38
0
10 Aug 2017
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
Felix Leibfried
Jordi Grau-Moya
Haitham Bou-Ammar
101
24
0
06 Aug 2017
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
232
5,621
0
21 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
89
107
0
06 Jul 2017
Teacher-Student Curriculum Learning
Tambet Matiisen
Avital Oliver
Taco S. Cohen
John Schulman
ODL
109
382
0
01 Jul 2017
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
287
6,820
0
19 Feb 2015