Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 8,596 papers shown
The Assistive Multi-Armed Bandit
Lawrence Chan
Dylan Hadfield-Menell
S. Srinivasa
Anca Dragan
61
36
0
24 Jan 2019
Learning agile and dynamic motor skills for legged robots
Jemin Hwangbo
Joonho Lee
Alexey Dosovitskiy
Dario Bellicoso
Vassilios Tsounis
V. Koltun
Marco Hutter
162
1,321
0
24 Jan 2019
Ablation Studies in Artificial Neural Networks
Richard Meyes
Melanie Lu
Constantin Waubert de Puiseau
Tobias Meisen
69
216
0
24 Jan 2019
Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems
Osbert Bastani
61
4
0
24 Jan 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Zhijian Zhang
Haozheng Li
Lu Zhang
Tianyin Zheng
Ting Zhang
Xiong Hao
Xiaoxin Chen
Min Chen
Fangxu Xiao
Wei Zhou
41
15
0
23 Jan 2019
Trust Region Value Optimization using Kalman Filtering
Shirli Di-Castro Shashua
Shie Mannor
56
8
0
23 Jan 2019
Neuroflight: Next Generation Flight Control Firmware
W. Koch
R. Mancuso
Azer Bestavros
65
30
0
19 Jan 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
61
22
0
15 Jan 2019
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning
Ameer Haj-Ali
Qijing Huang
William S. Moses
J. Xiang
Ion Stoica
Krste Asanović
J. Wawrzynek
42
36
0
15 Jan 2019
Learning Accurate Extended-Horizon Predictions of High Dimensional Trajectories
B. Gaudet
R. Linares
R. Furfaro
21
1
0
12 Jan 2019
Motion Perception in Reinforcement Learning with Dynamic Objects
Artemij Amiranashvili
Alexey Dosovitskiy
V. Koltun
Thomas Brox
66
35
0
10 Jan 2019
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions
Rui Wang
Joel Lehman
Jeff Clune
Kenneth O. Stanley
121
250
0
07 Jan 2019
Recurrent Control Nets for Deep Reinforcement Learning
Vincent Liu
Ademi Adeniji
Nathaniel Lee
Jason Zhao
Mario Srouji
18
3
0
06 Jan 2019
Exploring applications of deep reinforcement learning for real-world autonomous driving systems
V. Talpaert
Ibrahim Sobh
Ravi Kiran
Patrick Mannion
S. Yogamani
Ahmad El-Sallab
P. Pérez
70
74
0
06 Jan 2019
On the Utility of Model Learning in HRI
Gokul Swamy
Jens Schulz
Rohan Choudhury
Dylan Hadfield-Menell
Anca Dragan
76
54
0
04 Jan 2019
Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Xiangxiang Chu
Bo Zhang
Ruijun Xu
Hailong Ma
99
98
0
04 Jan 2019
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang
193
611
0
01 Jan 2019
An Active Learning Framework for Efficient Robust Policy Search
Sai Kiran Narayanaswami
N. Sudarsanam
Balaraman Ravindran
36
0
0
01 Jan 2019
Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies
Alexander Sax
Bradley Emi
Amir Zamir
Leonidas Guibas
Silvio Savarese
Jitendra Malik
SSL
77
16
0
31 Dec 2018
Learning to Design RNA
Frederic Runge
Daniel Stoll
Stefan Falkner
Frank Hutter
89
72
0
31 Dec 2018
Learn to Interpret Atari Agents
Zhao Yang
S. Bai
Li Zhang
Philip Torr
80
29
0
29 Dec 2018
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
137
442
0
26 Dec 2018
VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control
Xingxing Liang
Qi Wang
Yanghe Feng
Zhong Liu
Jincai Huang
65
5
0
24 Dec 2018
SNAS: Stochastic Neural Architecture Search
Sirui Xie
Hehui Zheng
Chunxiao Liu
Liang Lin
97
940
0
24 Dec 2018
Introducing Neuromodulation in Deep Neural Networks to Learn Adaptive Behaviours
Nicolas Vecoven
D. Ernst
Antoine Wehenkel
G. Drion
AI4CE
64
43
0
21 Dec 2018
NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning
Sirui Xie
Junning Huang
Lanxin Lei
Chunxiao Liu
Zheng Ma
Wayne Zhang
Liang Lin
54
8
0
21 Dec 2018
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
61
31
0
19 Dec 2018
Hierarchical Macro Strategy Model for MOBA Game AI
Bin Wu
Qiang Fu
Jing Liang
Peng-fei Qu
Xiaoqian Li
Liang Wang
Wei Liu
Wei Yang
Yongsheng Liu
88
63
0
19 Dec 2018
An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents
F. Such
Vashisht Madhavan
Rosanne Liu
Rui Wang
Pablo Samuel Castro
...
Jiale Zhi
Ludwig Schubert
Marc G. Bellemare
Jeff Clune
Joel Lehman
OffRL
73
54
0
17 Dec 2018
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
76
280
0
14 Dec 2018
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
157
2,456
0
13 Dec 2018
Learning Montezuma's Revenge from a Single Demonstration
Tim Salimans
Richard J. Chen
121
139
0
08 Dec 2018
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Tianyi Chen
Kai Zhang
G. Giannakis
Tamer Basar
OffRL
100
41
0
07 Dec 2018
Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control
Zhuo Xu
Chen Tang
Masayoshi Tomizuka
OffRL
45
36
0
07 Dec 2018
Online Model Distillation for Efficient Video Inference
Ravi Teja Mullapudi
Steven Chen
Keyi Zhang
Deva Ramanan
Kayvon Fatahalian
VGen
98
115
0
06 Dec 2018
Quantifying Generalization in Reinforcement Learning
K. Cobbe
Oleg Klimov
Christopher Hesse
Taehoon Kim
John Schulman
OffRL
141
677
0
06 Dec 2018
Relative Entropy Regularized Policy Iteration
A. Abdolmaleki
Jost Tobias Springenberg
Jonas Degrave
Steven Bohez
Yuval Tassa
Dan Belov
N. Heess
Martin Riedmiller
68
72
0
05 Dec 2018
Adapting Auxiliary Losses Using Gradient Similarity
Yunshu Du
Wojciech M. Czarnecki
Siddhant M. Jayakumar
Mehrdad Farajtabar
Razvan Pascanu
Balaji Lakshminarayanan
115
158
0
05 Dec 2018
JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs
Eunji Jeong
Sungwoo Cho
Gyeong-In Yu
Joo Seong Jeong
Dongjin Shin
Byung-Gon Chun
49
25
0
04 Dec 2018
Mitigating Planner Overfitting in Model-Based Reinforcement Learning
Dilip Arumugam
David Abel
Kavosh Asadi
N. Gopalan
Christopher Grimm
Jun Ki Lee
Lucas Lehnert
Michael L. Littman
43
11
0
03 Dec 2018
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
97
59
0
03 Dec 2018
Modulated Policy Hierarchies
Alexander Pashevich
Danijar Hafner
James Davidson
Rahul Sukthankar
Cordelia Schmid
46
6
0
30 Nov 2018
Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play
R. Mahjourian
Navdeep Jaitly
N. Lazić
Sergey Levine
Risto Miikkulainen
73
16
0
30 Nov 2018
An Introduction to Deep Reinforcement Learning
Vincent François-Lavet
Peter Henderson
Riashat Islam
Marc G. Bellemare
Joelle Pineau
OffRL
AI4CE
170
1,277
0
30 Nov 2018
Exploring Restart Distributions
Arash Tavakoli
Vitaly Levdik
Riashat Islam
Christopher M. Smith
Petar Kormushev
OffRL
35
5
0
27 Nov 2018
Understanding the impact of entropy on policy optimization
Zafarali Ahmed
Nicolas Le Roux
Mohammad Norouzi
Dale Schuurmans
83
238
0
27 Nov 2018
Hardware Conditioned Policies for Multi-Robot Transfer Learning
Tao Chen
Adithyavairavan Murali
Abhinav Gupta
85
102
0
24 Nov 2018
Hierarchical visuomotor control of humanoids
J. Merel
Arun Ahuja
Vu Pham
S. Tunyasuvunakool
Siqi Liu
Dhruva Tirumala
N. Heess
Greg Wayne
115
97
0
23 Nov 2018
Model-Based Reinforcement Learning for Sepsis Treatment
Aniruddh Raghu
Matthieu Komorowski
Sumeetpal S. Singh
OffRL
LM&MA
54
54
0
23 Nov 2018
Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning
Sainbayar Sukhbaatar
Emily L. Denton
Arthur Szlam
Rob Fergus
SSL
87
43
0
22 Nov 2018