ResearchTrend.AI
arXiv: 1502.05477
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 3,098 papers shown
Title
Discrete Sequential Prediction of Continuous Actions for Deep RL
Luke Metz
Julian Ibarz
Navdeep Jaitly
James Davidson
BDL
OffRL
28
117
0
14 May 2017
A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems
J. F. Fisac
Anayo K. Akametalu
Melanie Zeilinger
Shahab Kaynama
J. Gillula
Claire Tomlin
20
491
0
03 May 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
Dipendra Kumar Misra
John Langford
Yoav Artzi
21
247
0
28 Apr 2017
Virtual to Real Reinforcement Learning for Autonomous Driving
Xinlei Pan
Yurong You
Ziyan Wang
Cewu Lu
OffRL
25
334
0
13 Apr 2017
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Baolin Peng
Xiujun Li
Lihong Li
Jianfeng Gao
Asli Celikyilmaz
Sungjin Lee
Kam-Fai Wong
BDL
38
190
0
10 Apr 2017
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
I. Popov
N. Heess
Timothy Lillicrap
Roland Hafner
Gabriel Barth-Maron
Matej Vecerík
Thomas Lampe
Yuval Tassa
Tom Erez
Martin Riedmiller
OffRL
31
263
0
10 Apr 2017
Stochastic Neural Networks for Hierarchical Reinforcement Learning
Carlos Florensa
Yan Duan
Pieter Abbeel
BDL
47
360
0
10 Apr 2017
Stein Variational Policy Gradient
Yang Liu
Prajit Ramachandran
Qiang Liu
Jian-wei Peng
19
138
0
07 Apr 2017
Learning Visual Servoing with Deep Features and Fitted Q-Iteration
Alex X. Lee
Sergey Levine
Pieter Abbeel
SSL
30
73
0
31 Mar 2017
Fast Optimization of Wildfire Suppression Policies with SMAC
Sean McGregor
Rachel Houtman
Claire A. Montgomery
Ronald A. Metoyer
Thomas G. Dietterich
24
2
0
28 Mar 2017
DART: Noise Injection for Robust Imitation Learning
Michael Laskey
Jonathan Lee
Roy Fox
Anca Dragan
Ken Goldberg
47
244
0
27 Mar 2017
Deep Deterministic Policy Gradient for Urban Traffic Light Control
Noe Casas
32
165
0
27 Mar 2017
InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations
Yunzhu Li
Jiaming Song
Stefano Ermon
19
44
0
26 Mar 2017
Failures of Gradient-Based Deep Learning
Shai Shalev-Shwartz
Ohad Shamir
Shaked Shammah
ODL
UQCV
34
198
0
23 Mar 2017
One-Shot Imitation Learning
Yan Duan
Marcin Andrychowicz
Bradly C. Stadie
Jonathan Ho
Jonas Schneider
Ilya Sutskever
Pieter Abbeel
Wojciech Zaremba
OffRL
23
682
0
21 Mar 2017
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
Joshua Tobin
Rachel Fong
Alex Ray
Jonas Schneider
Wojciech Zaremba
Pieter Abbeel
19
2,926
0
20 Mar 2017
Learning to Navigate Cloth using Haptics
Alexander Clegg
Wenhao Yu
Zackory M. Erickson
Jie Tan
Chenxi Liu
Greg Turk
21
23
0
20 Mar 2017
Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play
Sainbayar Sukhbaatar
Zeming Lin
Ilya Kostrikov
Gabriel Synnaeve
Arthur Szlam
Rob Fergus
SSL
28
331
0
15 Mar 2017
Sensor Fusion for Robot Control through Deep Reinforcement Learning
Steven Bohez
Tim Verbelen
E. D. Coninck
B. Vankeirsbilck
Pieter Simoens
Bart Dhoedt
SSL
10
29
0
13 Mar 2017
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Tim Salimans
Jonathan Ho
Xi Chen
Szymon Sidor
Ilya Sutskever
45
1,516
0
10 Mar 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
493
11,727
0
09 Mar 2017
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Yevgen Chebotar
Karol Hausman
Marvin Zhang
Gaurav Sukhatme
S. Schaal
Sergey Levine
34
159
0
08 Mar 2017
Learning a Unified Control Policy for Safe Falling
Visak C. V. Kumar
Sehoon Ha
Karen Liu
15
19
0
08 Mar 2017
Robust Adversarial Reinforcement Learning
Lerrel Pinto
James Davidson
Rahul Sukthankar
Abhinav Gupta
OOD
63
844
0
08 Mar 2017
Towards Generalization and Simplicity in Continuous Control
Aravind Rajeswaran
Kendall Lowrey
E. Todorov
Sham Kakade
OffRL
41
276
0
08 Mar 2017
Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning
Joshua Achiam
S. Shankar Sastry
37
235
0
06 Mar 2017
Third-Person Imitation Learning
Bradly C. Stadie
Pieter Abbeel
Ilya Sutskever
10
234
0
06 Mar 2017
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
Justin Fu
John D. Co-Reyes
Sergey Levine
OffRL
31
155
0
03 Mar 2017
FeUdal Networks for Hierarchical Reinforcement Learning
A. Vezhnevets
Simon Osindero
Tom Schaul
N. Heess
Max Jaderberg
David Silver
Koray Kavukcuoglu
FedML
24
898
0
03 Mar 2017
Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
Wen Sun
Arun Venkatraman
Geoffrey J. Gordon
Byron Boots
J. Andrew Bagnell
29
232
0
03 Mar 2017
Deep Predictive Policy Training using Reinforcement Learning
Ali Ghadirzadeh
A. Maki
Danica Kragic
Mårten Björkman
28
128
0
02 Mar 2017
Reinforcement Learning for Pivoting Task
Rika Antonova
S. Cruciani
Christian Smith
Danica Kragic
8
68
0
01 Mar 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
32
466
0
28 Feb 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
26
1,316
0
27 Feb 2017
Learning Control for Air Hockey Striking using Deep Reinforcement Learning
Ayal Taitler
N. Shimkin
17
10
0
26 Feb 2017
Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning
Vlad Firoiu
William F. Whitney
J. Tenenbaum
12
36
0
21 Feb 2017
Learning to Multi-Task by Active Sampling
Sahil Sharma
Ashutosh Jha
Parikshit Hegde
Balaraman Ravindran
21
21
0
20 Feb 2017
Cognitive Mapping and Planning for Visual Navigation
Saurabh Gupta
Varun Tolani
James Davidson
Sergey Levine
Rahul Sukthankar
Jitendra Malik
25
710
0
13 Feb 2017
Preparing for the Unknown: Learning a Universal Policy with Online System Identification
Wenhao Yu
Jie Tan
Chenxi Liu
Greg Turk
OffRL
23
306
0
08 Feb 2017
Adversarial Attacks on Neural Network Policies
Sandy Huang
Nicolas Papernot
Ian Goodfellow
Yan Duan
Pieter Abbeel
MLAU
AAML
8
830
0
08 Feb 2017
Uncertainty-Aware Reinforcement Learning for Collision Avoidance
G. Kahn
Adam R. Villaflor
Vitchyr H. Pong
Pieter Abbeel
Sergey Levine
35
312
0
03 Feb 2017
Deep Reinforcement Learning for Robotic Manipulation-The state of the art
S. Amarjyoti
23
65
0
31 Jan 2017
Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning
Francois Belletti
Daniel Haziza
G. Gomes
Alexandre M. Bayen
11
139
0
30 Jan 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,505
0
25 Jan 2017
Imitating Driver Behavior with Generative Adversarial Networks
Alex Kuefler
Jeremy Morton
T. Wheeler
Mykel Kochenderfer
GAN
35
405
0
24 Jan 2017
Scalable and Incremental Learning of Gaussian Mixture Models
R. Pinto
P. Engel
21
10
0
14 Jan 2017
A K-fold Method for Baseline Estimation in Policy Gradient Algorithms
N. Kota
Abhishek Mishra
Sunil Srinivasa
Xi Chen
Pieter Abbeel
OffRL
21
0
0
03 Jan 2017
Deep Reinforcement Learning with Successor Features for Navigation across Similar Environments
Jingwei Zhang
Jost Tobias Springenberg
Joschka Boedecker
Wolfram Burgard
22
294
0
16 Dec 2016
Reinforcement Learning With Temporal Logic Rewards
Xiao Li
C. Vasile
C. Belta
17
214
0
11 Dec 2016
Model-based Adversarial Imitation Learning
Nir Baram
Oron Anschel
Shie Mannor
GAN
22
42
0
07 Dec 2016