ResearchTrend.AI

Trust Region Policy Optimization (arXiv:1502.05477)

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 3,098 papers shown
Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning
Kyungjae Lee
Sungyub Kim
Sungbin Lim
Sungjoon Choi
Songhwai Oh
27
28
0
31 Jan 2019
Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective
Anirudh Vemula
Wen Sun
J. Andrew Bagnell
21
40
0
31 Jan 2019
A Theory of Regularized Markov Decision Processes
M. Geist
B. Scherrer
Olivier Pietquin
23
313
0
31 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
26
118
0
29 Jan 2019
Trust Region-Guided Proximal Policy Optimization
Yuhui Wang
Hao He
Xiaoyang Tan
Yaozhong Gan
OffRL
26
55
0
29 Jan 2019
Lyapunov-based Safe Policy Optimization for Continuous Control
Yinlam Chow
Ofir Nachum
Aleksandra Faust
Edgar A. Duénez-Guzmán
Mohammad Ghavamzadeh
33
244
0
28 Jan 2019
Making Deep Q-learning methods robust to time discretization
Corentin Tallec
Léonard Blier
Yann Ollivier
OOD
OffRL
6
89
0
28 Jan 2019
Imitation Learning from Imperfect Demonstration
Yueh-hua Wu
Nontawat Charoenphakdee
Han Bao
Voot Tangkaratt
Masashi Sugiyama
13
157
0
27 Jan 2019
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
Kefan Dong
Yuanhao Wang
Xiaoyu Chen
Liwei Wang
OffRL
19
95
0
27 Jan 2019
Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization
Pengqian Yu
J. Lee
Ilya Kulyatin
Zekun Shi
Sakyasingha Dasgupta
11
64
0
25 Jan 2019
Learning agile and dynamic motor skills for legged robots
Jemin Hwangbo
Joonho Lee
Alexey Dosovitskiy
Dario Bellicoso
Vassilios Tsounis
V. Koltun
Marco Hutter
19
1,274
0
24 Jan 2019
Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems
Osbert Bastani
22
4
0
24 Jan 2019
Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning
S. Ahilan
Peter Dayan
19
75
0
24 Jan 2019
Trust Region Value Optimization using Kalman Filtering
Shirli Di-Castro Shashua
Shie Mannor
19
7
0
23 Jan 2019
Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning
Joonho Lee
Jemin Hwangbo
Marco Hutter
17
86
0
22 Jan 2019
Towards Learning to Imitate from a Single Video Demonstration
Glen Berseth
Florian Golemo
C. Pal
29
6
0
22 Jan 2019
Neuroflight: Next Generation Flight Control Firmware
W. Koch
R. Mancuso
Azer Bestavros
41
29
0
19 Jan 2019
On-Policy Trust Region Policy Optimisation with Replay Buffers
D. Kangin
N. Pugeault
OffRL
19
3
0
18 Jan 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
16
22
0
15 Jan 2019
Adaptive Guidance with Reinforcement Meta-Learning
B. Gaudet
R. Linares
6
15
0
12 Jan 2019
Learning Accurate Extended-Horizon Predictions of High Dimensional Trajectories
B. Gaudet
R. Linares
R. Furfaro
11
1
0
12 Jan 2019
Improving Coordination in Small-Scale Multi-Agent Deep Reinforcement Learning through Memory-driven Communication
E. Pesce
Giovanni Montana
17
72
0
12 Jan 2019
A New Tensioning Method using Deep Reinforcement Learning for Surgical Pattern Cutting
Thanh Thi Nguyen
Ngoc Duy Nguyen
Fernando Bello
S. Nahavandi
25
38
0
10 Jan 2019
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions
Rui Wang
Joel Lehman
Jeff Clune
Kenneth O. Stanley
47
241
0
07 Jan 2019
Recurrent Control Nets for Deep Reinforcement Learning
Vincent Liu
Ademi Adeniji
Nathaniel Lee
Jason Zhao
Mario Srouji
14
3
0
06 Jan 2019
Exploring applications of deep reinforcement learning for real-world autonomous driving systems
V. Talpaert
Ibrahim Sobh
Ravi Kiran
Patrick Mannion
S. Yogamani
Ahmad El-Sallab
P. Pérez
26
74
0
06 Jan 2019
On the Utility of Model Learning in HRI
Gokul Swamy
Jens Schulz
Rohan Choudhury
Dylan Hadfield-Menell
Anca Dragan
6
52
0
04 Jan 2019
Accelerating Goal-Directed Reinforcement Learning by Model Characterization
Shoubhik Debnath
Gaurav Sukhatme
Lantao Liu
21
3
0
04 Jan 2019
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang
30
596
0
01 Jan 2019
An Active Learning Framework for Efficient Robust Policy Search
Sai Kiran Narayanaswami
N. Sudarsanam
Balaraman Ravindran
11
0
0
01 Jan 2019
Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications
Thanh Thi Nguyen
Ngoc Duy Nguyen
S. Nahavandi
27
775
0
31 Dec 2018
Dynamic Planning Networks
Norman L. Tasfi
Miriam A. M. Capretz
19
5
0
28 Dec 2018
Deconfounding Reinforcement Learning in Observational Settings
Chaochao Lu
Bernhard Schölkopf
José Miguel Hernández-Lobato
CML
OOD
39
73
0
26 Dec 2018
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
54
433
0
26 Dec 2018
VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control
Xingxing Liang
Qi Wang
Yanghe Feng
Zhong Liu
Jincai Huang
29
5
0
24 Dec 2018
Introducing Neuromodulation in Deep Neural Networks to Learn Adaptive Behaviours
Nicolas Vecoven
D. Ernst
Antoine Wehenkel
G. Drion
AI4CE
8
42
0
21 Dec 2018
NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning
Sirui Xie
Junning Huang
Lanxin Lei
Chunxiao Liu
Zheng Ma
Wayne Zhang
Liang Lin
33
8
0
21 Dec 2018
Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems
Dhruv Malik
A. Pananjady
Kush S. Bhatia
K. Khamaru
Peter L. Bartlett
Martin J. Wainwright
25
198
0
20 Dec 2018
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
30
32
0
19 Dec 2018
Toward Multimodal Model-Agnostic Meta-Learning
Risto Vuorio
Shao-Hua Sun
Hexiang Hu
Joseph J. Lim
60
31
0
18 Dec 2018
An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents
F. Such
Vashisht Madhavan
Rosanne Liu
Rui Wang
Pablo Samuel Castro
...
Jiale Zhi
Ludwig Schubert
Marc G. Bellemare
Jeff Clune
Joel Lehman
OffRL
27
54
0
17 Dec 2018
A Logarithmic Barrier Method For Proximal Policy Optimization
Cheng Zeng
Hongming Zhang
14
2
0
16 Dec 2018
Gold Seeker: Information Gain from Policy Distributions for Goal-oriented Vision-and-Langauge Reasoning
Ehsan Abbasnejad
Iman Abbasnejad
Qi Wu
Javen Qinfeng Shi
Anton Van Den Hengel
OffRL
33
5
0
16 Dec 2018
Simulation to Scaled City: Zero-Shot Policy Transfer for Traffic Control via Autonomous Vehicles
Kathy Jang
Eugene Vinitsky
Behdad Chalaki
Ben Remer
Logan E. Beaver
Andreas A. Malikopoulos
Alexandre M. Bayen
23
89
0
14 Dec 2018
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
83
2,371
0
13 Dec 2018
KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning
Mohammad Firouzi
19
0
0
11 Dec 2018
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Tianyi Chen
Kai Zhang
G. Giannakis
Tamer Basar
OffRL
29
41
0
07 Dec 2018
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
43
1,582
0
07 Dec 2018
Top-K Off-Policy Correction for a REINFORCE Recommender System
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
CML
OffRL
33
474
0
06 Dec 2018
Relative Entropy Regularized Policy Iteration
A. Abdolmaleki
Jost Tobias Springenberg
Jonas Degrave
Steven Bohez
Yuval Tassa
Dan Belov
N. Heess
Martin Riedmiller
27
72
0
05 Dec 2018