ResearchTrend.AI
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 1,216 papers shown
Deep Active Learning with Adaptive Acquisition
Manuel Haussmann
Fred Hamprecht
M. Kandemir
22
41
0
27 Jun 2019
From self-tuning regulators to reinforcement learning and back again
Nikolai Matni
Alexandre Proutière
Anders Rantzer
Stephen Tu
27
88
0
27 Jun 2019
Compositional Transfer in Hierarchical Reinforcement Learning
Markus Wulfmeier
A. Abdolmaleki
Roland Hafner
Jost Tobias Springenberg
Michael Neunert
Tim Hertweck
Thomas Lampe
Noah Y. Siegel
N. Heess
Martin Riedmiller
30
27
0
26 Jun 2019
Optimistic Proximal Policy Optimization
Takahisa Imagawa
Takuya Hiraoka
Yoshimasa Tsuruoka
15
4
0
25 Jun 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
30
108
0
25 Jun 2019
Modern Deep Reinforcement Learning Algorithms
Sergey Ivanov
A. Dýakonov
OffRL
26
38
0
24 Jun 2019
Learning Belief Representations for Imitation Learning in POMDPs
Tanmay Gangwani
Joel Lehman
Qiang Liu
Jian Peng
24
36
0
22 Jun 2019
Reinforcement Learning with Convex Constraints
Sobhan Miryoosefi
Kianté Brantley
Hal Daumé
Miroslav Dudík
Robert Schapire
17
90
0
21 Jun 2019
Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks
Roberto Martín-Martín
Michelle A. Lee
Rachel Gardner
Silvio Savarese
Jeannette Bohg
Animesh Garg
25
194
0
20 Jun 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Kaiqing Zhang
Alec Koppel
Hao Zhu
Tamer Basar
44
186
0
19 Jun 2019
RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration
Brahma S. Pavse
F. Torabi
Josiah P. Hanna
Garrett A. Warnell
Peter Stone
27
33
0
18 Jun 2019
NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations
Qiaoyun Wu
Tianyi Zhou
Jun Wang
Kai Xu
16
15
0
17 Jun 2019
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
8
57
0
17 Jun 2019
Learning-Driven Exploration for Reinforcement Learning
Muhammad Usama
D. Chang
26
10
0
17 Jun 2019
Goal-conditioned Imitation Learning
Yiming Ding
Carlos Florensa
Mariano Phielipp
Pieter Abbeel
34
219
0
13 Jun 2019
Sub-policy Adaptation for Hierarchical Reinforcement Learning
Alexander C. Li
Carlos Florensa
I. Clavera
Pieter Abbeel
29
71
0
13 Jun 2019
Deep Reinforcement Learning for Cyber Security
Thanh Thi Nguyen
Vijay Janapa Reddi
OffRL
AI4CE
10
313
0
13 Jun 2019
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
OffRL
32
286
0
12 Jun 2019
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Edward Choi
Zhen Xu
Yujia Li
Michael W. Dusenberry
Gerardo Flores
Yuan Xue
Andrew M. Dai
MedIm
24
238
0
11 Jun 2019
Learning Powerful Policies by Using Consistent Dynamics Model
Shagun Sodhani
Anirudh Goyal
T. Deleu
Yoshua Bengio
Sergey Levine
Jian Tang
OffRL
19
5
0
11 Jun 2019
Learning to Score Behaviors for Guided Policy Optimization
Aldo Pacchiano
Jack Parker-Holder
Yunhao Tang
A. Choromańska
K. Choromanski
Michael I. Jordan
19
38
0
11 Jun 2019
Exploration via Hindsight Goal Generation
Zhizhou Ren
Kefan Dong
Yuanshuo Zhou
Qiang Liu
Jian-wei Peng
35
85
0
10 Jun 2019
Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains
Matthieu Zimmer
Paul Weng
24
7
0
10 Jun 2019
Reducing the variance in online optimization by transporting past gradients
Sébastien M. R. Arnold
Pierre-Antoine Manzagol
Reza Babanezhad
Ioannis Mitliagkas
Nicolas Le Roux
26
28
0
08 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
23
9
0
07 Jun 2019
Machine Learning and System Identification for Estimation in Physical Systems
Fredrik Bagge Carlson
OOD
16
5
0
05 Jun 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
37
186
0
05 Jun 2019
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
32
125
0
31 May 2019
Advantage Amplification in Slowly Evolving Latent-State Environments
Martin Mladenov
Ofer Meshi
Jayden Ooi
Dale Schuurmans
Craig Boutilier
OffRL
18
9
0
29 May 2019
An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient
Pan Xu
F. Gao
Quanquan Gu
10
93
0
29 May 2019
Adversarial Imitation Learning from Incomplete Demonstrations
Mingfei Sun
Xiaojuan Ma
16
28
0
29 May 2019
Snooping Attacks on Deep Reinforcement Learning
Matthew J. Inkawhich
Yiran Chen
Hai Helen Li
AAML
22
25
0
28 May 2019
Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning
Yufei Wang
Ziju Shen
Zichao Long
Bin Dong
AI4CE
PINN
13
40
0
27 May 2019
Policy Search by Target Distribution Learning for Continuous Control
Chuheng Zhang
Yuanqi Li
Jian Li
23
6
0
27 May 2019
Composing Task-Agnostic Policies with Deep Reinforcement Learning
A. H. Qureshi
Jacob J. Johnson
Yuzhe Qin
Taylor Henderson
Byron Boots
Michael C. Yip
OffRL
22
30
0
25 May 2019
Adaptive Symmetric Reward Noising for Reinforcement Learning
R. Vivanti
Talya D. Sohlberg-Baris
Shlomo Cohen
Orna Cohen
AAML
21
1
0
24 May 2019
Trajectory-Based Off-Policy Deep Reinforcement Learning
Andreas Doerr
Michael Volpp
Marc Toussaint
Sebastian Trimpe
Christian Daniel
OffRL
29
2
0
14 May 2019
Lessons from Contextual Bandit Learning in a Customer Support Bot
Nikos Karampatziakis
Sebastian Kochman
Jade Huang
Paul Mineiro
Kathy Osborne
Weizhu Chen
12
6
0
06 May 2019
P3O: Policy-on Policy-off Policy Optimization
Rasool Fakoor
Pratik Chaudhari
Alex Smola
OffRL
29
51
0
05 May 2019
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables
Mingzhang Yin
Yuguang Yue
Mingyuan Zhou
19
23
0
04 May 2019
DAC: The Double Actor-Critic Architecture for Learning Options
Shangtong Zhang
Shimon Whiteson
30
72
0
29 Apr 2019
Deep Neuroevolution of Recurrent and Discrete World Models
S. Risi
Kenneth O. Stanley
OCL
19
53
0
28 Apr 2019
Neural Logic Reinforcement Learning
Zhengyao Jiang
Shan Luo
NAI
27
71
0
24 Apr 2019
Model-free Deep Reinforcement Learning for Urban Autonomous Driving
Jianyu Chen
Bodi Yuan
Masayoshi Tomizuka
27
262
0
20 Apr 2019
Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems
Ran A. Wang
Karthikeya S. Parunandi
Dan Yu
D. Kalathil
S. Chakravorty
23
11
0
17 Apr 2019
End-to-End Robotic Reinforcement Learning without Reward Engineering
Avi Singh
Larry Yang
Kristian Hartikainen
Chelsea Finn
Sergey Levine
SSL
OffRL
46
266
0
16 Apr 2019
Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey
Yoshiharu Sato
OffRL
24
32
0
10 Apr 2019
Multi-Preference Actor Critic
Ishan Durugkar
Matthew J. Hausknecht
Adith Swaminathan
Patrick MacAlpine
19
1
0
05 Apr 2019
Risk Averse Robust Adversarial Reinforcement Learning
Xinlei Pan
Daniel Seita
Yang Gao
John F. Canny
AAML
16
96
0
31 Mar 2019
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
D. Korenkevych
A. R. Mahmood
Gautham Vasan
James Bergstra
24
28
0
27 Mar 2019