Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 2,024 papers shown
Title
Co-training for Policy Learning
Jialin Song
Ravi Lanka
Yisong Yue
M. Ono
OffRL
66
20
0
03 Jul 2019
Dynamics-Aware Unsupervised Discovery of Skills
Archit Sharma
S. Gu
Sergey Levine
Vikash Kumar
Karol Hausman
138
414
0
02 Jul 2019
Modified Actor-Critics
Erinc Merdivan
S. Hanke
Matthieu Geist
52
2
0
02 Jul 2019
FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control
Longxiang Shi
Shijian Li
LongBing Cao
Long Yang
Gang Zheng
Gang Pan
41
5
0
01 Jul 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
174
344
0
30 Jun 2019
Demonstration-Guided Deep Reinforcement Learning of Control Policies for Dexterous Human-Robot Interaction
Sammy Christen
Stefan Stevšić
Otmar Hilliges
71
24
0
27 Jun 2019
Deep Active Learning with Adaptive Acquisition
Manuel Haussmann
Fred Hamprecht
M. Kandemir
95
41
0
27 Jun 2019
From self-tuning regulators to reinforcement learning and back again
Nikolai Matni
Alexandre Proutiere
Anders Rantzer
Stephen Tu
127
88
0
27 Jun 2019
Compositional Transfer in Hierarchical Reinforcement Learning
Markus Wulfmeier
A. Abdolmaleki
Roland Hafner
Jost Tobias Springenberg
Michael Neunert
Tim Hertweck
Thomas Lampe
Noah Y. Siegel
N. Heess
Martin Riedmiller
119
27
0
26 Jun 2019
Optimistic Proximal Policy Optimization
Takahisa Imagawa
Takuya Hiraoka
Yoshimasa Tsuruoka
83
4
0
25 Jun 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
105
111
0
25 Jun 2019
Modern Deep Reinforcement Learning Algorithms
Sergey Ivanov
A. Dýakonov
OffRL
74
39
0
24 Jun 2019
Deep Conservative Policy Iteration
Nino Vieillard
Olivier Pietquin
Matthieu Geist
49
26
0
24 Jun 2019
Learning Belief Representations for Imitation Learning in POMDPs
Tanmay Gangwani
Joel Lehman
Qiang Liu
Jian Peng
59
37
0
22 Jun 2019
Reinforcement Learning with Convex Constraints
Sobhan Miryoosefi
Kianté Brantley
Hal Daumé
Miroslav Dudík
Robert Schapire
61
93
0
21 Jun 2019
Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction
Fengda Zhu
Xiaojun Chang
Runhao Zeng
Mingkui Tan
CLL
54
3
0
21 Jun 2019
Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks
Roberto Martín-Martín
Michelle A. Lee
Rachel Gardner
Silvio Savarese
Jeannette Bohg
Animesh Garg
95
198
0
20 Jun 2019
Exploring Model-based Planning with Policy Networks
Tingwu Wang
Jimmy Ba
128
150
0
20 Jun 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
133
965
0
19 Jun 2019
Reward Prediction Error as an Exploration Objective in Deep RL
Riley Simmons-Edler
Ben Eisner
Daniel Yang
Anthony Bisulco
E. Mitchell
Sebastian Seung
Daniel D. Lee
69
5
0
19 Jun 2019
Wasserstein Adversarial Imitation Learning
Huang Xiao
Michael Herman
Joerg Wagner
Sebastian Ziesche
Jalal Etesami
T. H. Linh
56
72
0
19 Jun 2019
Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning
Tadashi Kozuno
Dongqi Han
Kenji Doya
OffRL
53
2
0
18 Jun 2019
RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration
Brahma S. Pavse
F. Torabi
Josiah P. Hanna
Garrett A. Warnell
Peter Stone
99
33
0
18 Jun 2019
NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations
Qiaoyun Wu
Tianyi Zhou
Jun Wang
Kai Xu
170
15
0
17 Jun 2019
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
101
58
0
17 Jun 2019
Learning-Driven Exploration for Reinforcement Learning
Muhammad Usama
D. Chang
67
11
0
17 Jun 2019
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Guy Lorberbom
Chris J. Maddison
N. Heess
Tamir Hazan
Daniel Tarlow
92
8
0
14 Jun 2019
Goal-conditioned Imitation Learning
Yiming Ding
Carlos Florensa
Mariano Phielipp
Pieter Abbeel
101
228
0
13 Jun 2019
Sub-policy Adaptation for Hierarchical Reinforcement Learning
Alexander C. Li
Carlos Florensa
I. Clavera
Pieter Abbeel
102
74
0
13 Jun 2019
Deep Reinforcement Learning for Cyber Security
Thanh Thi Nguyen
Vijay Janapa Reddi
OffRL, AI4CE
117
335
0
13 Jun 2019
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application
Arip Asadulaev
Igor Kuznetsov
Gideon Stein
Andrey Filchenkov
18
0
0
13 Jun 2019
Efficient Exploration via State Marginal Matching
Lisa Lee
Benjamin Eysenbach
Emilio Parisotto
Eric Xing
Sergey Levine
Ruslan Salakhutdinov
147
248
0
12 Jun 2019
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
OffRL
105
293
0
12 Jun 2019
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Edward Choi
Zhen Xu
Yujia Li
Michael W. Dusenberry
Gerardo Flores
Yuan Xue
Andrew M. Dai
MedIm
88
246
0
11 Jun 2019
Learning Powerful Policies by Using Consistent Dynamics Model
Shagun Sodhani
Anirudh Goyal
T. Deleu
Yoshua Bengio
Sergey Levine
Jian Tang
OffRL
47
5
0
11 Jun 2019
Learning to Score Behaviors for Guided Policy Optimization
Aldo Pacchiano
Jack Parker-Holder
Yunhao Tang
A. Choromańska
K. Choromanski
Michael I. Jordan
101
39
0
11 Jun 2019
Exploration via Hindsight Goal Generation
Zhizhou Ren
Kefan Dong
Yuanshuo Zhou
Qiang Liu
Jian-wei Peng
91
90
0
10 Jun 2019
Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains
Matthieu Zimmer
Paul Weng
35
7
0
10 Jun 2019
Reducing the variance in online optimization by transporting past gradients
Sébastien M. R. Arnold
Pierre-Antoine Manzagol
Reza Babanezhad
Ioannis Mitliagkas
Nicolas Le Roux
93
28
0
08 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
147
9
0
07 Jun 2019
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets
Devansh Arpit
Victor Campos
Yoshua Bengio
83
59
0
05 Jun 2019
Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm
Pin Wang
Hanhan Li
Ching-yao Chan
67
52
0
05 Jun 2019
Machine Learning and System Identification for Estimation in Physical Systems
Fredrik Bagge Carlson
OOD
56
5
0
05 Jun 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
132
193
0
05 Jun 2019
BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators
F. Ramos
Rafael Possas
Dieter Fox
62
158
0
04 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL, OnRL
156
1,070
0
03 Jun 2019
Deep Reinforcement Learning Architecture for Continuous Power Allocation in High Throughput Satellites
J. Luis
Markus Guerster
Iñigo Del Portillo
E. Crawley
B. Cameron
19
18
0
03 Jun 2019
Harnessing Reinforcement Learning for Neural Motion Planning
Tom Jurgenson
Aviv Tamar
OOD
109
65
0
01 Jun 2019
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
119
128
0
31 May 2019
Reinforcement Learning Experience Reuse with Policy Residual Representation
Wen-Ji Zhou
Yang Yu
Yingfeng Chen
Kai Guan
Tangjie Lv
Changjie Fan
Zhi Zhou
OffRL
19
2
0
31 May 2019