ResearchTrend.AI
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
arXiv:1502.05477

Papers citing "Trust Region Policy Optimization"

50 / 2,012 papers shown
Policy Smoothing for Provably Robust Reinforcement Learning
Aounon Kumar
Alexander Levine
Soheil Feizi
AAML
131
59
0
21 Jun 2021
A Max-Min Entropy Framework for Reinforcement Learning
Seungyul Han
Y. Sung
98
23
0
19 Jun 2021
Learning from Demonstration without Demonstrations
Tom Blau
Gilad Francis
Philippe Morere
OffRL
54
1
0
17 Jun 2021
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL
Catherine Cang
Aravind Rajeswaran
Pieter Abbeel
Michael Laskin
OffRL
70
30
0
16 Jun 2021
Offline RL Without Off-Policy Evaluation
David Brandfonbrener
William F. Whitney
Rajesh Ranganath
Joan Bruna
OffRL
128
171
0
16 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
90
15
0
15 Jun 2021
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
Dhruv Malik
Aldo Pacchiano
Vishwak Srinivasan
Yuanzhi Li
57
6
0
15 Jun 2021
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang
George Andriopoulos
OffRL
93
43
0
14 Jun 2021
Characterizing the Gap Between Actor-Critic and Policy Gradient
Junfeng Wen
Saurabh Kumar
Ramki Gummadi
Dale Schuurmans
92
15
0
13 Jun 2021
Recomposing the Reinforcement Learning Building Blocks with Hypernetworks
Shai Keynan
Elad Sarafian
Sarit Kraus
OffRL
97
30
0
12 Jun 2021
Policy Gradient Bayesian Robust Optimization for Imitation Learning
Zaynah Javed
Daniel S. Brown
Satvik Sharma
Jerry Zhu
Ashwin Balakrishna
Marek Petrik
Anca Dragan
Ken Goldberg
128
17
0
11 Jun 2021
Keyframe-Focused Visual Imitation Learning
Chuan Wen
Jierui Lin
Jianing Qian
Yang Gao
Dinesh Jayaraman
VGen
81
19
0
11 Jun 2021
GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning
Jiajun Fan
Changnan Xiao
Yue Huang
OffRL
93
10
0
11 Jun 2021
Taylor Expansion of Discount Factors
Yunhao Tang
Mark Rowland
Rémi Munos
Michal Valko
OffRL
72
5
0
11 Jun 2021
Differentiable Robust LQR Layers
Ngo Anh Vien
Gerhard Neumann
54
4
0
10 Jun 2021
Simplifying Deep Reinforcement Learning via Self-Supervision
Daochen Zha
Kwei-Herng Lai
Kaixiong Zhou
Helen Zhou
SSL
94
15
0
10 Jun 2021
Artificial Intelligence in Drug Discovery: Applications and Techniques
Jianyuan Deng
Zhibo Yang
Iwao Ojima
Dimitris Samaras
Fusheng Wang
AI4TS
152
113
0
09 Jun 2021
Self-Paced Context Evaluation for Contextual Reinforcement Learning
Theresa Eimer
André Biedenkapp
Frank Hutter
Marius Lindauer
OffRL
LRM
99
25
0
09 Jun 2021
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Kimin Lee
Laura M. Smith
Pieter Abbeel
OffRL
92
289
0
09 Jun 2021
RLupus: Cooperation through emergent communication in The Werewolf social deduction game
Nicolò Brandizzi
D. Grossi
Luca Iocchi
71
9
0
09 Jun 2021
A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs
Runzhong Wang
Zhigang Hua
Gan Liu
Jiayi Zhang
Junchi Yan
Feng Qi
Shuang Yang
Jun Zhou
Xiaokang Yang
60
44
0
09 Jun 2021
Dynamic Sparse Training for Deep Reinforcement Learning
Ghada Sokar
Elena Mocanu
Decebal Constantin Mocanu
Mykola Pechenizkiy
Peter Stone
111
60
0
08 Jun 2021
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Semih Cayci
Niao He
R. Srikant
113
36
0
08 Jun 2021
Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint
Matthieu Geist
Julien Pérolat
Mathieu Laurière
Romuald Elie
Sarah Perrin
Olivier Bachem
Rémi Munos
Olivier Pietquin
108
65
0
07 Jun 2021
Average-Reward Reinforcement Learning with Trust Region Methods
Xiaoteng Ma
Xiao-Jing Tang
Li Xia
Jun Yang
Qianchuan Zhao
61
18
0
07 Jun 2021
Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs
Tao-Wen Liu
Ruida Zhou
D. Kalathil
P. R. Kumar
Chao Tian
114
84
0
04 Jun 2021
Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery
Yulun Wu
Mikaela Cashman
Nicholas Choma
E. Prates
V. G. M. Vergara
...
M. Head
Rick L. Stevens
Peter Nugent
Daniel A. Jacobson
James B. Brown
GNN
91
10
0
04 Jun 2021
Robot in a China Shop: Using Reinforcement Learning for Location-Specific Navigation Behaviour
Xihan Bian
Oscar Alejandro Mendez Maldonado
Simon Hadfield
42
3
0
02 Jun 2021
Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning
Jongwook Choi
Archit Sharma
Honglak Lee
Sergey Levine
S. Gu
DRL
67
21
0
02 Jun 2021
Learning to schedule job-shop problems: Representation and policy learning using graph neural network and reinforcement learning
Junyoung Park
J. Chun
Sang Hun Kim
Youngkook Kim
Jinkyoo Park
GNN
60
219
0
02 Jun 2021
What Matters for Adversarial Imitation Learning?
Manu Orsini
Anton Raichuk
Léonard Hussenot
Damien Vincent
Robert Dadashi
Sertan Girgin
Matthieu Geist
Olivier Bachem
Olivier Pietquin
Marcin Andrychowicz
126
78
0
01 Jun 2021
Reward is enough for convex MDPs
Tom Zahavy
Brendan O'Donoghue
Guillaume Desjardins
Satinder Singh
137
76
0
01 Jun 2021
Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning
Bogdan Mazoure
Paul Mineiro
Pavithra Srinath
R. S. Sedeh
Doina Precup
Adith Swaminathan
OffRL
70
4
0
01 Jun 2021
AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning
Maayan Shvo
Zhiming Hu
Rodrigo Toro Icarte
Iqbal Mohomed
A. Jepson
Sheila A. McIlraith
99
14
0
31 May 2021
Safe Pontryagin Differentiable Programming
Wanxin Jin
Shaoshuai Mou
George J. Pappas
99
41
0
31 May 2021
Generative Adversarial Imitation Learning for Empathy-based AI
Pratyush Muthukumar
Karishma Muthukumar
Deepan Muthirayan
Pramod P. Khargonekar
AI4MH
40
2
0
27 May 2021
Robust Value Iteration for Continuous Control Tasks
M. Lutter
Shie Mannor
Jan Peters
Dieter Fox
Animesh Garg
64
19
0
25 May 2021
A Generalised Inverse Reinforcement Learning Framework
Firas Jarboui
Vianney Perchet
38
4
0
25 May 2021
GMAC: A Distributional Perspective on Actor-Critic Framework
D. W. Nam
Younghoon Kim
Chan Y. Park
79
17
0
24 May 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
98
78
0
24 May 2021
Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety
Haitong Ma
Yang Guan
Shengbo Eben Li
Xiangteng Zhang
Sifa Zheng
Jianyu Chen
95
37
0
22 May 2021
Objective-aware Traffic Simulation via Inverse Reinforcement Learning
Guanjie Zheng
Hanyang Liu
Kai Xu
Z. Li
79
11
0
20 May 2021
Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial
Swagat Kumar
25
2
0
17 May 2021
Mean Field Games Flock! The Reinforcement Learning Way
Sarah Perrin
Mathieu Laurière
Julien Pérolat
Matthieu Geist
Romuald Élie
Olivier Pietquin
AI4CE
79
47
0
17 May 2021
DRAS-CQSim: A Reinforcement Learning based Framework for HPC Cluster Scheduling
Yuping Fan
Z. Lan
16
14
0
16 May 2021
Learning Control Policies for Imitating Human Gaits
Utkarsh Aashu Mishra
51
0
0
15 May 2021
Identity Concealment Games: How I Learned to Stop Revealing and Love the Coincidences
Mustafa O. Karabag
Melkior Ornik
Ufuk Topcu
82
3
0
12 May 2021
Hierarchical RNNs-Based Transformers MADDPG for Mixed Cooperative-Competitive Environments
Xiaolong Wei
Lifang Yang
Xianglin Huang
Gang Cao
Zhulin Tao
Zhengyang Du
Jing An
53
6
0
11 May 2021
Adaptive Policy Transfer in Reinforcement Learning
Girish Joshi
Girish Chowdhary
23
3
0
10 May 2021
Scalable, Decentralized Multi-Agent Reinforcement Learning Methods Inspired by Stigmergy and Ant Colonies
Austin Nguyen
42
1
0
08 May 2021