ResearchTrend.AI
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
arXiv:1502.05477 (v5, latest)

Papers citing "Trust Region Policy Optimization"

50 / 2,009 papers shown
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
73
9
0
23 Feb 2022
Reward-Free Policy Space Compression for Reinforcement Learning
Mirco Mutti
Stefano Del Col
Marcello Restelli
41
3
0
22 Feb 2022
Policy Learning and Evaluation with Randomized Quasi-Monte Carlo
Sébastien M. R. Arnold
P. L'Ecuyer
Liyu Chen
Yi-fan Chen
Fei Sha
OffRL
84
4
0
16 Feb 2022
Disentangling Successor Features for Coordination in Multi-agent Reinforcement Learning
Seungchan Kim
Neale Van Stralen
Girish Chowdhary
Huy T. Tran
41
0
0
15 Feb 2022
CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning
Long Yang
Jiaming Ji
Juntao Dai
Yu Zhang
Pengfei Li
Gang Pan
71
17
0
15 Feb 2022
Saute RL: Almost Surely Safe Reinforcement Learning Using State Augmentation
Aivar Sootla
Alexander I. Cowen-Rivers
Taher Jafferjee
Ziyan Wang
D. Mguni
Jun Wang
Haitham Bou-Ammar
132
54
0
14 Feb 2022
Sequential Bayesian experimental designs via reinforcement learning
Hikaru Asano
OffRL
82
0
0
14 Feb 2022
Individual-Level Inverse Reinforcement Learning for Mean Field Games
Yang Chen
Libo Zhang
Jiamou Liu
Shuyue Hu
AI4CE
82
9
0
13 Feb 2022
Autonomous Drone Swarm Navigation and Multi-target Tracking in 3D Environments with Dynamic Obstacles
Suleman Qamar
Dr. Saddam Hussain Khan
Muhammad Arif Arshad
Maryam Qamar
Asifullah Khan
66
16
0
13 Feb 2022
Supported Policy Optimization for Offline Reinforcement Learning
Jialong Wu
Haixu Wu
Zihan Qiu
Jianmin Wang
Mingsheng Long
OffRL
104
70
0
13 Feb 2022
A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search
Tuan Dam
Carlo D'Eramo
Jan Peters
Joni Pajarinen
64
1
0
11 Feb 2022
Online Decision Transformer
Qinqing Zheng
Amy Zhang
Aditya Grover
OffRL
95
209
0
11 Feb 2022
Reinforcement Learning with Sparse Rewards using Guidance from Offline Demonstration
Desik Rengarajan
G. Vaidya
Akshay Sarvesh
D. Kalathil
S. Shakkottai
OffRL
64
59
0
09 Feb 2022
Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning
Bryon Tjanaka
Matthew C. Fontaine
Julian Togelius
Stefanos Nikolaidis
89
54
0
08 Feb 2022
Policy Optimization for Stochastic Shortest Path
Liyu Chen
Haipeng Luo
Aviv A. Rosenberg
81
12
0
07 Feb 2022
TRGP: Trust Region Gradient Projection for Continual Learning
Sen Lin
Li Yang
Deliang Fan
Junshan Zhang
CLL
140
81
0
07 Feb 2022
Soft Actor-Critic with Inhibitory Networks for Faster Retraining
J. Ide
Daria Mićović
Michael J. Guarino
K. Alcedo
D. Rosenbluth
Adrian P. Pope
58
3
0
07 Feb 2022
ExPoSe: Combining State-Based Exploration with Gradient-Based Online Search
Dixant Mittal
Siddharth Aravindan
W. Lee
OnRL
50
3
0
03 Feb 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Matilde Gargiani
Andrea Zanelli
Andrea Martinelli
Tyler H. Summers
John Lygeros
78
14
0
01 Feb 2022
You May Not Need Ratio Clipping in PPO
Mingfei Sun
Vitaly Kurin
Guoqing Liu
Sam Devlin
Tao Qin
Katja Hofmann
Shimon Whiteson
62
16
0
31 Jan 2022
Communication-Efficient Consensus Mechanism for Federated Reinforcement Learning
Xing Xu
Rongpeng Li
Zhifeng Zhao
Honggang Zhang
FedML
73
6
0
30 Jan 2022
Towards Safe Reinforcement Learning with a Safety Editor Policy
Haonan Yu
Wei Xu
Haichao Zhang
OffRL
149
31
0
28 Jan 2022
Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods
Niklas Höpner
Ilaria Tiddi
H. V. Hoof
68
3
0
28 Jan 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Liang Xu
Daoming Lyu
Yangchen Pan
Aiwen Jiang
Bo Liu
95
0
0
24 Jan 2022
GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems
Bhavya Sukhija
M. Turchetta
David Lindner
Andreas Krause
Sebastian Trimpe
Dominik Baumann
133
19
0
24 Jan 2022
Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization
Can Wang
Sheng Jin
Yingda Guan
Wentao Liu
Chao Qian
Ping Luo
Wanli Ouyang
82
14
0
21 Jan 2022
A Prescriptive Dirichlet Power Allocation Policy with Deep Reinforcement Learning
Yuan Tian
Minghao Han
Chetan S. Kulkarni
Olga Fink
71
13
0
20 Jan 2022
Differentially Private Reinforcement Learning with Linear Function Approximation
Xingyu Zhou
97
26
0
18 Jan 2022
Profitable Strategy Design by Using Deep Reinforcement Learning for Trades on Cryptocurrency Markets
Mohsen Asgari
S. H. Khasteh
71
5
0
15 Jan 2022
Cooperative Multi-Agent Deep Reinforcement Learning for Reliable Surveillance via Autonomous Multi-UAV Control
Won Joon Yun
Soohyun Park
Joongheon Kim
Myungjae Shin
Soyi Jung
David A. Mohaisen
Jae-Hyun Kim
54
136
0
15 Jan 2022
Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning
Phillip Swazinna
Steffen Udluft
D. Hein
Thomas Runkler
OffRL
67
26
0
14 Jan 2022
Benchmarking Deep Reinforcement Learning Algorithms for Vision-based Robotics
Swagat Kumar
Hayden Sampson
Ardhendu Behera
40
0
0
11 Jan 2022
Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
Jack Parker-Holder
Raghunandan Rajan
Xingyou Song
André Biedenkapp
Yingjie Miao
...
Vu-Linh Nguyen
Roberto Calandra
Aleksandra Faust
Frank Hutter
Marius Lindauer
AI4CE
116
107
0
11 Jan 2022
Admissible Policy Teaching through Reward Design
Kiarash Banihashem
Adish Singla
Jiarui Gan
Goran Radanović
81
15
0
06 Jan 2022
SABLAS: Learning Safe Control for Black-box Dynamical Systems
Zengyi Qin
Dawei Sun
Chuchu Fan
89
43
0
06 Jan 2022
Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors
Sven Gronauer
Matthias Kissel
L. Sacchetto
Mathias Korte
Klaus Diepold
61
6
0
04 Jan 2022
Stochastic convex optimization for provably efficient apprenticeship learning
Angeliki Kamoutsi
G. Banjac
John Lygeros
46
1
0
31 Dec 2021
Adaptive Gaussian Process based Stochastic Trajectory Optimization for Motion Planning
Yichang Feng
Haiyun Zhang
Jin Wang
Guodong Lu
74
30
0
30 Dec 2021
Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations
Angeliki Kamoutsi
G. Banjac
John Lygeros
OffRL
75
8
0
28 Dec 2021
Parallelized and Randomized Adversarial Imitation Learning for Safety-Critical Self-Driving Vehicles
Won Joon Yun
Myungjae Shin
Soyi Jung
S. Kwon
Joongheon Kim
66
6
0
26 Dec 2021
On the Unreasonable Efficiency of State Space Clustering in Personalization Tasks
Anton Dereventsov
R. Vatsavai
Clayton Webster
78
5
0
24 Dec 2021
Curriculum Learning for Safe Mapless Navigation
Luca Marzari
Davide Corsi
Enrico Marchesini
Alessandro Farinelli
79
15
0
23 Dec 2021
Alpha-Mini: Minichess Agent with Deep Reinforcement Learning
Michael Sun
R. Tan
LLMAG
24
0
0
22 Dec 2021
Soft Actor-Critic with Cross-Entropy Policy Optimization
Zhenyang Shi
Surya Pal Singh
50
5
0
21 Dec 2021
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu
Yunchang Yang
Han Zhong
Liwei Wang
S. Du
Jiantao Jiao
127
14
0
21 Dec 2021
Adaptive Incentive Design with Multi-Agent Meta-Gradient Reinforcement Learning
Jiachen Yang
Ethan Wang
Rakshit S. Trivedi
T. Zhao
H. Zha
78
20
0
20 Dec 2021
Differentially Private Regret Minimization in Episodic Markov Decision Processes
Sayak Ray Chowdhury
Xingyu Zhou
85
22
0
20 Dec 2021
Masked Deep Q-Recommender for Effective Question Scheduling
Keunhyung Chung
D. Kim
Sangheon Lee
Guik Jung
AI4Ed
20
0
0
19 Dec 2021
Integrated Guidance and Control for Lunar Landing using a Stabilized Seeker
B. Gaudet
R. Furfaro
41
5
0
16 Dec 2021
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning
Yecheng Jason Ma
Andrew Shen
Osbert Bastani
Dinesh Jayaraman
61
25
0
14 Dec 2021