ResearchTrend.AI
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
arXiv:1502.05477

Papers citing "Trust Region Policy Optimization"

50 / 2,009 papers shown
Self-Adaptive Driving in Nonstationary Environments through Conjectural Online Lookahead Adaptation
Tao Li
Haozhe Lei
Quanyan Zhu
124
11
0
06 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
120
250
0
03 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
134
38
0
03 Oct 2022
Safe Reinforcement Learning From Pixels Using a Stochastic Latent Representation
Yannick Hogewind
T. D. Simão
Tal Kachman
N. Jansen
69
10
0
02 Oct 2022
Policy Gradients for Probabilistic Constrained Reinforcement Learning
Weiqin Chen
D. Subramanian
Santiago Paternain
95
6
0
02 Oct 2022
Midas: A Multi-Joint Robotics Simulator with Intersection-Free Frictional Contact
Yunuo Chen
Minchen Li
Wenlong Lu
Chuyuan Fu
Chenfanfu Jiang
81
4
0
30 Sep 2022
Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization
Carlo Alfano
Patrick Rebeschini
108
14
0
30 Sep 2022
Reinforcement Learning Algorithms: An Overview and Classification
Fadi AlMahamid
Katarina Grolinger
39
45
0
29 Sep 2022
Online Policy Optimization for Robust MDP
Jing Dong
Jingwei Li
Baoxiang Wang
J.N. Zhang
OffRL
99
15
0
28 Sep 2022
More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization
Jiangxing Wang
Deheng Ye
Zongqing Lu
OffRL
100
19
0
26 Sep 2022
Quantification before Selection: Active Dynamics Preference for Robust Reinforcement Learning
Kang Xu
Yan Ma
Wei Li
97
0
0
23 Sep 2022
A Unified Perspective on Natural Gradient Variational Inference with Gaussian Mixture Models
Oleg Arenz
Philipp Dahlinger
Zihan Ye
Michael Volpp
Gerhard Neumann
126
17
0
23 Sep 2022
Model-Free Reinforcement Learning for Asset Allocation
Adebayo Oshingbesan
Eniola Ajiboye
Peruth Kamashazi
Timothy Mbaka
OffRL
59
1
0
21 Sep 2022
Revisiting Discrete Soft Actor-Critic
Haibin Zhou
Zichuan Lin
Junyou Li
Qiang Fu
Wei Yang
Deheng Ye
112
13
0
21 Sep 2022
Deep Generalized Schrödinger Bridge
Guan-Horng Liu
T. Chen
Oswin So
Evangelos A. Theodorou
OT
AI4CE
100
37
0
20 Sep 2022
Robust Reinforcement Learning Algorithm for Vision-based Ship Landing of UAVs
Vishnu Saj
Bochan Lee
D. Kalathil
Moble Benedict
63
5
0
17 Sep 2022
A Robust and Constrained Multi-Agent Reinforcement Learning Electric Vehicle Rebalancing Method in AMoD Systems
Sihong He
Yue Wang
Shuo Han
Shaofeng Zou
Fei Miao
68
12
0
17 Sep 2022
Trustworthy Reinforcement Learning Against Intrinsic Vulnerabilities: Robustness, Safety, and Generalizability
Mengdi Xu
Zuxin Liu
Peide Huang
Wenhao Ding
Zhepeng Cen
Yue Liu
Ding Zhao
175
47
0
16 Sep 2022
Reinforcement Learning-Based Cooperative P2P Power Trading between DC Nanogrid Clusters with Wind and PV Energy Resources
Sangkeum Lee
S. Nengroo
Hojun Jin
Taewook Heo
Y. Doh
Chun-leung Lee
Dongsoo Har
40
2
0
16 Sep 2022
Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes
Hao Fei
Hongyao Tang
Jianye Hao
Yan Zheng
OffRL
81
1
0
16 Sep 2022
Human-level Atari 200x faster
Steven Kapturowski
Victor Campos
Ray Jiang
Nemanja Rakićević
Hado van Hasselt
Charles Blundell
Adria Puigdomenech Badia
OffRL
99
30
0
15 Sep 2022
Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and Learning Mean-Field Control
Kai Cui
Mengguang Li
Christian Fabian
Heinz Koeppl
AI4CE
112
5
0
15 Sep 2022
Multi-Objective Policy Gradients with Topological Constraints
K. H. Wray
Stas Tiomkin
Mykel J. Kochenderfer
Pieter Abbeel
47
2
0
15 Sep 2022
Constrained Update Projection Approach to Safe Policy Optimization
Long Yang
Jiaming Ji
Juntao Dai
Linrui Zhang
Binbin Zhou
Pengfei Li
Yaodong Yang
Gang Pan
114
48
0
15 Sep 2022
Towards self-attention based visual navigation in the real world
Jaime Ruiz-Serra
Jack White
Stephen M. Petrie
T. Kameneva
C. McCarthy
72
1
0
15 Sep 2022
Model-based Reinforcement Learning with Multi-step Plan Value Estimation
Hao-Chu Lin
Yihao Sun
Jiajin Zhang
Yang Yu
OffRL
80
7
0
12 Sep 2022
Gradient Descent Temporal Difference-difference Learning
Rong Zhu
James M. Murray
OffRL
68
1
0
10 Sep 2022
Natural Policy Gradients In Reinforcement Learning Explained
W. V. Heeswijk
32
2
0
05 Sep 2022
Variational Inference for Model-Free and Model-Based Reinforcement Learning
Felix Leibfried
OffRL
78
0
0
04 Sep 2022
Neural Approaches to Co-Optimization in Robotics
Charles B. Schaff
111
1
0
01 Sep 2022
DRL Enabled Coverage and Capacity Optimization in STAR-RIS Assisted Networks
Xinyu Gao
Wenqiang Yi
Yuanwei Liu
Jianhua Zhang
Ping Zhang
26
5
0
01 Sep 2022
Dynamics-Adaptive Continual Reinforcement Learning via Progressive Contextualization
Tiantian Zhang
Zichuan Lin
Yuxing Wang
Deheng Ye
Qiang Fu
Wei Yang
Xueqian Wang
Bin Liang
Bo Yuan
Xiu Li
CLL
108
11
0
01 Sep 2022
Deep Anomaly Detection and Search via Reinforcement Learning
Chao Chen
Dawei Wang
Feng Mao
Zongzhang Zhang
Yang Yu
50
0
0
31 Aug 2022
Normality-Guided Distributional Reinforcement Learning for Continuous Control
Ju-Seung Byun
Andrew Perrault
OffRL
116
0
0
28 Aug 2022
Dynamic Regret of Online Markov Decision Processes
Peng Zhao
Longfei Li
Zhi Zhou
OffRL
103
17
0
26 Aug 2022
A Comparison of Reinforcement Learning Frameworks for Software Testing Tasks
Paulina Stevia Nouwou Mindom
Amin Nikanjam
Foutse Khomh
OffRL
69
11
0
25 Aug 2022
Oracle-free Reinforcement Learning in Mean-Field Games along a Single Sample Path
Muhammad Aneeq uz Zaman
Alec Koppel
Sujay Bhatt
Tamer Basar
66
25
0
24 Aug 2022
Entropy Enhanced Multi-Agent Coordination Based on Hierarchical Graph Learning for Continuous Action Space
Yining Chen
Ke Wang
Guang-hua Song
Xiaohong Jiang
56
3
0
23 Aug 2022
Improving Sample Efficiency in Evolutionary RL Using Off-Policy Ranking
R. EshwarS
Shishir Kolathaya
Gugan Thoppe
53
0
0
22 Aug 2022
Event-Triggered Model Predictive Control with Deep Reinforcement Learning for Autonomous Driving
Fengying Dang
Dong Chen
J. Chen
Zhaojian Li
59
29
0
22 Aug 2022
Unified Policy Optimization for Continuous-action Reinforcement Learning in Non-stationary Tasks and Games
Rongjun Qin
Fan Luo
Hong Qian
Yang Yu
69
2
0
19 Aug 2022
Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach
Yining Wang
Mingzhe Chen
Tao Luo
Walid Saad
Dusit Niyato
H. Vincent Poor
Shuguang Cui
73
138
0
17 Aug 2022
Path Planning of Cleaning Robot with Reinforcement Learning
Woohyeon Moon
Bumgeun Park
S. Nengroo
Taeyoung Kim
Dongsoo Har
60
18
0
17 Aug 2022
Maximum Correntropy Value Decomposition for Multi-agent Deep Reinforcement Learning
Kai Liu
Tianxian Zhang
L. Kong
78
0
0
07 Aug 2022
Backward Imitation and Forward Reinforcement Learning via Bi-directional Model Rollouts
Yuxin Pan
Fangzhen Lin
OffRL
64
3
0
04 Aug 2022
Bayesian regularization of empirical MDPs
Samarth Gupta
Daniel N. Hill
Lexing Ying
Inderjit Dhillon
OffRL
51
0
0
03 Aug 2022
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Fivos Kalogiannis
Ioannis Anagnostides
Ioannis Panageas
Emmanouil-Vasileios Vlatakis-Gkaragkounis
Vaggos Chatziafratis
S. Stavroulakis
73
13
0
03 Aug 2022
Heterogeneous-Agent Mirror Learning: A Continuum of Solutions to Cooperative MARL
J. Kuba
Xidong Feng
Shiyao Ding
Hao Dong
Jun Wang
Yaodong Yang
79
21
0
02 Aug 2022
DashBot: Insight-Driven Dashboard Generation Based on Deep Reinforcement Learning
Dazhen Deng
Aoyu Wu
Huamin Qu
Yingcai Wu
105
37
0
02 Aug 2022
Implicit Two-Tower Policies
Yunfan Zhao
Qingkai Pan
K. Choromanski
Deepali Jain
Vikas Sindhwani
OffRL
131
3
0
02 Aug 2022