Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 2,009 papers shown
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
85
7
0
08 Dec 2022
Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning
Cesare Caputo
Michel-Alexandre Cardin
Pudong Ge
Fei Teng
A. Korre
Ehecatl Antonio del Rio Chanona
52
18
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
69
1
0
07 Dec 2022
Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo
Jingyue Gao
Zheng Wu
Chengming Shi
Jianyu Chen
OffRL
90
5
0
03 Dec 2022
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
Xiao Yu
Qingyang Wu
Kun Qian
Zhou Yu
OffRL
72
12
0
30 Nov 2022
Quantile Constrained Reinforcement Learning: A Reinforcement Learning Framework Constraining Outage Probability
Whiyoung Jung
Myungsik Cho
Jongeui Park
Young-Jin Sung
85
4
0
28 Nov 2022
A Critical Review of Traffic Signal Control and A Novel Unified View of Reinforcement Learning and Model Predictive Control Approaches for Adaptive Traffic Signal Control
Xiaoyu Wang
Scott Sanner
Baher Abdulhai
69
5
0
26 Nov 2022
Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Tingting Zhao
Ying Wang
Weidong Sun
Yarui Chen
Gang Niu
Masashi Sugiyama
68
1
0
23 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
88
14
0
21 Nov 2022
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Jianchao Tan
Tamer Basar
W. Yin
117
110
0
15 Nov 2022
Redeeming Intrinsic Rewards via Constrained Optimization
Eric Chen
Zhang-Wei Hong
Joni Pajarinen
Pulkit Agrawal
OnRL
111
27
0
14 Nov 2022
Out-of-Dynamics Imitation Learning from Multimodal Demonstrations
Yiwen Qiu
Jialong Wu
Zhangjie Cao
Mingsheng Long
63
4
0
13 Nov 2022
A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges
Yunpeng Qing
Shunyu Liu
Jie Song
Huiqiong Wang
Mingli Song
XAI
93
1
0
12 Nov 2022
Job Scheduling in Datacenters using Constraint Controlled RL
V. Venkataswamy
43
1
0
10 Nov 2022
Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification
Takumi Tanabe
Reimi Sato
Kazuto Fukuchi
Jun Sakuma
Youhei Akimoto
OffRL
75
11
0
07 Nov 2022
Mixline: A Hybrid Reinforcement Learning Framework for Long-horizon Bimanual Coffee Stirring Task
Zheng Sun
Zhiqi Wang
Junjia Liu
Miao Li
Fei Chen
60
4
0
04 Nov 2022
A Survey on Reinforcement Learning in Aviation Applications
Pouria Razzaghi
Amin Tabrizian
Wei Guo
Shulu Chen
Abenezer Taye
Ellis E. Thompson
Alexis Bregeon
Ali Baheri
Peng Wei
OffRL
57
56
0
03 Nov 2022
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints
Anika Singh
Aviral Kumar
Q. Vuong
Yevgen Chebotar
Sergey Levine
OffRL
62
14
0
02 Nov 2022
Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning
Riashat Islam
Hongyu Zang
Anirudh Goyal
Alex Lamb
Kenji Kawaguchi
Xin-hui Li
Romain Laroche
Yoshua Bengio
Rémi Tachet des Combes
OffRL, AI4CE
108
11
0
01 Nov 2022
Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games
Viet The Bui
Tien Mai
T. Nguyen
AAML
115
5
0
30 Oct 2022
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning
Lisheng Wu
Ke Chen
81
4
0
28 Oct 2022
Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision
Ashvin Nair
Brian Zhu
Gokul Narayanan
Eugen Solowjow
Sergey Levine
OffRL, OnRL
136
16
0
27 Oct 2022
Characterising the Robustness of Reinforcement Learning for Continuous Control using Disturbance Injection
Catherine R. Glossop
Jacopo Panerati
A. Krishnan
Zhaocong Yuan
Angela P. Schoellig
81
6
0
27 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRL, AI4TS
135
27
0
24 Oct 2022
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence
S. Pattathil
Jianchao Tan
Asuman Ozdaglar
99
14
0
23 Oct 2022
Solving Continuous Control via Q-learning
Tim Seyde
Peter Werner
Wilko Schwarting
Igor Gilitschenski
Martin Riedmiller
Daniela Rus
Markus Wulfmeier
OffRL, LRM
92
23
0
22 Oct 2022
Probing Transfer in Deep Reinforcement Learning without Task Engineering
Andrei A. Rusu
Sebastian Flennerhag
Dushyant Rao
Razvan Pascanu
R. Hadsell
74
6
0
22 Oct 2022
Deep Reinforcement Learning for Stabilization of Large-scale Probabilistic Boolean Networks
S. Moschoyiannis
Evangelos Chatzaroulas
Vytenis Sliogeris
Yuhu Wu
BDL, OffRL, AI4CE
62
8
0
21 Oct 2022
RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control
Yanfei Xiang
Xin Wang
Shu Hu
Bin Zhu
Xiaomeng Huang
Xi Wu
Siwei Lyu
SSL
102
5
0
20 Oct 2022
Safe Policy Improvement in Constrained Markov Decision Processes
Luigi Berducci
Radu Grosu
OffRL
111
2
0
20 Oct 2022
Task Phasing: Automated Curriculum Learning from Demonstrations
Vaibhav Bajaj
Guni Sharon
Peter Stone
77
8
0
20 Oct 2022
Topology Optimization via Machine Learning and Deep Learning: A Review
S. Shin
Dongju Shin
Namwoo Kang
AI4CE
88
69
0
19 Oct 2022
Proximal Learning With Opponent-Learning Awareness
S. Zhao
Chris Xiaoxuan Lu
Roger C. Grosse
Jakob N. Foerster
83
21
0
18 Oct 2022
Entropy Regularized Reinforcement Learning with Cascading Networks
R. D. Vecchia
Alena Shilova
Philippe Preux
R. Akrour
34
2
0
16 Oct 2022
When to Update Your Model: Constrained Model-based Reinforcement Learning
Tianying Ji
Yu-Juan Luo
Gang Hua
Mingxuan Jing
Fengxiang He
Wen-bing Huang
84
19
0
15 Oct 2022
A Policy-Guided Imitation Approach for Offline Reinforcement Learning
Haoran Xu
Li Jiang
Jianxiong Li
Xianyuan Zhan
OffRL
157
64
0
15 Oct 2022
DyFEn: Agent-Based Fee Setting in Payment Channel Networks
Kian Asgari
Aida Mohammadian
M. Tefagh
31
7
0
15 Oct 2022
Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion
Lev Grossman
Brian Plancher
MQ
69
4
0
14 Oct 2022
A Scalable Finite Difference Method for Deep Reinforcement Learning
Matthew Allen
John C. Raisbeck
Hakho Lee
61
0
0
14 Oct 2022
Mutual Information Regularized Offline Reinforcement Learning
Xiao Ma
Bingyi Kang
Zhongwen Xu
Min Lin
Shuicheng Yan
OffRL
101
8
0
14 Oct 2022
A Concise Introduction to Reinforcement Learning in Robotics
Akash Nagaraj
Mukund Sood
B. Patil
40
22
0
13 Oct 2022
Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations
N. Vadori
Leo Ardon
Sumitra Ganesh
Thomas Spooner
Selim Amrouni
Jared Vann
Mengda Xu
Zeyu Zheng
T. Balch
Manuela Veloso
75
17
0
13 Oct 2022
Observed Adversaries in Deep Reinforcement Learning
Eugene Lim
Harold Soh
AAML
26
0
0
13 Oct 2022
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song
Yi Zhou
Ayush Sekhari
J. Andrew Bagnell
A. Krishnamurthy
Wen Sun
OffRL, OnRL
97
105
0
13 Oct 2022
Real World Offline Reinforcement Learning with Realistic Data Source
G. Zhou
Liyiming Ke
S. Srinivasa
Abhi Gupta
Aravind Rajeswaran
Vikash Kumar
OffRL
92
23
0
12 Oct 2022
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning
Yongyuan Liang
Yanchao Sun
Ruijie Zheng
Furong Huang
OOD, AAML, OffRL
48
51
0
12 Oct 2022
Discovered Policy Optimisation
Chris Xiaoxuan Lu
J. Kuba
Alistair Letcher
Luke Metz
Christian Schroeder de Witt
Jakob N. Foerster
OffRL
111
79
0
11 Oct 2022
DHRL: A Graph-Based Approach for Long-Horizon and Sparse Hierarchical Reinforcement Learning
Seungjae Lee
Jigang Kim
Inkyu Jang
H. J. Kim
OffRL
105
13
0
11 Oct 2022
Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies
Bin Hu
Jianchao Tan
Na Li
M. Mesbahi
Maryam Fazel
Tamer Başar
167
27
0
10 Oct 2022
Traffic-Aware Autonomous Driving with Differentiable Traffic Simulation
L. Zheng
Sanghyun Son
Ming-Chyuan Lin
129
3
0
07 Oct 2022