ResearchTrend.AI
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 1,182 papers shown
Bayesian regularization of empirical MDPs
Samarth Gupta
Daniel N. Hill
Lexing Ying
Inderjit Dhillon
OffRL
29
0
0
03 Aug 2022
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Fivos Kalogiannis
Ioannis Anagnostides
Ioannis Panageas
Emmanouil-Vasileios Vlatakis-Gkaragkounis
Vaggos Chatziafratis
S. Stavroulakis
39
13
0
03 Aug 2022
Heterogeneous-Agent Mirror Learning: A Continuum of Solutions to Cooperative MARL
J. Kuba
Xidong Feng
Shiyao Ding
Hao Dong
Jun Wang
Yaodong Yang
26
16
0
02 Aug 2022
DashBot: Insight-Driven Dashboard Generation Based on Deep Reinforcement Learning
Dazhen Deng
Aoyu Wu
Huamin Qu
Yingcai Wu
42
35
0
02 Aug 2022
Implicit Two-Tower Policies
Yunfan Zhao
Qingkai Pan
K. Choromanski
Deepali Jain
Vikas Sindhwani
OffRL
31
3
0
02 Aug 2022
Unified Automatic Control of Vehicular Systems with Reinforcement Learning
Zhongxia Yan
Abdul Rahman Kreidieh
Eugene Vinitsky
Alexandre M. Bayen
Cathy Wu
AI4CE
17
41
0
30 Jul 2022
Improved Policy Optimization for Online Imitation Learning
J. Lavington
Sharan Vaswani
Mark W. Schmidt
OffRL
21
6
0
29 Jul 2022
JDRec: Practical Actor-Critic Framework for Online Combinatorial Recommender System
Xin Zhao
Zhiwei Fang
Yuchen Guo
Jie He
Wenlong Chen
Changping Peng
26
0
0
27 Jul 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
30
11
0
25 Jul 2022
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
Haoran Xu
Xianyuan Zhan
Honglei Yin
Huiling Qin
OffRL
26
66
0
20 Jul 2022
Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction
Chia-Chi Chuang
Donglin Yang
Chuan Wen
Yang Gao
SSL
23
12
0
20 Jul 2022
Bayesian Generational Population-Based Training
Xingchen Wan
Cong Lu
Jack Parker-Holder
Philip J. Ball
Vu-Linh Nguyen
Binxin Ru
Michael A. Osborne
OffRL
31
15
0
19 Jul 2022
Minimum Description Length Control
Theodore H. Moskovitz
Ta-Chu Kao
M. Sahani
M. Botvinick
26
1
0
17 Jul 2022
Learning robust marking policies for adaptive mesh refinement
A. Gillette
B. Keith
S. Petrides
33
11
0
13 Jul 2022
Deep Learning Approaches to Grasp Synthesis: A Review
Rhys Newbury
Morris Gu
Lachlan Chumbley
Arsalan Mousavian
Clemens Eppner
...
A. Morales
Tamim Asfour
Danica Kragic
Dieter Fox
Akansel Cosgun
40
162
0
06 Jul 2022
Learning fast and agile quadrupedal locomotion over complex terrain
Xu Chang
Zhitong Zhang
Honglei An
Hongxu Ma
Qing Wei
27
0
0
02 Jul 2022
Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings
Jorge Armando Mendez Mendez
Alborz Geramifard
Mohammad Ghavamzadeh
Bing-Quan Liu
OffRL
27
6
0
01 Jul 2022
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
32
2
0
28 Jun 2022
Auto-Encoding Adversarial Imitation Learning
Kaifeng Zhang
Rui Zhao
Ziming Zhang
Yang Gao
19
1
0
22 Jun 2022
Imitate then Transcend: Multi-Agent Optimal Execution with Dual-Window Denoise PPO
Jin Fang
Jiacheng Weng
Yi Xiang
Xinwen Zhang
OffRL
29
2
0
21 Jun 2022
Model-Based Imitation Learning Using Entropy Regularization of Model and Policy
E. Uchibe
23
3
0
21 Jun 2022
Constrained Reinforcement Learning for Robotics via Scenario-Based Programming
Davide Corsi
Raz Yerushalmi
Guy Amir
Alessandro Farinelli
D. Harel
Guy Katz
27
19
0
20 Jun 2022
A Survey on Model-based Reinforcement Learning
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
50
101
0
19 Jun 2022
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Yuanpei Chen
Tianhao Wu
Shengjie Wang
Xidong Feng
Jiechuan Jiang
...
Yiran Geng
Hao Dong
Zongqing Lu
Song-Chun Zhu
Yaodong Yang
OffRL
46
109
0
17 Jun 2022
A Search-Based Testing Approach for Deep Reinforcement Learning Agents
Amirhossein Zolfagharian
Manel Abdellatif
Lionel C. Briand
M. Bagherzadeh
Ramesh S
45
27
0
15 Jun 2022
Transformers are Meta-Reinforcement Learners
Luckeciano C. Melo
OffRL
41
50
0
14 Jun 2022
Relative Policy-Transition Optimization for Fast Policy Transfer
Jiawei Xu
Cheng Zhou
Yizheng Zhang
Zhengyou Zhang
Lei Han
21
0
0
13 Jun 2022
Safe-FinRL: A Low Bias and Variance Deep Reinforcement Learning Implementation for High-Freq Stock Trading
Zitao Song
Xuyang Jin
Chenliang Li
OffRL
AIFin
29
1
0
13 Jun 2022
Rare event failure test case generation in Learning-Enabled-Controllers
H. Vardhan
J. Sztipanovits
19
20
0
11 Jun 2022
Multifidelity Reinforcement Learning with Control Variates
Sami Khairy
Prasanna Balaprakash
OffRL
36
5
0
10 Jun 2022
Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk
Chengyang Ying
Xinning Zhou
Hang Su
Dong Yan
Ning Chen
Jun Zhu
24
41
0
09 Jun 2022
Constrained Imitation Learning for a Flapping Wing Unmanned Aerial Vehicle
T. K C
Taeyoung Lee
18
2
0
08 Jun 2022
Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance
Jakob J. Hollenstein
Sayantan Auddy
Matteo Saveriano
Erwan Renaudo
J. Piater
41
17
0
08 Jun 2022
Real2Sim or Sim2Real: Robotics Visual Insertion using Deep Reinforcement Learning and Real2Sim Policy Adaptation
Yiwen Chen
Xue-Yong Li
Sheng Guo
Xiang Yao Ng
Marcelo H. Ang Jr
16
4
0
06 Jun 2022
Robust Adversarial Attacks Detection based on Explainable Deep Reinforcement Learning For UAV Guidance and Planning
Tom Hickling
Nabil Aouf
P. Spencer
AAML
17
50
0
06 Jun 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Qinghua Liu
Haiquan Wang
Caiming Xiong
Na Li
Yu Bai
27
26
0
06 Jun 2022
Algorithm for Constrained Markov Decision Process with Linear Convergence
E. Gladin
Maksim Lavrik-Karmazin
K. Zainullina
Varvara Rudenko
Alexander V. Gasnikov
Martin Takáč
33
6
0
03 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
25
51
0
01 Jun 2022
Control of Two-way Coupled Fluid Systems with Differentiable Solvers
B. Ramos
Felix Trost
Nils Thuerey
AI4CE
17
5
0
01 Jun 2022
Learning to Use Chopsticks in Diverse Gripping Styles
Zeshi Yang
KangKang Yin
Libin Liu
30
29
0
28 May 2022
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning
Xinran Liang
Katherine Shu
Kimin Lee
Pieter Abbeel
21
58
0
24 May 2022
Regret-Aware Black-Box Optimization with Natural Gradients, Trust-Regions and Entropy Control
Maximilian Hüttenrauch
Gerhard Neumann
27
1
0
24 May 2022
Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate Feature Compression and Edge Learning
Zhiwei Hao
Guanyu Xu
Yong Luo
Han Hu
Jianping An
Shiwen Mao
32
22
0
24 May 2022
Penalized Proximal Policy Optimization for Safe Reinforcement Learning
Linrui Zhang
Li Shen
Long Yang
Shi-Yong Chen
Bo Yuan
Xueqian Wang
Dacheng Tao
13
62
0
24 May 2022
Efficient Reinforcement Learning from Demonstration Using Local Ensemble and Reparameterization with Split and Merge of Expert Policies
Yu Wang
Fang Liu
29
0
0
23 May 2022
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Qingfeng Lan
Yangchen Pan
Jun Luo
A. R. Mahmood
OffRL
29
7
0
22 May 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL
AI4TS
117
241
0
20 May 2022
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure
Xing Chen
Dongcui Diao
Hechang Chen
Hengshuai Yao
Haiyin Piao
Zhixiao Sun
Zhiwei Yang
Randy Goebel
Bei Jiang
Yi-Ju Chang
OffRL
41
8
0
20 May 2022
Qualitative Differences Between Evolutionary Strategies and Reinforcement Learning Methods for Control of Autonomous Agents
Nicola Milano
S. Nolfi
20
0
0
16 May 2022
Cliff Diving: Exploring Reward Surfaces in Reinforcement Learning Environments
Ryan Sullivan
J. K. Terry
Benjamin Black
John P. Dickerson
24
8
0
14 May 2022