Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 3,098 papers shown
Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking
Desong Du
Naiming Qi
Yanfang Liu
Wei Pan
14
0
0
07 Nov 2023
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
Kun Lei
Zhengmao He
Chenhao Lu
Kaizhe Hu
Yang Gao
Huazhe Xu
OffRL
OnRL
67
13
0
06 Nov 2023
Active Reasoning in an Open-World Environment
Manjie Xu
Guangyuan Jiang
Weihan Liang
Chi Zhang
Yixin Zhu
LLMAG
LRM
21
10
0
03 Nov 2023
Efficient Symbolic Policy Learning with Differentiable Symbolic Expression
Jiaming Guo
Rui Zhang
Shaohui Peng
Qi Yi
Xingui Hu
...
Zidong Du
Xishan Zhang
Ling Li
Qi Guo
Yunji Chen
OffRL
30
5
0
02 Nov 2023
SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization
Jaafar Mhamed
Shangding Gu
24
0
0
01 Nov 2023
A Multi-Agent Reinforcement Learning Framework for Evaluating the U.S. Ending the HIV Epidemic Plan
Dinesh Sharma
Ankit Shah
Chaitra Gopalappa
40
0
0
01 Nov 2023
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity
Joey Hong
Anca Dragan
Sergey Levine
OffRL
38
5
0
31 Oct 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
21
0
0
31 Oct 2023
Amoeba: Circumventing ML-supported Network Censorship via Adversarial Reinforcement Learning
Haoyu Liu
A. Diallo
P. Patras
AAML
16
3
0
31 Oct 2023
Dropout Strategy in Reinforcement Learning: Limiting the Surrogate Objective Variance in Policy Optimization Methods
Zhengpeng Xie
Changdong Yu
Weizheng Qiao
29
1
0
31 Oct 2023
Network Contention-Aware Cluster Scheduling with Reinforcement Learning
Junyeol Ryu
Jeongyoon Eo
GNN
17
0
0
31 Oct 2023
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics
Michal Nauman
Marek Cygan
40
1
0
30 Oct 2023
Robot Control based on Motor Primitives -- A Comparison of Two Approaches
Moses C. Nah
Johannes Lachner
Neville Hogan
26
3
0
28 Oct 2023
Online Decision Mediation
Daniel Jarrett
Alihan Huyuk
M. Schaar
35
2
0
28 Oct 2023
Deep Reinforcement Learning for Weapons to Targets Assignment in a Hypersonic strike
B. Gaudet
K. Drozd
R. Furfaro
11
1
0
27 Oct 2023
Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks
Ryan Sullivan
Akarsh Kumar
Shengyi Huang
John P. Dickerson
Joseph Suárez
OffRL
24
5
0
26 Oct 2023
DSAC-C: Constrained Maximum Entropy for Robust Discrete Soft-Actor Critic
Dexter Neo
Tsuhan Chen
30
1
0
26 Oct 2023
Fractal Landscapes in Policy Optimization
Tao Wang
Sylvia Herbert
Sicun Gao
34
5
0
24 Oct 2023
A Doubly Robust Approach to Sparse Reinforcement Learning
Wonyoung Hedge Kim
Garud Iyengar
A. Zeevi
25
3
0
23 Oct 2023
Policy Gradient with Kernel Quadrature
Satoshi Hayakawa
Tetsuro Morimura
OffRL
BDL
32
0
0
23 Oct 2023
Absolute Policy Optimization
Weiye Zhao
Feihan Li
Yifan Sun
Rui Chen
Tianhao Wei
Changliu Liu
52
4
0
20 Oct 2023
MARVEL: Multi-Agent Reinforcement-Learning for Large-Scale Variable Speed Limits
Yuhang Zhang
Marcos Quiñones-Grueiro
Zhiyao Zhang
Yanbing Wang
William Barbour
Gautam Biswas
Dan Work
38
5
0
18 Oct 2023
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
Washim Uddin Mondal
Vaneet Aggarwal
38
9
0
18 Oct 2023
Quantifying Assistive Robustness Via the Natural-Adversarial Frontier
Jerry Zhi-Yang He
Zackory M. Erickson
Daniel S. Brown
Anca Dragan
AAML
29
0
0
16 Oct 2023
End-to-end Offline Reinforcement Learning for Glycemia Control
Tristan Beolet
Alice Adenis
E. Huneker
Maxime Louis
OffRL
38
1
0
16 Oct 2023
DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
Fengbo Lan
Shengjie Wang
Yunzhe Zhang
Haotian Xu
Oluwatosin Oseni
Yang Gao
Tao Zhang
47
5
0
13 Oct 2023
Discovering Fatigued Movements for Virtual Character Animation
N. Cheema
Rui Xu
Nam Hee Kim
Perttu Hämäläinen
Vladislav Golyanik
Marc Habermann
Christian Theobalt
Philipp Slusallek
32
4
0
12 Oct 2023
Discerning Temporal Difference Learning
Jianfei Ma
15
0
0
12 Oct 2023
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples
Hao Sun
Alihan Huyuk
Daniel Jarrett
M. Schaar
OffRL
39
7
0
11 Oct 2023
Deep Reinforcement Learning for Autonomous Cyber Operations: A Survey
Gregory Palmer
Chris Parry
Daniel J.B. Harrold
Chris Willis
AI4CE
23
1
0
11 Oct 2023
Diversity for Contingency: Learning Diverse Behaviors for Efficient Adaptation and Transfer
Finn Rietz
J. A. Stork
33
0
0
11 Oct 2023
Imitation Learning from Observation with Automatic Discount Scheduling
Yuyang Liu
Weijun Dong
Yingdong Hu
Chuan Wen
Zhao-Heng Yin
Chongjie Zhang
Yang Gao
30
6
0
11 Oct 2023
Imitation Learning from Purified Demonstration
Yunke Wang
Minjing Dong
Bo Du
Chang Xu
31
1
0
11 Oct 2023
Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond
Hao Sun
OffRL
34
21
0
09 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
32
5
0
09 Oct 2023
Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates
Guangchen Lan
Han Wang
James Anderson
Christopher G. Brinton
Vaneet Aggarwal
FedML
32
27
0
09 Oct 2023
Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks
Andrew Starnes
Anton Dereventsov
Clayton Webster
24
0
0
09 Oct 2023
Distributional Soft Actor-Critic with Three Refinements
Jingliang Duan
Wenxuan Wang
Liming Xiao
Jiaxin Gao
Shengbo Eben Li
Chang Liu
Ya-Qin Zhang
Bo Cheng
Keqiang Li
OODD
OffRL
27
2
0
09 Oct 2023
FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility
Lang Feng
Dong Xing
Junru Zhang
Gang Pan
34
1
0
08 Oct 2023
Safe Deep Policy Adaptation
Wenli Xiao
Tairan He
John M. Dolan
Guanya Shi
34
9
0
08 Oct 2023
Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning
Jiayu Chen
Zelai Xu
Yunfei Li
Chao Yu
Jiaming Song
Huazhong Yang
Fei Fang
Yu Wang
Yi Wu
34
4
0
07 Oct 2023
Terrain-Aware Quadrupedal Locomotion via Reinforcement Learning
Hao-bin Shi
Qing Zhu
Lei Han
Wanchao Chi
Tingguang Li
Max Q.-H. Meng
40
3
0
07 Oct 2023
Deep Model Predictive Optimization
Jacob Sacks
Rwik Rana
Kevin Huang
Alex Spitzer
Guanya Shi
Byron Boots
48
7
0
06 Oct 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
50
48
0
06 Oct 2023
Reinforcement Learning with Fast and Forgetful Memory
Steven D. Morad
Ryan Kortvelesy
Stephan Liwicki
Amanda Prorok
OffRL
29
4
0
06 Oct 2023
TRAM: Bridging Trust Regions and Sharpness Aware Minimization
Tom Sherborne
Naomi Saphra
Pradeep Dasigi
Hao Peng
32
4
0
05 Oct 2023
Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards
Zhaorun Chen
Zhuokai Zhao
Tairan He
Binhao Chen
Xuhao Zhao
Liang Gong
Chengliang Liu
29
3
0
05 Oct 2023
Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms
Akifumi Wachi
Wataru Hashimoto
Xun Shen
Kazumune Hashimoto
22
9
0
05 Oct 2023
Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study
Fouzi Boukhalfa
Réda Alami
Mastane Achab
Eric Moulines
M. Bennis
11
1
0
04 Oct 2023
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Sara Klein
Simon Weissmann
Leif Döring
29
7
0
04 Oct 2023