ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1502.05477
  4. Cited By
Trust Region Policy Optimization

Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
ArXivPDFHTML

Papers citing "Trust Region Policy Optimization"

50 / 1,112 papers shown
Title
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific
  Learning Rate
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate
Fan Luo
Zuolin Tu
Zefang Huang
Yang Yu
OffRL
38
0
0
24 May 2024
Bayesian Optimization of Functions over Node Subsets in Graphs
Bayesian Optimization of Functions over Node Subsets in Graphs
Huidong Liang
Xingchen Wan
Xiaowen Dong
50
1
0
24 May 2024
MeMo: Meaningful, Modular Controllers via Noise Injection
MeMo: Meaningful, Modular Controllers via Noise Injection
Megan Tjandrasuwita
Jie Xu
Armando Solar-Lezama
Wojciech Matusik
21
0
0
24 May 2024
Interpretable and Editable Programmatic Tree Policies for Reinforcement
  Learning
Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning
Hector Kohler
Quentin Delfosse
R. Akrour
Kristian Kersting
Philippe Preux
62
14
0
23 May 2024
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Yang Zhang
Shixin Yang
Chenjia Bai
Fei Wu
Xiu Li
Zhen Wang
Xuelong Li
LLMAG
36
25
0
23 May 2024
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Junghyuk Yeom
Yonghyeon Jo
Jungmo Kim
Sanghyeon Lee
Seungyul Han
OffRL
40
2
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
43
0
23 May 2024
Towards Robust Policy: Enhancing Offline Reinforcement Learning with
  Adversarial Attacks and Defenses
Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses
Thanh Nguyen
Tung M. Luu
Tri Ton
Chang D. Yoo
OffRL
AAML
34
0
0
18 May 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
31
3
0
13 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with
  Deep Neural Networks
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
37
0
0
07 May 2024
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
S. Reifenstein
T. Leleu
Yoshihisa Yamamoto
48
1
0
02 May 2024
Adversarial Attacks on Reinforcement Learning Agents for Command and
  Control
Adversarial Attacks on Reinforcement Learning Agents for Command and Control
Ahaan Dabholkar
James Z. Hare
Mark R. Mittrick
John Richardson
Nick Waytowich
Priya Narayanan
Saurabh Bagchi
AAML
37
1
0
02 May 2024
Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning
Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning
Chengqian Gao
William de Vazelhes
Hualin Zhang
Bin Gu
Zhiqiang Xu
54
0
0
02 May 2024
Constrained Reinforcement Learning Under Model Mismatch
Constrained Reinforcement Learning Under Model Mismatch
Zhongchang Sun
Sihong He
Fei Miao
Shaofeng Zou
46
4
0
02 May 2024
S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor
  Critic
S2^22AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic
Safa Messaoud
Billel Mokeddem
Zhenghai Xue
L. Pang
Bo An
Haipeng Chen
Sanjay Chawla
41
3
0
02 May 2024
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation
Shangding Gu
Bilgehan Sel
Yuhao Ding
Lu Wang
Qingwei Lin
Ming Jin
Alois Knoll
60
9
0
02 May 2024
Deep Reinforcement Learning for Bipedal Locomotion: A Brief Survey
Deep Reinforcement Learning for Bipedal Locomotion: A Brief Survey
Lingfan Bao
Josephine N. Humphreys
Tianhu Peng
Chengxu Zhou
73
5
0
25 Apr 2024
Beyond the Edge: An Advanced Exploration of Reinforcement Learning for
  Mobile Edge Computing, its Applications, and Future Research Trajectories
Beyond the Edge: An Advanced Exploration of Reinforcement Learning for Mobile Edge Computing, its Applications, and Future Research Trajectories
Ning Yang
Shuo Chen
Haijun Zhang
Randall Berry
OffRL
29
6
0
22 Apr 2024
Self-playing Adversarial Language Game Enhances LLM Reasoning
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng
Tianhao Hu
Han Xu
Zhisong Zhang
Yong Dai
Lei Han
Nan Du
Nan Du
Xiaolong Li
SyDa
LRM
ReLM
98
29
0
16 Apr 2024
Learn Your Reference Model for Real Good Alignment
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
54
27
0
15 Apr 2024
Graph Reinforcement Learning for Combinatorial Optimization: A Survey
  and Unifying Perspective
Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective
Victor-Alexandru Darvariu
Stephen Hailes
Mirco Musolesi
AI4CE
50
6
0
09 Apr 2024
Learning Prehensile Dexterity by Imitating and Emulating State-only
  Observations
Learning Prehensile Dexterity by Imitating and Emulating State-only Observations
Yunhai Han
Zhenyang Chen
Harish Ravichandar
32
5
0
08 Apr 2024
Model-based Reinforcement Learning for Parameterized Action Spaces
Model-based Reinforcement Learning for Parameterized Action Spaces
Renhao Zhang
Haotian Fu
Yilin Miao
George Konidaris
31
3
0
03 Apr 2024
Generalized Policy Learning for Smart Grids: FL TRPO Approach
Generalized Policy Learning for Smart Grids: FL TRPO Approach
Yunxiang Li
Nicolas Mauricio Cuadrado
Samuel Horváth
Martin Takáč
30
0
0
27 Mar 2024
Policy Mirror Descent with Lookahead
Policy Mirror Descent with Lookahead
Kimon Protopapas
Anas Barakat
29
1
0
21 Mar 2024
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
Yifan Lin
Yuhao Wang
Enlu Zhou
72
0
0
01 Mar 2024
Generalizing Reward Modeling for Out-of-Distribution Preference Learning
Generalizing Reward Modeling for Out-of-Distribution Preference Learning
Chen Jia
44
2
0
22 Feb 2024
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy
  Regularization
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Tianying Ji
Yongyuan Liang
Yan Zeng
Yu-Juan Luo
Guowei Xu
Jiawei Guo
Ruijie Zheng
Furong Huang
Gang Hua
Huazhe Xu
CML
48
11
0
22 Feb 2024
Performance Improvement Bounds for Lipschitz Configurable Markov
  Decision Processes
Performance Improvement Bounds for Lipschitz Configurable Markov Decision Processes
Alberto Maria Metelli
15
0
0
21 Feb 2024
Off-policy Distributional Q($λ$): Distributional RL without
  Importance Sampling
Off-policy Distributional Q(λλλ): Distributional RL without Importance Sampling
Yunhao Tang
Mark Rowland
Rémi Munos
Bernardo Avila-Pires
Will Dabney
OffRL
12
1
0
08 Feb 2024
Regularized Q-Learning with Linear Function Approximation
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
38
2
0
26 Jan 2024
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning:
  Theory, Algorithms and Implementations
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations
Matthias Lehmann
46
0
0
24 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for
  Regularized Expected Reward Optimization
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
1
0
23 Jan 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy
Christoph Dann
Rahul Kidambi
Zhiwei Steven Wu
Alekh Agarwal
OffRL
41
96
0
08 Jan 2024
Physics-Informed Multi-Agent Reinforcement Learning for Distributed Multi-Robot Problems
Physics-Informed Multi-Agent Reinforcement Learning for Distributed Multi-Robot Problems
Eduardo Sebastián
T. Duong
Nikolay Atanasov
Eduardo Montijano
C. Sagüés
31
3
0
30 Dec 2023
Agnostic Interactive Imitation Learning: New Theory and Practical
  Algorithms
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
Yichen Li
Chicheng Zhang
OffRL
33
0
0
28 Dec 2023
Adaptive trajectory-constrained exploration strategy for deep
  reinforcement learning
Adaptive trajectory-constrained exploration strategy for deep reinforcement learning
Guojian Wang
Faguo Wu
Xiao Zhang
Ning Guo
Zhiming Zheng
36
3
0
27 Dec 2023
Maximizing the Success Probability of Policy Allocations in Online
  Systems
Maximizing the Success Probability of Policy Allocations in Online Systems
Artem Betlei
Mariia Vladimirova
Mehdi Sebbar
Nicolas Urien
Thibaud Rahier
Benjamin Heymann
OffRL
19
1
0
26 Dec 2023
Conservative Exploration for Policy Optimization via Off-Policy Policy
  Evaluation
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
26
0
0
24 Dec 2023
PGN: A perturbation generation network against deep reinforcement
  learning
PGN: A perturbation generation network against deep reinforcement learning
Xiangjuan Li
Feifan Li
Yang Li
Quanbiao Pan
AAML
27
2
0
20 Dec 2023
Exploring Gradient Explosion in Generative Adversarial Imitation
  Learning: A Probabilistic Perspective
Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective
Wanying Wang
Yichen Zhu
Yirui Zhou
Chaomin Shen
Jian Tang
Zhiyuan Xu
Yaxin Peng
Yangchun Zhang
21
4
0
18 Dec 2023
Colored Noise in PPO: Improved Exploration and Performance through
  Correlated Action Sampling
Colored Noise in PPO: Improved Exploration and Performance through Correlated Action Sampling
Jakob J. Hollenstein
Georg Martius
J. Piater
22
3
0
18 Dec 2023
Multi-Agent Reinforcement Learning for Connected and Automated Vehicles
  Control: Recent Advancements and Future Prospects
Multi-Agent Reinforcement Learning for Connected and Automated Vehicles Control: Recent Advancements and Future Prospects
Min Hua
Dong Chen
Xinda Qi
Kun Jiang
Z. Liu
Quan Zhou
Hongming Xu
26
10
0
18 Dec 2023
Multi-Objective Reinforcement Learning-based Approach for Pressurized
  Water Reactor Optimization
Multi-Objective Reinforcement Learning-based Approach for Pressurized Water Reactor Optimization
Paul Seurin
K. Shirvan
24
10
0
15 Dec 2023
Gradient Informed Proximal Policy Optimization
Gradient Informed Proximal Policy Optimization
Sanghyun Son
L. Zheng
Ryan Sullivan
Yi-Ling Qiao
Ming-Chyuan Lin
37
7
0
14 Dec 2023
An Invitation to Deep Reinforcement Learning
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
78
5
0
13 Dec 2023
On Task-Relevant Loss Functions in Meta-Reinforcement Learning and
  Online LQR
On Task-Relevant Loss Functions in Meta-Reinforcement Learning and Online LQR
Jaeuk Shin
Giho Kim
Howon Lee
Joonho Han
Insoon Yang
OffRL
33
1
0
09 Dec 2023
LLM A*: Human in the Loop Large Language Models Enabled A* Search for Robotics
LLM A*: Human in the Loop Large Language Models Enabled A* Search for Robotics
Hengjia Xiao
Peng Wang
Mingzhe Yu
Mattia Robbiani
29
21
0
04 Dec 2023
TRC: Trust Region Conditional Value at Risk for Safe Reinforcement
  Learning
TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning
Dohyeong Kim
Songhwai Oh
19
19
0
01 Dec 2023
Towards a Standardized Reinforcement Learning Framework for AAM
  Contingency Management
Towards a Standardized Reinforcement Learning Framework for AAM Contingency Management
Luis E. Alvarez
Marc W. Brittain
Kara Breeden
13
2
0
17 Nov 2023
Previous
123456...212223
Next