ResearchTrend.AI
Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs

6 September 2019
Lior Shani
Yonathan Efroni
Shie Mannor

Papers citing "Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs"

50 / 52 papers shown
Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees
Sourav Ganguly
Arnob Ghosh
Kishan Panaganti
Adam Wierman
5
0
0
25 May 2025
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
71
4
0
07 Nov 2024
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro
Marco Mussi
Alberto Maria Metelli
Matteo Papini
60
2
0
03 May 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
54
2
0
26 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
24
1
0
23 Jan 2024
Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization
Mohammad Mehdi Nasiri
M. Rezghi
68
0
0
13 Aug 2023
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
40
2
0
20 Jun 2023
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Jiacai Liu
Jinchi Chen
Ke Wei
34
2
0
31 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
39
3
0
11 May 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
Yulai Zhao
Zhuoran Yang
Zhaoran Wang
Jason D. Lee
48
3
0
08 May 2023
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
Matthieu Geist
Shie Mannor
50
2
0
12 Mar 2023
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Kai Zhang
Tamer Basar
W. Yin
61
107
0
15 Nov 2022
Ensemble Reinforcement Learning in Continuous Spaces -- A Hierarchical Multi-Step Approach for Policy Training
Gang Chen
Victoria Huang
OffRL
66
0
0
29 Sep 2022
First-order Policy Optimization for Robust Markov Decision Process
Yan Li
Guanghui Lan
Tuo Zhao
77
23
0
21 Sep 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
35
2
0
19 Jul 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Qinghua Liu
Haiquan Wang
Caiming Xiong
Na Li
Yu Bai
40
26
0
06 Jun 2022
Learning to Constrain Policy Optimization with Virtual Trust Region
Hung Le
Thommen Karimpanal George
Majid Abdolshah
D. Nguyen
Kien Do
Sunil R. Gupta
Svetha Venkatesh
41
3
0
20 Apr 2022
Latency Optimization for Blockchain-Empowered Federated Learning in Multi-Server Edge Computing
Dinh C. Nguyen
Seyyedali Hosseinalipour
David J. Love
P. Pathirana
Christopher G. Brinton
51
47
0
18 Mar 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
34
25
0
07 Jan 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
45
169
0
08 Dec 2021
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective
Nai-Chieh Huang
Ping-Chun Hsieh
Kuo-Hao Ho
Hsuan-Yu Yao
Kai-Chun Hu
Liang-Chun Ouyang
I-Chen Wu
37
1
0
26 Oct 2021
EnTRPO: Trust Region Policy Optimization Method with Entropy Regularization
Sahar Roostaie
M. Ebadzadeh
24
3
0
26 Oct 2021
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs
Han Zhong
Zhuoran Yang
Zhaoran Wang
Csaba Szepesvári
66
21
0
18 Oct 2021
Twice regularized MDPs and the equivalence between robustness and regularization
E. Derman
Matthieu Geist
Shie Mannor
53
55
0
12 Oct 2021
Approximate Newton policy gradient algorithms
Haoya Li
Samarth Gupta
Hsiangfu Yu
Lexing Ying
Inderjit Dhillon
56
2
0
05 Oct 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
44
6
0
13 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
34
115
0
19 Aug 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
84
63
0
23 Jul 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
30
29
0
17 Jul 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
68
29
0
26 May 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
32
76
0
24 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
45
56
0
04 May 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
48
50
0
22 Feb 2021
Dealing with Non-Stationarity in MARL via Trust-Region Decomposition
Wenhao Li
Xiangfeng Wang
Bo Jin
Junjie Sheng
H. Zha
52
7
0
21 Feb 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
78
26
0
18 Feb 2021
Online Apprenticeship Learning
Lior Shani
Tom Zahavy
Shie Mannor
OffRL
31
26
0
13 Feb 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
19
10
0
11 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
106
138
0
30 Jan 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
Justin Romberg
S. T. Maguluri
40
42
0
26 Jan 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
54
123
0
11 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
51
104
0
22 Oct 2020
Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
61
7
0
22 Jul 2020
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
62
59
0
21 Jul 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
35
83
0
20 May 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
52
278
0
13 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
45
58
0
07 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
32
25
0
27 Apr 2020
Leverage the Average: an Analysis of KL Regularization in RL
Nino Vieillard
Tadashi Kozuno
B. Scherrer
Olivier Pietquin
Rémi Munos
Matthieu Geist
29
43
0
31 Mar 2020
Distributional Robustness and Regularization in Reinforcement Learning
E. Derman
Shie Mannor
34
44
0
05 Mar 2020
Policy Optimization for $\mathcal{H}_2$ Linear Control with $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global Convergence
Kai Zhang
Bin Hu
Tamer Basar
29
119
0
21 Oct 2019