Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1909.02769
Cited By
Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs
6 September 2019
Lior Shani
Yonathan Efroni
Shie Mannor
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs"
50 / 52 papers shown
Title
Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees
Sourav Ganguly
Arnob Ghosh
Kishan Panaganti
Adam Wierman
5
0
0
25 May 2025
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
71
4
0
07 Nov 2024
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro
Marco Mussi
Alberto Maria Metelli
Matteo Papini
60
2
0
03 May 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
54
2
0
26 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
24
1
0
23 Jan 2024
Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization
Mohammad Mehdi Nasiri
M. Rezghi
68
0
0
13 Aug 2023
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
40
2
0
20 Jun 2023
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Jiacai Liu
Jinchi Chen
Ke Wei
34
2
0
31 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
39
3
0
11 May 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
Yulai Zhao
Zhuoran Yang
Zhaoran Wang
Jason D. Lee
48
3
0
08 May 2023
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
Matthieu Geist
Shie Mannor
50
2
0
12 Mar 2023
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Kai Zhang
Tamer Basar
W. Yin
61
107
0
15 Nov 2022
Ensemble Reinforcement Learning in Continuous Spaces -- A Hierarchical Multi-Step Approach for Policy Training
Gang Chen
Victoria Huang
OffRL
66
0
0
29 Sep 2022
First-order Policy Optimization for Robust Markov Decision Process
Yan Li
Guanghui Lan
Tuo Zhao
77
23
0
21 Sep 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
35
2
0
19 Jul 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Qinghua Liu
Haiquan Wang
Caiming Xiong
Na Li
Yu Bai
40
26
0
06 Jun 2022
Learning to Constrain Policy Optimization with Virtual Trust Region
Hung Le
Thommen Karimpanal George
Majid Abdolshah
D. Nguyen
Kien Do
Sunil R. Gupta
Svetha Venkatesh
41
3
0
20 Apr 2022
Latency Optimization for Blockchain-Empowered Federated Learning in Multi-Server Edge Computing
Dinh C. Nguyen
Seyyedali Hosseinalipour
David J. Love
P. Pathirana
Christopher G. Brinton
51
47
0
18 Mar 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
34
25
0
07 Jan 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
45
169
0
08 Dec 2021
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective
Nai-Chieh Huang
Ping-Chun Hsieh
Kuo-Hao Ho
Hsuan-Yu Yao
Kai-Chun Hu
Liang-Chun Ouyang
I-Chen Wu
37
1
0
26 Oct 2021
EnTRPO: Trust Region Policy Optimization Method with Entropy Regularization
Sahar Roostaie
M. Ebadzadeh
24
3
0
26 Oct 2021
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs
Han Zhong
Zhuoran Yang
Zhaoran Wang
Csaba Szepesvári
66
21
0
18 Oct 2021
Twice regularized MDPs and the equivalence between robustness and regularization
E. Derman
Matthieu Geist
Shie Mannor
53
55
0
12 Oct 2021
Approximate Newton policy gradient algorithms
Haoya Li
Samarth Gupta
Hsiangfu Yu
Lexing Ying
Inderjit Dhillon
56
2
0
05 Oct 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
44
6
0
13 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
34
115
0
19 Aug 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
84
63
0
23 Jul 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
30
29
0
17 Jul 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
68
29
0
26 May 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
32
76
0
24 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
45
56
0
04 May 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
48
50
0
22 Feb 2021
Dealing with Non-Stationarity in MARL via Trust-Region Decomposition
Wenhao Li
Xiangfeng Wang
Bo Jin
Junjie Sheng
H. Zha
52
7
0
21 Feb 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
78
26
0
18 Feb 2021
Online Apprenticeship Learning
Lior Shani
Tom Zahavy
Shie Mannor
OffRL
31
26
0
13 Feb 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
19
10
0
11 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
106
138
0
30 Jan 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
Justin Romberg
S. T. Maguluri
40
42
0
26 Jan 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
54
123
0
11 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
51
104
0
22 Oct 2020
Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
61
7
0
22 Jul 2020
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
62
59
0
21 Jul 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
35
83
0
20 May 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
52
278
0
13 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
45
58
0
07 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
32
25
0
27 Apr 2020
Leverage the Average: an Analysis of KL Regularization in RL
Nino Vieillard
Tadashi Kozuno
B. Scherrer
Olivier Pietquin
Rémi Munos
Matthieu Geist
29
43
0
31 Mar 2020
Distributional Robustness and Regularization in Reinforcement Learning
E. Derman
Shie Mannor
34
44
0
05 Mar 2020
Policy Optimization for
H
2
\mathcal{H}_2
H
2
Linear Control with
H
∞
\mathcal{H}_\infty
H
∞
Robustness Guarantee: Implicit Regularization and Global Convergence
Kai Zhang
Bin Hu
Tamer Basar
29
119
0
21 Oct 2019
1
2
Next