Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.10306
Cited By
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
25 June 2019
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy"
31 / 31 papers shown
Title
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations
Matthias Lehmann
46
0
0
24 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
1
0
23 Jan 2024
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
23
2
0
20 Jun 2023
Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning
Yuanquan Hu
Xiaoli Wei
Jun Yan
Heng-Wei Zhang
42
8
0
11 Sep 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
30
11
0
25 Jul 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
26
24
0
07 Jan 2022
Differentially Private Regret Minimization in Episodic Markov Decision Processes
Sayak Ray Chowdhury
Xingyu Zhou
26
21
0
20 Dec 2021
On the Privacy Risks of Deploying Recurrent Neural Networks in Machine Learning Models
Yunhao Yang
Parham Gohari
Ufuk Topcu
AAML
30
3
0
06 Oct 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
29
113
0
19 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
33
47
0
30 Jul 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
63
29
0
26 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
40
56
0
04 May 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
29
50
0
22 Feb 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
94
136
0
30 Jan 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
121
0
11 Nov 2020
Proximal Policy Optimization via Enhanced Exploration Efficiency
Junwei Zhang
Zhenghao Zhang
Shuai Han
Shuai Lu
32
41
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
Ekram Hossain
35
237
0
06 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
37
100
0
22 Oct 2020
Revisiting Design Choices in Proximal Policy Optimization
Chloe Ching-Yun Hsu
Celestine Mendler-Dünner
Moritz Hardt
22
53
0
23 Sep 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
27
137
0
04 Jul 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
19
82
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate
Yufeng Zhang
Qi Cai
Zhuoran Yang
Zhaoran Wang
111
12
0
08 Mar 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaipeng Zhang
Zhuoran Yang
Tamer Basar
58
1,181
0
24 Nov 2019
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
16
91
0
26 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
31
83
0
18 Sep 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
J. Lee
G. Mahajan
13
316
0
01 Aug 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Kaipeng Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
44
186
0
19 Jun 2019
1