ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1909.01150
  4. Cited By
Neural Policy Gradient Methods: Global Optimality and Rates of
  Convergence

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence

29 August 2019
Lingxiao Wang
Qi Cai
Zhuoran Yang
Zhaoran Wang
ArXivPDFHTML

Papers citing "Neural Policy Gradient Methods: Global Optimality and Rates of Convergence"

28 / 78 papers shown
Title
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
29
50
0
22 Feb 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence,
  New Sampling Complexity, and Generalized Problem Classes
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
102
136
0
30 Jan 2021
Is Pessimism Provably Efficient for Offline RL?
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
349
0
30 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence
  Guarantee
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
47
122
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled
  Wireless Networks: A Tutorial
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
Ekram Hossain
35
237
0
06 Nov 2020
Cooperative Heterogeneous Deep Reinforcement Learning
Cooperative Heterogeneous Deep Reinforcement Learning
Han Zheng
Pengfei Wei
Jing Jiang
Guodong Long
Qinghua Lu
Chengqi Zhang
51
12
0
02 Nov 2020
Global optimality of softmax policy gradient with single hidden layer
  neural networks in the mean-field regime
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi
Jianfeng Lu
13
15
0
22 Oct 2020
Sample Efficient Reinforcement Learning with REINFORCE
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
42
101
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
21
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with
  General Utilities
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
27
138
0
04 Jul 2020
Provably Efficient Neural Estimation of Structural Equation Model: An
  Adversarial Approach
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
Luofeng Liao
You-Lin Chen
Zhuoran Yang
Bo Dai
Zhaoran Wang
Mladen Kolar
30
33
0
02 Jul 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural)
  Actor-Critic Algorithms
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
24
25
0
27 Apr 2020
Generative Adversarial Imitation Learning with Neural Networks: Global
  Optimality and Convergence Rate
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate
Yufeng Zhang
Qi Cai
Zhuoran Yang
Zhaoran Wang
116
12
0
08 Mar 2020
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović
20
159
0
01 Mar 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and
  Algorithms
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaipeng Zhang
Zhuoran Yang
Tamer Basar
63
1,184
0
24 Nov 2019
Convergent Policy Optimization for Safe Reinforcement Learning
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
16
91
0
26 Oct 2019
Policy Optimization for $\mathcal{H}_2$ Linear Control with
  $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global
  Convergence
Policy Optimization for H2\mathcal{H}_2H2​ Linear Control with H∞\mathcal{H}_\inftyH∞​ Robustness Guarantee: Implicit Regularization and Global Convergence
Kaipeng Zhang
Bin Hu
Tamer Basar
24
119
0
21 Oct 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement
  Learning with Function Approximation
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
104
80
0
18 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance
  Reduction
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
31
83
0
18 Sep 2019
A Review of Cooperative Multi-Agent Deep Reinforcement Learning
A Review of Cooperative Multi-Agent Deep Reinforcement Learning
Afshin Oroojlooyjadid
Davood Hajinezhad
53
413
0
11 Aug 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally
  Optimal Policy
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
30
108
0
25 Jun 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally
  Optimal Policies
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Kaipeng Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
44
186
0
19 Jun 2019
Global Optimality Guarantees For Policy Gradient Methods
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
37
186
0
05 Jun 2019
A Theoretical Analysis of Deep Q-Learning
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang
23
596
0
01 Jan 2019
Previous
12