ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.10306
  4. Cited By
Neural Proximal/Trust Region Policy Optimization Attains Globally
  Optimal Policy

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy

25 June 2019
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
ArXivPDFHTML

Papers citing "Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy"

31 / 31 papers shown
Title
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning:
  Theory, Algorithms and Implementations
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations
Matthias Lehmann
46
0
0
24 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for
  Regularized Expected Reward Optimization
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
1
0
23 Jan 2024
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
23
2
0
20 Jun 2023
Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement
  Learning
Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning
Yuanquan Hu
Xiaoli Wei
Jun Yan
Heng-Wei Zhang
42
8
0
11 Sep 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum
  Markov Games with Structured Transitions
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
30
11
0
25 Jul 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
26
24
0
07 Jan 2022
Differentially Private Regret Minimization in Episodic Markov Decision
  Processes
Differentially Private Regret Minimization in Episodic Markov Decision Processes
Sayak Ray Chowdhury
Xingyu Zhou
26
21
0
20 Dec 2021
On the Privacy Risks of Deploying Recurrent Neural Networks in Machine
  Learning Models
On the Privacy Risks of Deploying Recurrent Neural Networks in Machine Learning Models
Yunhao Yang
Parham Gohari
Ufuk Topcu
AAML
30
3
0
06 Oct 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement
  Learning
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
29
113
0
19 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
33
47
0
30 Jul 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear
  Function Approximation
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
63
29
0
26 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
40
56
0
04 May 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear
  Function Approximation
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
29
50
0
22 Feb 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence,
  New Sampling Complexity, and Generalized Problem Classes
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
94
136
0
30 Jan 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence
  Guarantee
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
121
0
11 Nov 2020
Proximal Policy Optimization via Enhanced Exploration Efficiency
Proximal Policy Optimization via Enhanced Exploration Efficiency
Junwei Zhang
Zhenghao Zhang
Shuai Han
Shuai Lu
32
41
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled
  Wireless Networks: A Tutorial
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
Ekram Hossain
35
237
0
06 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
37
100
0
22 Oct 2020
Revisiting Design Choices in Proximal Policy Optimization
Revisiting Design Choices in Proximal Policy Optimization
Chloe Ching-Yun Hsu
Celestine Mendler-Dünner
Moritz Hardt
22
53
0
23 Sep 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with
  General Utilities
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
27
137
0
04 Jul 2020
Mirror Descent Policy Optimization
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
19
82
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural)
  Actor-Critic Algorithms
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
Generative Adversarial Imitation Learning with Neural Networks: Global
  Optimality and Convergence Rate
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate
Yufeng Zhang
Qi Cai
Zhuoran Yang
Zhaoran Wang
111
12
0
08 Mar 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and
  Algorithms
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaipeng Zhang
Zhuoran Yang
Tamer Basar
58
1,181
0
24 Nov 2019
Convergent Policy Optimization for Safe Reinforcement Learning
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
16
91
0
26 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance
  Reduction
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
31
83
0
18 Sep 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and
  Distribution Shift
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
J. Lee
G. Mahajan
13
316
0
01 Aug 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally
  Optimal Policies
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Kaipeng Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
44
186
0
19 Jun 2019
1