ResearchTrend.AI
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
arXiv:1906.08383, v3 (latest)

19 June 2019
Kaiqing Zhang
Alec Koppel
Hao Zhu
Tamer Başar

Papers citing "Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"

50 / 67 papers shown
Reinforcement Learning with Random Time Horizons
Enric Ribera Borrell
Lorenz Richter
Christof Schütte
AI4TS
30
0
0
01 Jun 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
134
0
0
11 Feb 2025
Structure Matters: Dynamic Policy Gradient
Sara Klein
Xiangyuan Zhang
Tamer Başar
Simon Weissmann
Leif Döring
59
0
0
07 Nov 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
133
2
0
30 May 2024
Almost sure convergence rates of stochastic gradient methods under gradient domination
Simon Weissmann
Sara Klein
Waïss Azizian
Leif Döring
86
3
0
22 May 2024
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
Guangchen Lan
Dong-Jun Han
Abolfazl Hashemi
Vaneet Aggarwal
Christopher G. Brinton
228
16
0
09 Apr 2024
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
Yifan Lin
Yuhao Wang
Enlu Zhou
139
0
0
01 Mar 2024
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
Yufei Zhang
124
12
0
04 Oct 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
112
16
0
24 Dec 2022
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Kaiqing Zhang
Tamer Başar
W. Yin
111
110
0
15 Nov 2022
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence
S. Pattathil
Kaiqing Zhang
Asuman Ozdaglar
94
14
0
23 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games
Angeliki Giannou
Kyriakos Lotidis
P. Mertikopoulos
Emmanouil-Vasileios Vlatakis-Gkaragkounis
124
18
0
17 Oct 2022
Decentralized Policy Gradient for Nash Equilibria Learning of General-sum Stochastic Games
Yan Chen
Taoying Li
51
2
0
14 Oct 2022
RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments
Aakriti Agrawal
Amrit Singh Bedi
Tianyi Zhou
116
20
0
13 Sep 2022
Sampling Through the Lens of Sequential Decision Making
J. Dou
Alvin Pan
Runxue Bao
Haiyi Mao
Lei Luo
Zhi-Hong Mao
96
19
0
17 Aug 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences
Han Shen
Tianyi Chen
105
15
0
21 Jun 2022
How are policy gradient methods affected by the limits of control?
Ingvar M. Ziemann
Anastasios Tsiamis
H. Sandberg
Nikolai Matni
57
14
0
14 Jun 2022
Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization
Maxim Kaledin
Alexander Golubev
Denis Belomestny
OffRL
82
4
0
14 Jun 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm
Qinbo Bai
Amrit Singh Bedi
Vaneet Aggarwal
78
24
0
12 Jun 2022
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
102
1
0
12 Jun 2022
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Pratap Tokekar
Tianyi Zhou
79
8
0
12 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
Kaiqing Zhang
Jiali Duan
Tamer Başar
Mihailo R. Jovanović
73
21
0
06 Jun 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
104
20
0
04 Mar 2022
A policy gradient approach for optimization of smooth risk measures
Nithia Vijayan
Prashanth L.A.
OffRL
50
4
0
22 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
94
2
0
15 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients?
H.J. Terry Suh
Max Simchowitz
Kaiqing Zhang
Russ Tedrake
84
101
0
02 Feb 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
126
180
0
08 Dec 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
107
12
0
04 Nov 2021
Understanding the Effect of Stochasticity in Policy Optimization
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
70
19
0
29 Oct 2021
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming
Alec Koppel
Amrit Singh Bedi
Bhargav Ganguly
Vaneet Aggarwal
51
4
0
22 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
111
19
0
19 Oct 2021
Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng
Tianyi Chen
Alfredo García
Mingyi Hong
92
11
0
11 Oct 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
86
6
0
13 Sep 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
99
36
0
05 Aug 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
118
64
0
23 Jul 2021
Policy Gradient Methods for Distortion Risk Measures
Nithia Vijayan
Prashanth L.A.
131
5
0
09 Jul 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
88
15
0
15 Jun 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
91
9
0
14 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
45
7
0
28 May 2021
A nearly Blackwell-optimal policy gradient method
Vektor Dewanto
M. Gallagher
OffRL
35
0
0
28 May 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
104
25
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
109
53
0
22 Feb 2021
Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak
62
12
0
22 Feb 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
76
18
0
17 Feb 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
125
69
0
17 Feb 2021
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
Kaiqing Zhang
Xiangyuan Zhang
Bin Hu
Tamer Başar
106
19
0
04 Jan 2021
Model Free Reinforcement Learning Algorithm for Stationary Mean field Equilibrium for Multiple Types of Agents
A. Ghosh
Vaneet Aggarwal
105
7
0
31 Dec 2020
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
Matthieu Zimmer
Claire Glanois
Umer Siddique
Paul Weng
OffRL
164
60
0
17 Dec 2020
Sample Complexity of Policy Gradient Finding Second-Order Stationary Points
Long Yang
Qian Zheng
Gang Pan
100
21
0
02 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
89
128
0
11 Nov 2020