ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.08383
  4. Cited By
Global Convergence of Policy Gradient Methods to (Almost) Locally
  Optimal Policies

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies

19 June 2019
Kaipeng Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
ArXivPDFHTML

Papers citing "Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"

50 / 111 papers shown
Title
Finite-Time Analysis of Fully Decentralized Single-Timescale
  Actor-Critic
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
30
1
0
12 Jun 2022
Dealing with Sparse Rewards in Continuous Control Robotics via
  Heavy-Tailed Policies
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Pratap Tokekar
Tianyi Zhou
18
8
0
12 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual
  methods for constrained MDPs
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
Kaipeng Zhang
Jiali Duan
Tamer Bacsar
Mihailo R. Jovanović
23
19
0
06 Jun 2022
Stochastic Second-Order Methods Improve Best-Known Sample Complexity of
  SGD for Gradient-Dominated Function
Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function
Saeed Masiha
Saber Salehkaleybar
Niao He
Negar Kiyavash
Patrick Thiran
87
18
0
25 May 2022
A Small Gain Analysis of Single Timescale Actor Critic
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
25
20
0
04 Mar 2022
A policy gradient approach for optimization of smooth risk measures
A policy gradient approach for optimization of smooth risk measures
Nithia Vijayan
Prashanth L.A.
OffRL
19
4
0
22 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in
  Actor-Critic Algorithms
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
46
2
0
15 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients?
Do Differentiable Simulators Give Better Policy Gradients?
H. Suh
Max Simchowitz
Kaipeng Zhang
Russ Tedrake
32
95
0
02 Feb 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M Sadler
Pratap Tokekar
Alec Koppel
43
17
0
28 Jan 2022
Occupancy Information Ratio: Infinite-Horizon, Information-Directed,
  Parameterized Policy Search
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search
Wesley A Suttle
Alec Koppel
Ji Liu
31
0
0
21 Jan 2022
Recent Advances in Reinforcement Learning in Finance
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
27
167
0
08 Dec 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
30
10
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth
  Settings
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
16
5
0
30 Oct 2021
Understanding the Effect of Stochasticity in Policy Optimization
Understanding the Effect of Stochasticity in Policy Optimization
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
19
17
0
29 Oct 2021
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning
  via Randomized Linear Programming
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming
Alec Koppel
Amrit Singh Bedi
Bhargav Ganguly
Vaneet Aggarwal
32
4
0
22 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy
  Gradient Methods with Entropy Regularization
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
43
18
0
19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
32
16
0
19 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
39
2
0
17 Oct 2021
Learning to Coordinate in Multi-Agent Systems: A Coordinated
  Actor-Critic Algorithm and Finite-Time Guarantees
Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng
Tianyi Chen
Alfredo García
Mingyi Hong
25
11
0
11 Oct 2021
Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With
  Eligibility Trace Under Reward, Policy, and Advantage Feedback
Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback
Ishaan Shah
D. Halpern
Kavosh Asadi
Michael L. Littman
20
0
0
15 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic
  Reinforcement Learning and Global Convergence of Policy Gradient Methods
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
25
6
0
13 Sep 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network
  Approach
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
40
36
0
05 Aug 2021
A general sample complexity analysis of vanilla policy gradient
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
79
62
0
23 Jul 2021
Policy Gradient Methods for Distortion Risk Measures
Policy Gradient Methods for Distortion Risk Measures
Nithia Vijayan
Prashanth L.A.
27
5
0
09 Jul 2021
Bregman Gradient Policy Optimization
Bregman Gradient Policy Optimization
Feihu Huang
Shangqian Gao
Heng-Chiao Huang
25
16
0
23 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search
  in Continuous Control
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
30
15
0
15 Jun 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function
  Approximation
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
21
9
0
14 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy
  Gradient Based Algorithm
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
16
7
0
28 May 2021
A nearly Blackwell-optimal policy gradient method
A nearly Blackwell-optimal policy gradient method
Vektor Dewanto
M. Gallagher
OffRL
16
0
0
28 May 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
39
24
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
29
50
0
22 Feb 2021
Provable Super-Convergence with a Large Cyclical Learning Rate
Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak
33
12
0
22 Feb 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov
  Games
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
41
18
0
17 Feb 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy
  Gradient Method
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
49
67
0
17 Feb 2021
Smoothed functional-based gradient algorithms for off-policy
  reinforcement learning: A non-asymptotic viewpoint
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint
Nithia Vijayan
A. PrashanthL.
OffRL
35
6
0
06 Jan 2021
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust
  Control Design: Implicit Regularization and Sample Complexity
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
Kaipeng Zhang
Xiangyuan Zhang
Bin Hu
Tamer Bacsar
21
19
0
04 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence
  and Linear Speedup
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaipeng Zhang
Min-Fong Hong
Tianyi Chen
35
28
0
31 Dec 2020
Model Free Reinforcement Learning Algorithm for Stationary Mean field
  Equilibrium for Multiple Types of Agents
Model Free Reinforcement Learning Algorithm for Stationary Mean field Equilibrium for Multiple Types of Agents
A. Ghosh
Vaneet Aggarwal
21
7
0
31 Dec 2020
Learning Fair Policies in Decentralized Cooperative Multi-Agent
  Reinforcement Learning
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
Matthieu Zimmer
Claire Glanois
Umer Siddique
Paul Weng
OffRL
23
59
0
17 Dec 2020
Sample Complexity of Policy Gradient Finding Second-Order Stationary
  Points
Sample Complexity of Policy Gradient Finding Second-Order Stationary Points
Long Yang
Qian Zheng
Gang Pan
23
21
0
02 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence
  Guarantee
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
121
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled
  Wireless Networks: A Tutorial
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
Ekram Hossain
35
237
0
06 Nov 2020
A Study of Policy Gradient on a Class of Exactly Solvable Models
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Prakash Panangaden
Doina Precup
16
0
0
03 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
37
100
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with
  General Utilities
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
19
137
0
04 Jul 2020
When Will Generative Adversarial Imitation Learning Algorithms Attain
  Global Convergence
When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
Ziwei Guan
Tengyu Xu
Yingbin Liang
19
16
0
24 Jun 2020
Zeroth-order Deterministic Policy Gradient
Zeroth-order Deterministic Policy Gradient
Harshat Kumar
Dionysios S. Kalogerias
George J. Pappas
Alejandro Ribeiro
OffRL
17
14
0
12 Jun 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural)
  Actor-Critic Algorithms
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
Previous
123
Next