Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.08383
Cited By
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
19 June 2019
Kaipeng Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"
50 / 111 papers shown
Title
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
30
1
0
12 Jun 2022
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Pratap Tokekar
Tianyi Zhou
18
8
0
12 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
Kaipeng Zhang
Jiali Duan
Tamer Bacsar
Mihailo R. Jovanović
23
19
0
06 Jun 2022
Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function
Saeed Masiha
Saber Salehkaleybar
Niao He
Negar Kiyavash
Patrick Thiran
87
18
0
25 May 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
25
20
0
04 Mar 2022
A policy gradient approach for optimization of smooth risk measures
Nithia Vijayan
Prashanth L.A.
OffRL
19
4
0
22 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
46
2
0
15 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients?
H. Suh
Max Simchowitz
Kaipeng Zhang
Russ Tedrake
32
95
0
02 Feb 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M Sadler
Pratap Tokekar
Alec Koppel
43
17
0
28 Jan 2022
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search
Wesley A Suttle
Alec Koppel
Ji Liu
31
0
0
21 Jan 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
27
167
0
08 Dec 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
30
10
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
16
5
0
30 Oct 2021
Understanding the Effect of Stochasticity in Policy Optimization
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
19
17
0
29 Oct 2021
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming
Alec Koppel
Amrit Singh Bedi
Bhargav Ganguly
Vaneet Aggarwal
32
4
0
22 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
43
18
0
19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
32
16
0
19 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
39
2
0
17 Oct 2021
Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng
Tianyi Chen
Alfredo García
Mingyi Hong
25
11
0
11 Oct 2021
Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback
Ishaan Shah
D. Halpern
Kavosh Asadi
Michael L. Littman
20
0
0
15 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
25
6
0
13 Sep 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
40
36
0
05 Aug 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
79
62
0
23 Jul 2021
Policy Gradient Methods for Distortion Risk Measures
Nithia Vijayan
Prashanth L.A.
27
5
0
09 Jul 2021
Bregman Gradient Policy Optimization
Feihu Huang
Shangqian Gao
Heng-Chiao Huang
25
16
0
23 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
30
15
0
15 Jun 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
21
9
0
14 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
16
7
0
28 May 2021
A nearly Blackwell-optimal policy gradient method
Vektor Dewanto
M. Gallagher
OffRL
16
0
0
28 May 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
39
24
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
29
50
0
22 Feb 2021
Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak
33
12
0
22 Feb 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
41
18
0
17 Feb 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
49
67
0
17 Feb 2021
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint
Nithia Vijayan
A. PrashanthL.
OffRL
35
6
0
06 Jan 2021
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
Kaipeng Zhang
Xiangyuan Zhang
Bin Hu
Tamer Bacsar
21
19
0
04 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaipeng Zhang
Min-Fong Hong
Tianyi Chen
35
28
0
31 Dec 2020
Model Free Reinforcement Learning Algorithm for Stationary Mean field Equilibrium for Multiple Types of Agents
A. Ghosh
Vaneet Aggarwal
21
7
0
31 Dec 2020
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
Matthieu Zimmer
Claire Glanois
Umer Siddique
Paul Weng
OffRL
23
59
0
17 Dec 2020
Sample Complexity of Policy Gradient Finding Second-Order Stationary Points
Long Yang
Qian Zheng
Gang Pan
23
21
0
02 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
121
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
Ekram Hossain
35
237
0
06 Nov 2020
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Prakash Panangaden
Doina Precup
16
0
0
03 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
37
100
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
19
137
0
04 Jul 2020
When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
Ziwei Guan
Tengyu Xu
Yingbin Liang
19
16
0
24 Jun 2020
Zeroth-order Deterministic Policy Gradient
Harshat Kumar
Dionysios S. Kalogerias
George J. Pappas
Alejandro Ribeiro
OffRL
17
14
0
12 Jun 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
Previous
1
2
3
Next