Global Convergence of Policy Gradient Methods to (Almost) Locally
Optimal Policies

19 June 2019

Papers citing "Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"

11 / 111 papers shown

Title
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms Tengyu Xu Zhe Wang Yingbin Liang 19 25 0 27 Apr 2020
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate Yufeng Zhang Qi Cai Zhuoran Yang Zhaoran Wang 111 12 0 08 Mar 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling Huaqing Xiong Tengyu Xu Yingbin Liang Wei Zhang 17 33 0 15 Feb 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms Kaipeng Zhang Zhuoran Yang Tamer Basar 55 1,181 0 24 Nov 2019
Policy Optimization for $\mathcal{H}_2$ Linear Control with $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global Convergence Kaipeng Zhang Bin Hu Tamer Basar 24 119 0 21 Oct 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation Harshat Kumar Alec Koppel Alejandro Ribeiro 102 79 0 18 Oct 2019
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence Lingxiao Wang Qi Cai Zhuoran Yang Zhaoran Wang 14 236 0 29 Aug 2019
A Review of Cooperative Multi-Agent Deep Reinforcement Learning Afshin Oroojlooyjadid Davood Hajinezhad 48 408 0 11 Aug 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy Boyi Liu Qi Cai Zhuoran Yang Zhaoran Wang 22 108 0 25 Jun 2019
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games Kaipeng Zhang Zhuoran Yang Tamer Basar 24 125 0 31 May 2019
Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms Abhishek Gupta Hao Chen Jianzong Pi Gaurav Tendolkar 17 0 0 24 Apr 2019