arXiv:1910.08412
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
18 October 2019
Harshat Kumar, Alec Koppel, Alejandro Ribeiro
Papers citing "On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation" (27 papers)
- Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate. Yifan Lin, Yuhao Wang, Enlu Zhou. 01 Mar 2024.
- A Small Gain Analysis of Single Timescale Actor Critic. Alexander Olshevsky, Bahman Gharesifard. 04 Mar 2022.
- Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch. Shangtong Zhang, Rémi Tachet des Combes, Romain Laroche. 04 Nov 2021.
- Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees. Siliang Zeng, Tianyi Chen, Alfredo García, Mingyi Hong. 11 Oct 2021.
- Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis. Ziyi Chen, Yi Zhou, Rongrong Chen, Shaofeng Zou. 08 Sep 2021.
- Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation. Zaiwei Chen, S. Khodadadian, S. T. Maguluri. 26 May 2021. [OffRL]
- Finite-Sample Analysis of Proximal Gradient TD Algorithms. Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, Marek Petrik. 06 Jun 2020.
- Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms. Tengyu Xu, Zhe Wang, Yingbin Liang. 07 May 2020.
- A Finite Time Analysis of Two Time-Scale Actor Critic Methods. Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu. 04 May 2020.
- A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound. Gal Dalal, Balazs Szorenyi, Gugan Thoppe. 20 Nov 2019. [OffRL]
- Neural Policy Gradient Methods: Global Optimality and Rates of Convergence. Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang. 29 Aug 2019.
- Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies. Kaiqing Zhang, Alec Koppel, Hao Zhu, Tamer Basar. 19 Jun 2019.
- Finite-Sample Analysis for SARSA with Linear Function Approximation. Shaofeng Zou, Tengyu Xu, Yingbin Liang. 06 Feb 2019.
- Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning. R. Srikant, Lei Ying. 03 Feb 2019.
- TD-Regularized Actor-Critic Methods. Simone Parisi, Voot Tangkaratt, Jan Peters, Mohammad Emtiyaz Khan. 19 Dec 2018. [OffRL]
- Is Q-learning Provably Efficient? Chi Jin, Zeyuan Allen-Zhu, Sébastien Bubeck, Michael I. Jordan. 10 Jul 2018. [OffRL]
- Stochastic Variance-Reduced Policy Gradient. Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli. 14 Jun 2018.
- A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation. Jalaj Bhandari, Daniel Russo, Raghav Singal. 06 Jun 2018.
- Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems. Alec Koppel, Ekaterina V. Tolstaya, Ethan Stump, Alejandro Ribeiro. 19 Apr 2018.
- Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Basar. 23 Feb 2018.
- Symmetry, Saddle Points, and Global Optimization Landscape of Nonconvex Matrix Factorization. Xingguo Li, Junwei Lu, R. Arora, Jarvis Haupt, Han Liu, Zhaoran Wang, T. Zhao. 29 Dec 2016.
- Sample Efficient Actor-Critic with Experience Replay. Ziyu Wang, V. Bapst, N. Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas. 03 Nov 2016.
- OpenAI Gym. Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba. 05 Jun 2016. [OffRL, ODL]
- A Geometric Analysis of Phase Retrieval. Ju Sun, Qing Qu, John N. Wright. 22 Feb 2016.
- Asynchronous Methods for Deep Reinforcement Learning. Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu. 04 Feb 2016.
- Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition. Rong Ge, Furong Huang, Chi Jin, Yang Yuan. 06 Mar 2015.
- Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions. Mengdi Wang, Ethan X. Fang, Han Liu. 14 Nov 2014.