On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
arXiv:1910.08412 · 18 October 2019
Harshat Kumar, Alec Koppel, Alejandro Ribeiro

Papers citing "On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation"

27 papers shown
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
Yifan Lin, Yuhao Wang, Enlu Zhou
01 Mar 2024

A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky, Bahman Gharesifard
04 Mar 2022

Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang, Rémi Tachet des Combes, Romain Laroche
04 Nov 2021

Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng, Tianyi Chen, Alfredo García, Mingyi Hong
11 Oct 2021

Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Ziyi Chen, Yi Zhou, Rongrong Chen, Shaofeng Zou
08 Sep 2021

Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen, S. Khodadadian, S. T. Maguluri
26 May 2021 · OffRL

Finite-Sample Analysis of Proximal Gradient TD Algorithms
Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, Marek Petrik
06 Jun 2020

Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu, Zhe Wang, Yingbin Liang
07 May 2020

A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu
04 May 2020

A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound
Gal Dalal, Balazs Szorenyi, Gugan Thoppe
20 Nov 2019 · OffRL

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
29 Aug 2019

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Kai Zhang, Alec Koppel, Haoqi Zhu, Tamer Basar
19 Jun 2019

Finite-Sample Analysis for SARSA with Linear Function Approximation
Shaofeng Zou, Tengyu Xu, Yingbin Liang
06 Feb 2019

Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning
R. Srikant, Lei Ying
03 Feb 2019

TD-Regularized Actor-Critic Methods
Simone Parisi, Voot Tangkaratt, Jan Peters, Mohammad Emtiyaz Khan
19 Dec 2018 · OffRL

Is Q-learning Provably Efficient?
Chi Jin, Zeyuan Allen-Zhu, Sébastien Bubeck, Michael I. Jordan
10 Jul 2018 · OffRL

Stochastic Variance-Reduced Policy Gradient
Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli
14 Jun 2018

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Jalaj Bhandari, Daniel Russo, Raghav Singal
06 Jun 2018

Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems
Alec Koppel, Ekaterina V. Tolstaya, Ethan Stump, Alejandro Ribeiro
19 Apr 2018

Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
Kai Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Basar
23 Feb 2018

Symmetry, Saddle Points, and Global Optimization Landscape of Nonconvex Matrix Factorization
Xingguo Li, Junwei Lu, R. Arora, Jarvis Haupt, Han Liu, Zhaoran Wang, T. Zhao
29 Dec 2016

Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang, V. Bapst, N. Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas
03 Nov 2016

OpenAI Gym
Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba
05 Jun 2016 · OffRL, ODL

A Geometric Analysis of Phase Retrieval
Ju Sun, Qing Qu, John N. Wright
22 Feb 2016

Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih, Adria Puigdomenech Badia, M. Berk Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
04 Feb 2016

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition
Rong Ge, Furong Huang, Chi Jin, Yang Yuan
06 Mar 2015

Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions
Mengdi Wang, Ethan X. Fang, Han Liu
14 Nov 2014