Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1806.02450
Cited By
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
6 June 2018
Jalaj Bhandari
Daniel Russo
Raghav Singal
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation"
50 / 223 papers shown
Title
A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
105
54
0
02 Feb 2021
On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning
Alain Durmus
Eric Moulines
A. Naumov
S. Samsonov
Hoi-To Wai
35
19
0
30 Jan 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
Justin Romberg
S. T. Maguluri
40
42
0
26 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaipeng Zhang
Min-Fong Hong
Tianyi Chen
35
28
0
31 Dec 2020
On Convergence of Gradient Expected Sarsa(
λ
λ
λ
)
Long Yang
Gang Zheng
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
23
2
0
14 Dec 2020
Optimal oracle inequalities for solving projected fixed-point equations
Wenlong Mou
A. Pananjady
Martin J. Wainwright
29
14
0
09 Dec 2020
Simple and optimal methods for stochastic variational inequalities, II: Markovian noise and policy evaluation in reinforcement learning
Georgios Kotsalis
Guanghui Lan
Tianjiao Li
OffRL
6
31
0
15 Nov 2020
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee
Minghao Han
Yuan Tian
Lixian Zhang
Jun Wang
Wei Pan
16
46
0
13 Nov 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
52
122
0
11 Nov 2020
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms
Tengyu Xu
Yingbin Liang
21
26
0
10 Nov 2020
Nonlinear Two-Time-Scale Stochastic Approximation: Convergence and Finite-Time Performance
Thinh T. Doan
14
45
0
03 Nov 2020
System Identification via Meta-Learning in Linear Time-Varying Environments
Sen Lin
Hang Wang
Junshan Zhang
OffRL
42
2
0
27 Oct 2020
Temporal Difference Learning as Gradient Splitting
Rui Liu
Alexander Olshevsky
6
14
0
27 Oct 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
19
14
0
26 Oct 2020
Provable Fictitious Play for General Mean-Field Games
Qiaomin Xie
Zhuoran Yang
Zhaoran Wang
Andreea Minca
32
18
0
08 Oct 2020
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Shuang Qiu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
33
38
0
23 Aug 2020
On the Convergence of Consensus Algorithms with Markovian Noise and Gradient Bias
Hoi-To Wai
9
12
0
18 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
21
42
0
02 Aug 2020
Momentum Q-learning with Finite-Sample Convergence Guarantee
Bowen Weng
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
16
8
0
30 Jul 2020
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
22
25
0
15 Jul 2020
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
Mingyi Hong
Hoi-To Wai
Zhaoran Wang
Zhuoran Yang
18
135
0
10 Jul 2020
The Mean-Squared Error of Double Q-Learning
Wentao Weng
Harsh Gupta
Niao He
Lei Ying
R. Srikant
8
17
0
09 Jul 2020
Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms
Thinh T. Doan
FedML
17
9
0
24 Jun 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou
Jiafan He
Quanquan Gu
35
133
0
23 Jun 2020
Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms
Guy Bresler
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
Xian Wu
30
61
0
16 Jun 2020
Zeroth-order Deterministic Policy Gradient
Harshat Kumar
Dionysios S. Kalogerias
George J. Pappas
Alejandro Ribeiro
OffRL
25
14
0
12 Jun 2020
Multi-Agent Reinforcement Learning in Stochastic Networked Systems
Yiheng Lin
Guannan Qu
Longbo Huang
Adam Wierman
34
38
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
159
11
0
08 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
23
114
0
04 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
36
125
0
26 May 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
Yue Wang
Shaofeng Zou
17
21
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
92
146
0
04 May 2020
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
Minghao Han
Lixian Zhang
Jun Wang
Wei Pan
16
106
0
29 Apr 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
27
25
0
27 Apr 2020
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
Wenlong Mou
C. J. Li
Martin J. Wainwright
Peter L. Bartlett
Michael I. Jordan
33
75
0
09 Apr 2020
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
P. Amortila
Doina Precup
Prakash Panangaden
Marc G. Bellemare
15
8
0
27 Mar 2020
Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence
Abhishek Gupta
W. Haskell
6
4
0
25 Mar 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Michael I. Jordan
OffRL
27
47
0
16 Mar 2020
Adaptive Temporal Difference Learning with Linear Function Approximation
Tao Sun
Han Shen
Tianyi Chen
Dongsheng Li
8
23
0
20 Feb 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling
Huaqing Xiong
Tengyu Xu
Yingbin Liang
Wei Zhang
25
33
0
15 Feb 2020
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
Shuhang Chen
Adithya M. Devraj
Ana Bušić
Sean P. Meyn
21
31
0
07 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
28
37
0
05 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
30
33
0
05 Feb 2020
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise
Maxim Kaledin
Eric Moulines
A. Naumov
V. Tadic
Hoi-To Wai
8
73
0
04 Feb 2020
Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
53
33
0
03 Feb 2020
Reanalysis of Variance Reduced Temporal Difference Learning
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
OffRL
32
38
0
07 Jan 2020
Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation
Thinh T. Doan
21
36
0
23 Dec 2019
A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
Pan Xu
Quanquan Gu
29
66
0
10 Dec 2019
Decentralized Multi-Agent Reinforcement Learning with Networked Agents: Recent Advances
Kaipeng Zhang
Zhuoran Yang
Tamer Basar
6
67
0
09 Dec 2019
Previous
1
2
3
4
5
Next