Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2102.01567
Cited By
A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants
2 February 2021
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants"
18 / 18 papers shown
Title
No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling
Ningyuan Chen
Xuefeng Gao
Yi Xiong
52
0
0
23 May 2024
Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
Rajeeva Laxman Karandikar
M. Vidyasagar
25
8
0
05 Dec 2023
Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach
Dong-hwan Lee
21
2
0
09 Jun 2023
First-order Policy Optimization for Robust Markov Decision Process
Yan Li
Guanghui Lan
Tuo Zhao
77
23
0
21 Sep 2022
Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View
Han-Dong Lim
Dong-hwan Lee
30
1
0
25 Jul 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Target Network and Truncation Overcome The Deadly Triad in
Q
Q
Q
-Learning
Zaiwei Chen
John-Paul Clarke
S. T. Maguluri
18
19
0
05 Mar 2022
On the Convergence of SARSA with Linear Function Approximation
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
13
10
0
14 Feb 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
40
15
0
29 Dec 2021
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
37
13
0
24 Dec 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
30
10
0
04 Nov 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
45
51
0
09 Oct 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
50
27
0
08 Aug 2021
Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Siddharth Chandak
Vivek Borkar
Parth Dodhia
43
17
0
27 Jun 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
63
29
0
26 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
40
56
0
04 May 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
48
75
0
12 Feb 2021
1