Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2001.01898
Cited By
Reanalysis of Variance Reduced Temporal Difference Learning
7 January 2020
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reanalysis of Variance Reduced Temporal Difference Learning"
19 / 19 papers shown
Title
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
68
0
0
26 Apr 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
n-Step Temporal Difference Learning with Optimal n
Lakshmi Mandal
S. Bhatnagar
29
2
0
13 Mar 2023
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
37
13
0
24 Dec 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
45
51
0
09 Oct 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
50
27
0
08 Aug 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
26
33
0
25 Jun 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
48
75
0
12 Feb 2021
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
11
14
0
26 Oct 2020
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Shuang Qiu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
33
38
0
23 Aug 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
21
25
0
27 Apr 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Michael I. Jordan
OffRL
11
47
0
16 Mar 2020
A Multistep Lyapunov Approach for Finite-Time Analysis of Biased Stochastic Approximation
Gang Wang
Bingcong Li
G. Giannakis
29
28
0
10 Sep 2019
1