Reanalysis of Variance Reduced Temporal Difference Learning

7 January 2020
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
    OffRL

Papers citing "Reanalysis of Variance Reduced Temporal Difference Learning"

19 / 19 papers shown
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
68
0
0
26 Apr 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
n-Step Temporal Difference Learning with Optimal n
Lakshmi Mandal
S. Bhatnagar
29
2
0
13 Mar 2023
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
37
13
0
24 Dec 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
45
51
0
09 Oct 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
50
27
0
08 Aug 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
26
33
0
25 Jun 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Yuxin Chen
Yuting Wei
Yuejie Chi
OffRL
48
75
0
12 Feb 2021
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
11
14
0
26 Oct 2020
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Shuang Qiu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
33
38
0
23 Aug 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
24
25
0
27 Apr 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Michael I. Jordan
OffRL
15
47
0
16 Mar 2020
A Multistep Lyapunov Approach for Finite-Time Analysis of Biased Stochastic Approximation
Gang Wang
Bingcong Li
G. Giannakis
31
28
0
10 Sep 2019