ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1902.02234
  4. Cited By
Finite-Sample Analysis for SARSA with Linear Function Approximation

Finite-Sample Analysis for SARSA with Linear Function Approximation

6 February 2019
Shaofeng Zou
Tengyu Xu
Yingbin Liang
ArXivPDFHTML

Papers citing "Finite-Sample Analysis for SARSA with Linear Function Approximation"

45 / 95 papers shown
Title
Variance Reduction based Experience Replay for Policy Optimization
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
39
2
0
17 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently
  finding the Optimal Policy for Linear MDPs
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Naman Agarwal
Syomantak Chaudhuri
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
40
21
0
16 Oct 2021
Sim and Real: Better Together
Sim and Real: Better Together
Shirli Di-Castro Shashua
Dotan DiCastro
Shie Mannor
58
11
0
01 Oct 2021
A Two-Time-Scale Stochastic Optimization Framework with Applications in
  Control and Reinforcement Learning
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Sihan Zeng
Thinh T. Doan
Justin Romberg
65
22
0
29 Sep 2021
Online Robust Reinforcement Learning with Model Uncertainty
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and
  Control
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
15
11
0
11 Aug 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function
  Approximation
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
32
9
0
14 Jun 2021
Successive Convex Approximation Based Off-Policy Optimization for
  Constrained Reinforcement Learning
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning
Chang Tian
An Liu
Guang-Li Huang
Wu Luo
16
12
0
26 May 2021
Deeply-Debiased Off-Policy Interval Estimation
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
25
36
0
10 May 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
24
1
0
15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth
  Function Approximation
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Yue Wang
Shaofeng Zou
Yi Zhou
14
11
0
07 Apr 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved
  Complexity
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with
  Near-Optimal Sample Complexity and Communication Complexity
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
21
7
0
24 Mar 2021
Breaking the Deadly Triad with a Target Network
Breaking the Deadly Triad with a Target Network
Shangtong Zhang
Hengshuai Yao
Shimon Whiteson
AAML
20
43
0
21 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence
  and Linear Speedup
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaipeng Zhang
Min-Fong Hong
Tianyi Chen
35
28
0
31 Dec 2020
On Convergence of Gradient Expected Sarsa($λ$)
On Convergence of Gradient Expected Sarsa(λλλ)
Long Yang
Gang Zheng
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
21
2
0
14 Dec 2020
Reinforcement Learning Control of Constrained Dynamic Systems with
  Uniformly Ultimate Boundedness Stability Guarantee
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee
Minghao Han
Yuan Tian
Lixian Zhang
Jun Wang
Wei Pan
8
46
0
13 Nov 2020
Sample Complexity Bounds for Two Timescale Value-based Reinforcement
  Learning Algorithms
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms
Tengyu Xu
Yingbin Liang
13
26
0
10 Nov 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence
  Analysis
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
8
14
0
26 Oct 2020
Finite-Time Analysis for Double Q-learning
Finite-Time Analysis for Double Q-learning
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
25
31
0
29 Sep 2020
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis
Arunselvan Ramaswamy
Eyke Hüllermeier
23
4
0
25 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
18
42
0
02 Aug 2020
Momentum Q-learning with Finite-Sample Convergence Guarantee
Momentum Q-learning with Finite-Sample Convergence Guarantee
Bowen Weng
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
8
8
0
30 Jul 2020
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient
  Descent
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
14
25
0
15 Jul 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with
  Feature Mapping
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou
Jiafan He
Quanquan Gu
30
133
0
23 Jun 2020
Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation
Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation
Devavrat Shah
Dogyoon Song
Zhi Xu
Yuzhe Yang
19
31
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A
  Mean-Field Theory
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
135
11
0
08 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
  Variance Reduction
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
20
114
0
04 Jun 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation
  under Markovian Noise
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
Yue Wang
Shaofeng Zou
9
21
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural)
  Actor-Critic Algorithms
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
Minghao Han
Lixian Zhang
Jun Wang
Wei Pan
8
106
0
29 Apr 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
21
25
0
27 Apr 2020
Agnostic Q-learning with Function Approximation in Deterministic
  Systems: Tight Bounds on Approximation Error and Sample Complexity
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
S. Du
J. Lee
G. Mahajan
Ruosong Wang
10
37
0
17 Feb 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning
  Algorithms under Markovian Sampling
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling
Huaqing Xiong
Tengyu Xu
Yingbin Liang
Wei Zhang
19
33
0
15 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov
  Property in Sequential Decision Making
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
28
37
0
05 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement
  Learning Framework
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
30
33
0
05 Feb 2020
Reanalysis of Variance Reduced Temporal Difference Learning
Reanalysis of Variance Reduced Temporal Difference Learning
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
OffRL
26
38
0
07 Jan 2020
Scalable Reinforcement Learning for Multi-Agent Networked Systems
Scalable Reinforcement Learning for Multi-Agent Networked Systems
Guannan Qu
Adam Wierman
Na Li
16
33
0
05 Dec 2019
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning
  Algorithms
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms
Dong-hwan Lee
Niao He
10
28
0
04 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and
  Algorithms
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaipeng Zhang
Zhuoran Yang
Tamer Basar
63
1,184
0
24 Nov 2019
Generalized Speedy Q-learning
Generalized Speedy Q-learning
I. John
Chandramouli Kamanchi
S. Bhatnagar
9
17
0
01 Nov 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement
  Learning with Function Approximation
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
104
80
0
18 Oct 2019
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over
  Markovian Samples
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples
Tengyu Xu
Shaofeng Zou
Yingbin Liang
17
73
0
26 Sep 2019
A simpler approach to obtaining an O(1/t) convergence rate for the
  projected stochastic subgradient method
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
128
259
0
10 Dec 2012
Previous
12