arXiv:1902.02234
Finite-Sample Analysis for SARSA with Linear Function Approximation
6 February 2019
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Papers citing "Finite-Sample Analysis for SARSA with Linear Function Approximation"
45 / 95 papers shown
Title
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
39
2
0
17 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Naman Agarwal
Syomantak Chaudhuri
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
40
21
0
16 Oct 2021
Sim and Real: Better Together
Shirli Di-Castro Shashua
Dotan DiCastro
Shie Mannor
58
11
0
01 Oct 2021
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Sihan Zeng
Thinh T. Doan
Justin Romberg
65
22
0
29 Sep 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
15
11
0
11 Aug 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
32
9
0
14 Jun 2021
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning
Chang Tian
An Liu
Guang-Li Huang
Wu Luo
16
12
0
26 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
25
36
0
10 May 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
24
1
0
15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Yue Wang
Shaofeng Zou
Yi Zhou
14
11
0
07 Apr 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
21
7
0
24 Mar 2021
Breaking the Deadly Triad with a Target Network
Shangtong Zhang
Hengshuai Yao
Shimon Whiteson
AAML
20
43
0
21 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaiqing Zhang
Mingyi Hong
Tianyi Chen
35
28
0
31 Dec 2020
On Convergence of Gradient Expected Sarsa(λ)
Long Yang
Gang Zheng
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
21
2
0
14 Dec 2020
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee
Minghao Han
Yuan Tian
Lixian Zhang
Jun Wang
Wei Pan
8
46
0
13 Nov 2020
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms
Tengyu Xu
Yingbin Liang
13
26
0
10 Nov 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
8
14
0
26 Oct 2020
Finite-Time Analysis for Double Q-learning
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
25
31
0
29 Sep 2020
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis
Arunselvan Ramaswamy
Eyke Hüllermeier
23
4
0
25 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
18
42
0
02 Aug 2020
Momentum Q-learning with Finite-Sample Convergence Guarantee
Bowen Weng
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
8
8
0
30 Jul 2020
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
14
25
0
15 Jul 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou
Jiafan He
Quanquan Gu
30
133
0
23 Jun 2020
Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation
Devavrat Shah
Dogyoon Song
Zhi Xu
Yuzhe Yang
19
31
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
135
11
0
08 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
20
114
0
04 Jun 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
Yue Wang
Shaofeng Zou
9
21
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
Minghao Han
Lixian Zhang
Jun Wang
Wei Pan
8
106
0
29 Apr 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
21
25
0
27 Apr 2020
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
S. Du
J. Lee
G. Mahajan
Ruosong Wang
10
37
0
17 Feb 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling
Huaqing Xiong
Tengyu Xu
Yingbin Liang
Wei Zhang
19
33
0
15 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
28
37
0
05 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
30
33
0
05 Feb 2020
Reanalysis of Variance Reduced Temporal Difference Learning
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
OffRL
26
38
0
07 Jan 2020
Scalable Reinforcement Learning for Multi-Agent Networked Systems
Guannan Qu
Adam Wierman
Na Li
16
33
0
05 Dec 2019
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms
Dong-hwan Lee
Niao He
10
28
0
04 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
63
1,184
0
24 Nov 2019
Generalized Speedy Q-learning
I. John
Chandramouli Kamanchi
S. Bhatnagar
9
17
0
01 Nov 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
104
80
0
18 Oct 2019
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples
Tengyu Xu
Shaofeng Zou
Yingbin Liang
17
73
0
26 Sep 2019
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
128
259
0
10 Dec 2012