1909.11907
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples
26 September 2019
Tengyu Xu
Shaofeng Zou
Yingbin Liang
Papers citing
"Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples"
30 / 30 papers shown
Title
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
38
2
0
26 Jan 2024
Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise
Shaan ul Haque
S. Khodadadian
S. T. Maguluri
44
11
0
31 Dec 2023
High-probability sample complexities for policy evaluation with linear function approximation
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
33
7
0
30 May 2023
Statistical Inference with Stochastic Gradient Methods under φ-mixing Data
Ruiqi Liu
Xinyu Chen
Zuofeng Shang
FedML
19
6
0
24 Feb 2023
Finite-Time Error Bounds for Greedy-GQ
Yue Wang
Yi Zhou
Shaofeng Zou
34
1
0
06 Sep 2022
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation
Xing-ming Guo
Bin Hu
13
2
0
20 Apr 2022
Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods
Xing-ming Guo
Bin Hu
OffRL
30
3
0
14 Feb 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
35
2
0
06 Feb 2022
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation
Yixuan Lin
V. Gupta
Ji Liu
32
3
0
24 Nov 2021
PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method
Ziwei Guan
Tengyu Xu
Yingbin Liang
17
4
0
13 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Ziyi Chen
Yi Zhou
Rongrong Chen
Shaofeng Zou
19
24
0
08 Sep 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
32
9
0
14 Jun 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Yuxin Chen
Yuting Wei
Yuejie Chi
OffRL
50
75
0
12 Feb 2021
On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning
Alain Durmus
Eric Moulines
A. Naumov
S. Samsonov
Hoi-To Wai
27
19
0
30 Jan 2021
Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER
Markus Holzleitner
Lukas Gruber
Jose A. Arjona-Medina
Johannes Brandstetter
Sepp Hochreiter
33
38
0
02 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
49
122
0
11 Nov 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
11
14
0
26 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
21
42
0
02 Aug 2020
Multi-Agent Reinforcement Learning in Stochastic Networked Systems
Yiheng Lin
Guannan Qu
Longbo Huang
Adam Wierman
34
38
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
153
11
0
08 Jun 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
34
125
0
26 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
26
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
24
25
0
27 Apr 2020
Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation
Thinh T. Doan
21
36
0
23 Dec 2019
Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning
Harsh Gupta
R. Srikant
Lei Ying
15
85
0
14 Jul 2019
Finite-Sample Analysis for SARSA with Linear Function Approximation
Shaofeng Zou
Tengyu Xu
Yingbin Liang
32
146
0
06 Feb 2019