2006.14364
Cited By
Finite-Sample Analysis of Proximal Gradient TD Algorithms
6 June 2020
Bo Liu
Ji Liu
Mohammad Ghavamzadeh
Sridhar Mahadevan
Marek Petrik
Papers citing
"Finite-Sample Analysis of Proximal Gradient TD Algorithms"
44 / 44 papers shown
Title
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
37
0
0
07 May 2024
Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise
Shaan ul Haque
S. Khodadadian
S. T. Maguluri
44
11
0
31 Dec 2023
TD Convergence: An Optimization Perspective
Kavosh Asadi
Shoham Sabach
Yao Liu
Omer Gottesman
Rasool Fakoor
MU
25
8
0
30 Jun 2023
Backstepping Temporal Difference Learning
Han-Dong Lim
Dong-hwan Lee
OffRL
41
2
0
20 Feb 2023
Gradient Descent Temporal Difference-difference Learning
Rong Zhu
James M. Murray
OffRL
24
1
0
10 Sep 2022
Finite-Time Error Bounds for Greedy-GQ
Yue Wang
Yi Zhou
Shaofeng Zou
34
1
0
06 Sep 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
45
5
0
01 Jun 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
35
2
0
06 Feb 2022
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
An Empirical Comparison of Off-policy Prediction Learning Algorithms in the Four Rooms Environment
Sina Ghiassian
R. Sutton
AAML
OffRL
21
6
0
10 Sep 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
19
5
0
02 Jun 2021
An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task
Sina Ghiassian
R. Sutton
AAML
OffRL
19
5
0
02 Jun 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Shuang Qiu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
33
38
0
23 Aug 2020
Stable and Efficient Policy Evaluation
Daoming Lyu
Bo Liu
M. Geist
Wen Dong
S. Biaz
Qi Wang
OffRL
8
7
0
06 Jun 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
29
38
0
22 Apr 2020
Reinforcement Learning via Fenchel-Rockafellar Duality
Ofir Nachum
Bo Dai
OffRL
16
118
0
07 Jan 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
63
1,184
0
24 Nov 2019
A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound
Gal Dalal
Balazs Szorenyi
Gugan Thoppe
OffRL
11
53
0
20 Nov 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
104
80
0
18 Oct 2019
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
Zuyue Fu
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
27
54
0
16 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
A Multistep Lyapunov Approach for Finite-Time Analysis of Biased Stochastic Approximation
Gang Wang
Bingcong Li
G. Giannakis
31
28
0
10 Sep 2019
Gradient Q(σ, λ): A Unified Algorithm with Function Approximation for Reinforcement Learning
Long Yang
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
20
1
0
06 Sep 2019
Finite-Time Performance of Distributed Temporal Difference Learning with Linear Function Approximation
Thinh T. Doan
S. T. Maguluri
Justin Romberg
30
41
0
25 Jul 2019
Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization
Adithya M. Devraj
Jianshu Chen
25
13
0
22 Jul 2019
On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
Zhuoran Yang
Yongxin Chen
Mingyi Hong
Zhaoran Wang
32
39
0
14 Jul 2019
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
8
57
0
17 Jun 2019
A Kernel Loss for Solving the Bellman Equation
Yihao Feng
Lihong Li
Qiang Liu
30
70
0
25 May 2019
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai
Zhuoran Yang
Jason D. Lee
Zhaoran Wang
42
29
0
24 May 2019
Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents
Kaiqing Zhang
Zhuoran Yang
Han Liu
Tong Zhang
Tamer Basar
OffRL
26
26
0
06 Dec 2018
Multi-Agent Fully Decentralized Value Function Learning with Linear Convergence Rates
Lucas Cassano
Kun Yuan
Ali H. Sayed
22
39
0
17 Oct 2018
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Jalaj Bhandari
Daniel Russo
Raghav Singal
38
334
0
06 Jun 2018
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
Hoi-To Wai
Zhuoran Yang
Zhaoran Wang
Mingyi Hong
30
169
0
03 Jun 2018
Model-Free Linear Quadratic Control via Reduction to Expert Prediction
Yasin Abbasi-Yadkori
N. Lazić
Csaba Szepesvári
OffRL
18
94
0
17 Apr 2018
Path Consistency Learning in Tsallis Entropy Regularized MDPs
Ofir Nachum
Yinlam Chow
Mohammad Ghavamzadeh
26
45
0
10 Feb 2018
SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation
Bo Dai
Albert Eaton Shaw
Lihong Li
Lin Xiao
Niao He
Zhen Liu
Jianshu Chen
Le Song
34
25
0
29 Dec 2017
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Stephen Tu
Benjamin Recht
OffRL
34
130
0
22 Dec 2017
Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging
Chandrashekar Lakshminarayanan
Csaba Szepesvári
26
12
0
12 Sep 2017
Accelerating Stochastic Composition Optimization
Mengdi Wang
Ji Liu
Ethan X. Fang
10
145
0
25 Jul 2016
Learning from Conditional Distributions via Dual Embeddings
Bo Dai
Niao He
Yunpeng Pan
Byron Boots
Le Song
35
21
0
15 Jul 2016
Investigating practical linear temporal difference learning
Adam White
Martha White
OffRL
10
41
0
28 Feb 2016
Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize
Huizhen Yu
25
29
0
23 Nov 2015
Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis
Assaf Hallak
Aviv Tamar
Rémi Munos
Shie Mannor
OffRL
44
56
0
17 Sep 2015