2203.02628
Cited By
Target Network and Truncation Overcome The Deadly Triad in $Q$-Learning
5 March 2022
Zaiwei Chen
John-Paul Clarke
S. T. Maguluri
Papers citing
"Target Network and Truncation Overcome The Deadly Triad in $Q$-Learning"
17 / 17 papers shown
Title
Understanding the theoretical properties of projected Bellman equation, linear Q-learning, and approximate value iteration
Han-Dong Lim
Donghwan Lee
21
0
0
15 Apr 2025
Dual Approximation Policy Optimization
Zhihan Xiong
Maryam Fazel
Lin Xiao
38
1
0
02 Oct 2024
Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn
Hongyao Tang
Glen Berseth
OffRL
45
1
0
07 Sep 2024
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
Fengdi Che
Chenjun Xiao
Jincheng Mei
Bo Dai
Ramki Gummadi
Oscar A Ramirez
Christopher K Harris
A. R. Mahmood
Dale Schuurmans
38
5
0
31 May 2024
Enhancing Q-Learning with Large Language Model Heuristics
Xiefeng Wu
LRM
32
0
0
06 May 2024
Analysis of Off-Policy Multi-Step TD-Learning with Linear Function Approximation
Donghwan Lee
48
0
0
24 Feb 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
38
2
0
26 Jan 2024
Multi-Bellman operator for convergence of $Q$-learning with linear function approximation
Diogo S. Carvalho
D. L. McPherson
Francisco S. Melo
29
1
0
28 Sep 2023
Stability of Q-Learning Through Design and Optimism
Sean P. Meyn
31
10
0
05 Jul 2023
Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms
Yashaswini Murthy
Mehrdad Moharrami
R. Srikant
OffRL
32
5
0
02 Feb 2023
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation
Gandharv Patil
Prashanth L.A.
Dheeraj M. Nagaraj
Doina Precup
11
15
0
12 Oct 2022
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle
Ziniu Li
Tian Xu
Yang Yu
55
5
0
22 Mar 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Regularized Q-learning
Han-Dong Lim
Donghwan Lee
27
10
0
11 Feb 2022
Rethinking ValueDice: Does It Really Improve Performance?
Ziniu Li
Tian Xu
Yang Yu
Zhimin Luo
OffRL
23
17
0
05 Feb 2022
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
122
166
0
06 Jan 2021