The Statistical Benefits of Quantile Temporal-Difference Learning for
Value Estimation

28 May 2023

Marc G. Bellemare

Papers citing "The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation"

Title | Authors | Tags | Counts | Date
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation | Hongyi Zhou, Josiah P. Hanna, Jin Zhu, Ying Yang, Chengchun Shi | OffRL | 64 0 0 | 28 May 2025
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL | Jesse Farebrother, Jordi Orbay, Q. Vuong, Adrien Ali Taïga, Yevgen Chebotar, ..., Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal | OffRL | 105 66 0 | 06 Mar 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | Kaiwen Wang, Owen Oertell, Alekh Agarwal, Nathan Kallus, Wen Sun | OffRL | 125 12 0 | 11 Feb 2024
Beyond Average Return in Markov Decision Processes | Alexandre Marthe, Aurélien Garivier, Claire Vernade | - | 71 7 0 | 31 Oct 2023
Robust Offline Reinforcement learning with Heavy-Tailed Rewards | Jin Zhu, Runzhe Wan, Zhengling Qi, Shuang Luo, C. Shi | OffRL | 76 1 0 | 28 Oct 2023
Variance Control for Distributional Reinforcement Learning | Qi Kuang, Zhoufan Zhu, Liwen Zhang, Fan Zhou | OffRL | 143 3 0 | 30 Jul 2023
An Analysis of Quantile Temporal-Difference Learning | Mark Rowland, Rémi Munos, M. G. Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, K. Tuyls, Marc G. Bellemare, Will Dabney | - | 159 24 0 | 11 Jan 2023
Distributional Reinforcement Learning by Sinkhorn Divergence | Ke Sun, Yingnan Zhao, Wulong Liu, Bei Jiang, Linglong Kong | - | 76 0 0 | 01 Feb 2022
The Benefits of Being Categorical Distributional: Uncertainty-aware Regularized Exploration in Reinforcement Learning | Ke Sun, Yingnan Zhao, Enze Shi, Yafei Wang, Xiaodong Yan, Bei Jiang, Linglong Kong | OOD, OffRL, UQCV | 86 2 0 | 07 Oct 2021