Approximate Temporal Difference Learning is a Gradient Descent for Reversible Policies

2 May 2018

Papers citing "Approximate Temporal Difference Learning is a Gradient Descent for Reversible Policies"


Title
Towards a Better Understanding of Representation Dynamics under TD-learning | Yunhao Tang, Rémi Munos | OffRL | 31 2 0 | 29 May 2023
Distributed TD(0) with Almost No Communication | R. Liu, Alexander Olshevsky | FedML | 28 15 0 | 16 Apr 2021
A semigroup method for high dimensional committor functions based on neural network | Haoya Li, Y. Khoo, Yinuo Ren, Lexing Ying | | 24 6 0 | 12 Dec 2020