Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

8 June 2020

Papers citing "Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory"


Title
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions Nishil Patel Sebastian Lee Stefano Sarao Mannelli Sebastian Goldt Andrew Saxe OffRL 28 3 0 17 Jun 2023
Towards a Better Understanding of Representation Dynamics under TD-learning Yunhao Tang Rémi Munos OffRL 23 1 0 29 May 2023
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime Andrea Agazzi Jianfeng Lu 13 15 0 22 Oct 2020