Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation

31 May 2024

Papers citing "Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation"


Title | Authors | Tags | Metrics | Date
Understanding the theoretical properties of projected Bellman equation, linear Q-learning, and approximate value iteration | Han-Dong Lim, Donghwan Lee | - | 16, 0, 0 | 15 Apr 2025
Offline Reinforcement Learning with Implicit Q-Learning | Ilya Kostrikov, Ashvin Nair, Sergey Levine | OffRL | 214, 843, 0 | 12 Oct 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems | Sergey Levine, Aviral Kumar, George Tucker, Justin Fu | OffRL, GP | 340, 1,960, 0 | 04 May 2020
A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option | P. Sharoff, Nishant A. Mehta, Ravi Ganti | - | 26, 2, 0 | 06 Mar 2020