Factors of Influence of the Overestimation Bias of Q-Learning

11 October 2022
Julius Wagenbach
M. Sabatelli
Abstract

We study whether the learning rate α, the discount factor γ, and the reward signal r influence the overestimation bias of the Q-Learning algorithm. Our preliminary results, obtained in stochastic environments that require neural networks as function approximators, show that all three parameters influence overestimation significantly. By carefully tuning α and γ, and by using an exponential moving average of r in Q-Learning's temporal difference target, we show that the algorithm can learn value estimates that are more accurate than those of several other popular model-free methods that have previously addressed its overestimation bias.
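
To make the idea of an exponential moving average of r in the temporal difference target concrete, the following is a minimal tabular sketch, not the authors' implementation. The abstract concerns neural-network function approximators and does not specify how the average is maintained, so everything here is an assumption for illustration: the function name `ema_q_update`, the smoothing rate `beta`, the use of a single global running average `r_ema`, and the default values of `alpha` and `gamma`.

```python
import numpy as np


def ema_q_update(Q, r_ema, s, a, r, s_next, done,
                 alpha=0.1, gamma=0.95, beta=0.1):
    """One tabular Q-Learning step whose TD target uses a smoothed reward.

    Q      : 2-D array of shape (n_states, n_actions), updated in place
    r_ema  : current exponential moving average of the reward (assumed global)
    beta   : hypothetical smoothing rate for the moving average
    """
    # Update the exponential moving average with the newly observed reward.
    r_ema = (1.0 - beta) * r_ema + beta * r

    # Usual max-based bootstrap; terminal transitions do not bootstrap.
    bootstrap = 0.0 if done else gamma * np.max(Q[s_next])

    # The TD target uses the smoothed reward instead of the raw reward r.
    target = r_ema + bootstrap
    Q[s, a] += alpha * (target - Q[s, a])
    return Q, r_ema


# Usage: one update on a 2-state, 2-action table.
Q = np.zeros((2, 2))
Q, r_ema = ema_q_update(Q, r_ema=0.0, s=0, a=1, r=1.0, s_next=1, done=False)
```

In this sketch, `alpha` scales how far each estimate moves toward the target, `gamma` scales the bootstrapped maximum, and `beta` controls how strongly a single noisy reward enters the target. These are the three quantities the abstract identifies as influencing overestimation.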
