Finite-Time Analysis of Simultaneous Double Q-learning

14 June 2024 · Hyunjun Na, Donghwan Lee
arXiv:2406.09946 (PDF, HTML)
Abstract

Q-learning is one of the most fundamental reinforcement learning (RL) algorithms. Despite its widespread success in various applications, it is prone to overestimation bias in the Q-learning update. To address this issue, double Q-learning employs two independent Q-estimators that are randomly selected and updated during the learning process. This paper proposes a modified double Q-learning, called simultaneous double Q-learning (SDQ), together with its finite-time analysis. SDQ eliminates the need for random selection between the two Q-estimators, and this modification allows us to analyze double Q-learning through the lens of a novel switching system framework, facilitating efficient finite-time analysis. Empirical studies demonstrate that SDQ converges faster than double Q-learning while retaining the ability to mitigate maximization bias. Finally, we derive a finite-time expected error bound for SDQ.
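To make the idea concrete, here is a minimal tabular sketch in Python of the simultaneous update the abstract describes: both Q-estimators move at every step, with no random selection between them. The abstract does not spell out the exact target, so the cross-estimator evaluation below (each table's greedy action scored by the other table) is an assumption carried over from standard double Q-learning, and all names here are illustrative.

```python
import numpy as np

def sdq_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One simultaneous double Q-learning (SDQ) step on tabular
    estimators qa, qb of shape [num_states, num_actions].

    Sketch only: unlike double Q-learning, which randomly picks one
    estimator to update, SDQ (per the abstract) updates both at every
    step. The cross-estimator target below is assumed, not taken from
    the paper; it is the standard double-Q form that suppresses
    maximization bias.
    """
    a_star = np.argmax(qa[s_next])           # greedy action w.r.t. Q^A
    b_star = np.argmax(qb[s_next])           # greedy action w.r.t. Q^B

    # Each greedy action is evaluated by the *other* estimator,
    # which is what mitigates overestimation.
    target_a = r + gamma * qb[s_next, a_star]
    target_b = r + gamma * qa[s_next, b_star]

    # Simultaneous update: no coin flip choosing which table moves.
    qa[s, a] += alpha * (target_a - qa[s, a])
    qb[s, a] += alpha * (target_b - qb[s, a])

# Tiny usage example on a 3-state, 2-action problem.
qa = np.zeros((3, 2))
qb = np.zeros((3, 2))
sdq_update(qa, qb, s=0, a=1, r=1.0, s_next=2)
```

For contrast, classical double Q-learning would flip a coin each step and apply only one of the two assignments above; removing that randomness is what the abstract credits with enabling the switching-system analysis and the observed faster convergence.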
