arXiv:2310.16173
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration

24 October 2023
Shuai Zhang
Hongkang Li
Meng Wang
Miao Liu
Pin-Yu Chen
Songtao Lu
Sijia Liu
K. Murugesan
Subhajit Chaudhury
Abstract

This paper provides a theoretical understanding of the Deep Q-Network (DQN) with ε-greedy exploration in deep reinforcement learning. Despite the tremendous empirical success of the DQN, its theoretical characterization remains underexplored. First, the exploration strategy is either impractical or ignored in existing analyses. Second, in contrast to conventional Q-learning algorithms, the DQN employs a target network and experience replay to acquire an unbiased estimate of the mean-square Bellman error (MSBE) used in training the Q-network. However, existing theoretical analyses of DQNs either lack a convergence analysis or bypass the technical challenges by deploying a significantly overparameterized neural network, which is not computationally efficient. This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with an ε-greedy policy. We prove that an iterative procedure with decaying ε converges to the optimal Q-value function geometrically. Moreover, a higher level of ε enlarges the region of convergence but slows convergence, while the opposite holds for a lower level of ε. Experiments justify our established theoretical insights on DQNs.
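To make the ingredients named in the abstract concrete, below is a minimal sketch (not the authors' implementation) of a DQN training step with ε-greedy exploration: a decaying ε schedule, a frozen target network, and the MSBE estimated from a replayed mini-batch. Network sizes, the exponential decay schedule, and the replay-batch format are illustrative assumptions.

```python
import random
import numpy as np
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Small fully connected Q-network (sizes are illustrative)."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)


def epsilon_greedy_action(q_net: QNetwork, state: np.ndarray, epsilon: float) -> int:
    """With probability epsilon explore uniformly; otherwise act greedily on Q."""
    num_actions = q_net.net[-1].out_features
    if random.random() < epsilon:
        return random.randrange(num_actions)
    with torch.no_grad():
        q_values = q_net(torch.as_tensor(state, dtype=torch.float32))
    return int(torch.argmax(q_values).item())


def decayed_epsilon(t: int, eps_start: float = 1.0, eps_end: float = 0.05,
                    decay: float = 1e-3) -> float:
    """Exponentially decaying epsilon schedule (an illustrative choice of decay)."""
    return eps_end + (eps_start - eps_end) * float(np.exp(-decay * t))


def msbe_loss(q_net: QNetwork, target_net: QNetwork, batch, gamma: float = 0.99):
    """Mean-square Bellman error on a replayed mini-batch (s, a, r, s', done)."""
    states, actions, rewards, next_states, dones = batch
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # the target network is held fixed between periodic syncs
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q
    return nn.functional.mse_loss(q_sa, targets)
```

In a training loop, `decayed_epsilon(t)` would be passed to `epsilon_greedy_action` at step t, transitions stored in a replay buffer, and `msbe_loss` minimized by SGD while the target network is synced to the Q-network every fixed number of steps; the abstract's observation is that larger ε widens the region of convergence of such a procedure while slowing it down.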
