arXiv:1905.06265

Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning

15 May 2019
Martin J. Wainwright
Abstract

Motivated by the study of $Q$-learning algorithms in reinforcement learning, we study a class of stochastic approximation procedures based on operators that satisfy monotonicity and quasi-contractivity conditions with respect to an underlying cone. We prove a general sandwich relation on the iterate error at each time, and use it to derive non-asymptotic bounds on the error in terms of a cone-induced gauge norm. These results are derived within a deterministic framework, requiring no assumptions on the noise. We illustrate these general bounds in application to synchronous $Q$-learning for discounted Markov decision processes with discrete state-action spaces, in particular by deriving non-asymptotic bounds on the $\ell_\infty$-norm for a range of stepsizes. These results are the sharpest known to date, and we show via simulation that the dependence of our bounds cannot be improved in a worst-case sense. These results show that relative to a model-based $Q$-iteration, the $\ell_\infty$-based sample complexity of $Q$-learning is suboptimal in terms of the discount factor $\gamma$.
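
For readers unfamiliar with the setting, the following is a minimal sketch of synchronous $Q$-learning on a small tabular discounted MDP, compared in $\ell_\infty$-norm against the fixed point of the Bellman optimality operator. It is an illustration under stated assumptions, not the paper's code: the function names, the array layout `P[s, a, s']`, the zero initialization, and the decaying stepsize $1/(1 + (1-\gamma)k)$ are all choices made here for the example.

```python
import numpy as np


def bellman_optimality(Q, P, r, gamma):
    """Population Bellman operator: (TQ)(s,a) = r(s,a) + gamma * E[max_a' Q(s',a')]."""
    # P has shape (S, A, S); Q.max(axis=1) has shape (S,), so the product is (S, A).
    return r + gamma * (P @ Q.max(axis=1))


def q_star_by_value_iteration(P, r, gamma, tol=1e-12, max_iters=100_000):
    """Model-based Q-iteration: iterate the exact operator until it stops changing."""
    Q = np.zeros_like(r)
    for _ in range(max_iters):
        Q_next = bellman_optimality(Q, P, r, gamma)
        if np.abs(Q_next - Q).max() < tol:
            return Q_next
        Q = Q_next
    return Q


def synchronous_q_learning(P, r, gamma, num_iters, rng):
    """Synchronous Q-learning: one sampled next state per (s, a) pair per iteration."""
    S, A = r.shape
    Q = np.zeros((S, A))
    cdf = np.cumsum(P, axis=2)  # per-(s, a) cumulative transition probabilities
    for k in range(1, num_iters + 1):
        # Assumed stepsize for this sketch; other decaying schedules are possible.
        step = 1.0 / (1.0 + (1.0 - gamma) * k)
        # Inverse-CDF sampling of s' ~ P(. | s, a) for all (s, a) at once.
        u = rng.random((S, A, 1))
        next_states = np.minimum((u > cdf).sum(axis=2), S - 1)
        # Empirical (noisy) Bellman target built from the sampled next states.
        target = r + gamma * Q.max(axis=1)[next_states]
        Q = (1.0 - step) * Q + step * target
    return Q


def random_mdp(S, A, rng):
    """Hypothetical test problem: random transitions and rewards in [0, 1]."""
    P = rng.random((S, A, S))
    P /= P.sum(axis=2, keepdims=True)
    r = rng.random((S, A))
    return P, r
```

A typical use would be to build a problem with `random_mdp`, compute the reference solution with `q_star_by_value_iteration`, and track `np.abs(Q_hat - Q_star).max()` as `num_iters` grows or as `gamma` approaches 1.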
