A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models

14 June 2024
Jiayi Wang
Zhengling Qi
Raymond K. W. Wong
Abstract

In this paper, we delve into the statistical analysis of the fitted Q-evaluation (FQE) method, which focuses on estimating the value of a target policy using offline data generated by some behavior policy. We provide a comprehensive theoretical understanding of FQE estimators under both parametric and non-parametric models on the $Q$-function. Specifically, we address three key questions related to FQE that remain largely unexplored in the current literature: (1) Is the optimal convergence rate for estimating the policy value regarding the sample size $n$ ($n^{-1/2}$) achievable for FQE under a non-parametric model with a fixed horizon ($T$)? (2) How does the error bound depend on the horizon $T$? (3) What is the role of the probability ratio function in improving the convergence of FQE estimators? Specifically, we show that under the completeness assumption of $Q$-functions, which is mild in the non-parametric setting, the estimation errors for policy value using both parametric and non-parametric FQE estimators can achieve an optimal rate in terms of $n$. The corresponding error bounds in terms of both $n$ and $T$ are also established. With an additional realizability assumption on ratio functions, the rate of estimation errors can be improved from $T^{1.5}/\sqrt{n}$ to $T/\sqrt{n}$, which matches the sharpest known bound in the current literature under the tabular setting.
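For readers unfamiliar with FQE, the backward regression it performs can be sketched in the simplest tabular, finite-horizon case. This is an illustrative sketch only, not the estimator analyzed in the paper: the function names `fqe_tabular` and `policy_value`, the `(t, s, a, r, s_next)` data layout, the toy MDP, and the choice to leave unvisited state-action cells at zero are all assumptions made here. With a tabular function class, the least-squares regression step reduces to a per-cell average of the regression targets.

```python
import numpy as np

def fqe_tabular(transitions, target_policy, n_states, n_actions, horizon):
    """Tabular finite-horizon FQE (hypothetical helper, for illustration).

    transitions: iterable of (t, s, a, r, s_next) tuples from a behavior policy.
    target_policy: array of shape (horizon, n_states, n_actions) with action
        probabilities of the policy being evaluated.
    Returns q of shape (horizon, n_states, n_actions).
    """
    q = np.zeros((horizon, n_states, n_actions))
    for t in reversed(range(horizon)):
        # State value at step t+1 under the target policy (zero past the horizon).
        if t + 1 < horizon:
            v_next = (target_policy[t + 1] * q[t + 1]).sum(axis=1)
        else:
            v_next = np.zeros(n_states)
        sums = np.zeros((n_states, n_actions))
        counts = np.zeros((n_states, n_actions))
        for step, s, a, r, s_next in transitions:
            if step == t:
                # Regression target: reward plus estimated next-step value.
                sums[s, a] += r + v_next[s_next]
                counts[s, a] += 1
        visited = counts > 0
        # Tabular least squares = mean target per visited (s, a) cell.
        # Assumption: unvisited cells stay at 0.
        q[t][visited] = sums[visited] / counts[visited]
    return q

def policy_value(q, target_policy, initial_dist):
    """Plug-in value estimate: average q[0] over initial states and actions."""
    v0 = (target_policy[0] * q[0]).sum(axis=1)
    return float(initial_dist @ v0)

# Toy deterministic MDP (made up for this example): 2 states, 2 actions, horizon 2.
rewards = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 2.0, (1, 1): 0.0}
next_state = {(0, 0): 1, (0, 1): 0, (1, 0): 1, (1, 1): 0}
horizon, n_states, n_actions = 2, 2, 2
transitions = [
    (t, s, a, rewards[(s, a)], next_state[(s, a)])
    for t in range(horizon) for s in range(n_states) for a in range(n_actions)
]
# Target policy: always take action 0.
target_policy = np.zeros((horizon, n_states, n_actions))
target_policy[:, :, 0] = 1.0

q = fqe_tabular(transitions, target_policy, n_states, n_actions, horizon)
value = policy_value(q, target_policy, initial_dist=np.array([1.0, 0.0]))
```

In this toy problem the target policy collects reward 1 and then reward 2 when started in state 0, so `value` equals 3. The non-parametric setting studied in the paper replaces the per-cell average with a regression over a richer function class.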
