High-dimensional Nonparametric Contextual Bandit Problem

20 May 2025
Shogo Iwazaki
Junpei Komiyama
Masaaki Imaizumi
Main:33 Pages
2 Figures
Bibliography:5 Pages
3 Tables
Abstract

We consider the kernelized contextual bandit problem with a large feature space. This problem involves $K$ arms, and the goal of the forecaster is to maximize the cumulative reward by learning the relationship between the contexts and the rewards. It serves as a general framework for various decision-making scenarios, such as personalized online advertising and recommendation systems. Kernelized contextual bandits generalize the linear contextual bandit problem and offer greater modeling flexibility. Existing methods, when applied to Gaussian kernels, yield a trivial bound of $O(T)$ when we consider $\Omega(\log T)$ feature dimensions. To address this, we introduce stochastic assumptions on the context distribution and show that no-regret learning is achievable even when the number of dimensions grows up to the number of samples. Furthermore, we analyze lenient regret, which allows a per-round regret of at most $\Delta > 0$. We derive the rate of lenient regret in terms of $\Delta$.
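
To make the notion of lenient regret concrete, the following is a minimal sketch, not the paper's definition. It assumes the indicator form, in which a round contributes its gap only when that gap exceeds the tolerance; the paper may use a different penalty. The names `lenient_regret`, `standard_regret`, and `gaps` are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def standard_regret(gaps: Sequence[float]) -> float:
    """Cumulative regret: the sum of all per-round gaps."""
    return sum(gaps)


def lenient_regret(gaps: Sequence[float], delta: float) -> float:
    """Sum only the per-round gaps that exceed the tolerance delta.

    Assumption: indicator form, sum_t gap_t * 1[gap_t > delta].
    Rounds whose gap is at most delta are treated as costless.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    return sum(g for g in gaps if g > delta)


# Hypothetical per-round gaps between the best arm and the chosen arm.
gaps = [0.5, 0.05, 0.2, 0.0, 0.08]
print(standard_regret(gaps), lenient_regret(gaps, delta=0.1))
```

Under this assumed form, a larger tolerance can only lower the lenient regret, which is consistent with the abstract stating the rate in terms of $\Delta$.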

@article{iwazaki2025_2505.14102,
  title={High-dimensional Nonparametric Contextual Bandit Problem},
  author={Shogo Iwazaki and Junpei Komiyama and Masaaki Imaizumi},
  journal={arXiv preprint arXiv:2505.14102},
  year={2025}
}