arXiv:2310.18554

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

28 October 2023
Junghyun Lee
Se-Young Yun
Kwang-Sung Jun
Abstract

The logistic bandit is a ubiquitous framework for modeling users' choices, e.g., click vs. no click in an advertisement recommender system. We observe that prior works overlook or neglect the dependence on $S \geq \lVert \theta_\star \rVert_2$, where $\theta_\star \in \mathbb{R}^d$ is the unknown parameter vector, which is particularly problematic when $S$ is large, e.g., $S \geq d$. In this work, we improve the dependence on $S$ via a novel approach called regret-to-confidence-set conversion (R2CS), which allows us to construct a convex confidence set based only on the existence of an online learning algorithm with a regret guarantee. Using R2CS, we obtain a strict improvement in the regret bound with respect to $S$ in logistic bandits while retaining computational feasibility and the dependence on other factors such as $d$ and $T$. We apply our new confidence set to the regret analysis of logistic bandits with a new martingale concentration step that circumvents an additional factor of $S$. We then extend this analysis to multinomial logistic bandits and obtain similar improvements in the regret, showing the efficacy of R2CS. Although we apply R2CS to the (multinomial) logistic model, it is a generic approach for developing confidence sets that can be used for various models, which may be of independent interest.
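
To make the setting concrete, here is a minimal sketch of the standard logistic bandit reward model that the abstract refers to: pulling an arm x yields a binary reward (click vs. no click) with success probability sigmoid(x · θ⋆), and S is any known upper bound on the Euclidean norm of θ⋆. The specific values of `theta_star`, `S`, and `arms`, as well as all function names, are hypothetical choices for illustration and are not taken from the paper. The sketch does not implement R2CS or any bandit algorithm.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def click_probability(arm, theta_star):
    """Probability of reward 1 when pulling `arm` under parameter `theta_star`."""
    return sigmoid(sum(a * t for a, t in zip(arm, theta_star)))


def pull(arm, theta_star, rng):
    """Sample a Bernoulli reward (1 = click, 0 = no click)."""
    return 1 if rng.random() < click_probability(arm, theta_star) else 0


# Hypothetical instance (assumption, not from the paper): d = 2.
theta_star = [3.0, -4.0]   # unknown to the learner in a real bandit problem
S = 5.0                    # assumed known bound with S >= ||theta_star||_2
arms = [[1.0, 0.0], [0.0, 1.0], [0.6, -0.8]]

rng = random.Random(0)
best_arm = max(arms, key=lambda a: click_probability(a, theta_star))
rewards = [pull(best_arm, theta_star, rng) for _ in range(10)]
```

In this toy instance the learner would observe only `rewards`, never `theta_star`; the abstract's concern is how regret bounds scale as the bound `S` grows.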
