  3. 2303.05018
Improved Regret Bounds for Online Kernel Selection under Bandit Feedback

9 March 2023
Junfan Li
Shizhong Liao
Abstract

In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an $O((\Vert f\Vert^2_{\mathcal{H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})$ expected bound for Lipschitz loss functions. We prove two types of regret bounds improving on this bound. For smooth loss functions, we propose an algorithm with an $O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum^K_{i=1}L_T(f^\ast_i))^{\frac{2}{3}})$ expected bound, where $L_T(f^\ast_i)$ is the cumulative loss of the optimal hypothesis in $\mathbb{H}_{i}=\{f\in\mathcal{H}_i:\Vert f\Vert_{\mathcal{H}_i}\leq U\}$. This data-dependent bound retains the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an $O(U\sqrt{KT}\ln^{\frac{2}{3}}{T})$ expected bound, asymptotically improving on the previous bound. We apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds matching or improving the previous $O(\sqrt{T\ln{K}}+\Vert f\Vert^2_{\mathcal{H}_i}\max\{\sqrt{T},\frac{T}{\sqrt{\mathcal{R}}}\})$ expected bound, where $\mathcal{R}$ is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
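To make the setting concrete, the sketch below illustrates the generic structure of online kernel selection under bandit feedback: each round, one of the $K$ candidate kernels is sampled, only that kernel's loss is observed, and an importance-weighted (EXP3-style) update adjusts the sampling distribution. This is a minimal illustrative sketch of the problem setting, not the paper's algorithm; the learning rates, the squared loss, and the per-kernel functional-gradient predictor are all illustrative assumptions.

```python
import numpy as np

def online_kernel_selection_bandit(X, y, kernels, eta=0.1, step=0.01):
    """Illustrative bandit-feedback kernel selection loop (EXP3-style).

    X: (T, d) array of inputs; y: (T,) array of targets.
    kernels: list of K callables k(x1, x2) -> float.
    Returns the cumulative loss incurred by the sampled predictors.
    """
    K = len(kernels)
    weights = np.ones(K)                     # sampling weights over kernels
    coefs = [[] for _ in range(K)]           # per-kernel expansion: (x_s, alpha_s)
    total_loss = 0.0
    for t in range(len(X)):
        p = weights / weights.sum()
        i = np.random.choice(K, p=p)         # bandit feedback: observe one kernel only
        k = kernels[i]
        pred = sum(a * k(xs, X[t]) for xs, a in coefs[i])
        loss = (pred - y[t]) ** 2            # smooth (squared) loss, as an example
        total_loss += loss
        # functional gradient step for the sampled kernel's predictor
        grad = 2.0 * (pred - y[t])
        coefs[i].append((X[t], -step * grad))
        # importance-weighted loss estimate drives the EXP3-style update
        est = min(loss, 1.0) / p[i]
        weights[i] *= np.exp(-eta * est / K)
    return total_loss
```

The key bandit ingredient is the division by `p[i]`: it makes the loss estimate unbiased for every kernel even though only the sampled kernel's loss is observed each round.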
