arXiv:2009.08265
Dimension Reduction in Contextual Online Learning via Nonparametric Variable Selection

17 September 2020
Wenhao Li
Ningyuan Chen
Alekh Agarwal
Abstract

We consider a contextual online learning (multi-armed bandit) problem with high-dimensional covariate $\mathbf{x}$ and decision $\mathbf{y}$. The reward function to learn, $f(\mathbf{x},\mathbf{y})$, does not have a particular parametric form. The literature has shown that the optimal regret is $\tilde{O}(T^{(d_x+d_y+1)/(d_x+d_y+2)})$, where $d_x$ and $d_y$ are the dimensions of $\mathbf{x}$ and $\mathbf{y}$, and thus it suffers from the curse of dimensionality. In many applications, only a small subset of variables in the covariate affect the value of $f$, which is referred to as \textit{sparsity} in statistics. To take advantage of the sparsity structure of the covariate, we propose a variable selection algorithm called \textit{BV-LASSO}, which incorporates novel ideas such as binning and voting to apply LASSO to nonparametric settings. Our algorithm achieves the regret $\tilde{O}(T^{(d_x^*+d_y+1)/(d_x^*+d_y+2)})$, where $d_x^*$ is the effective covariate dimension. The regret matches the optimal regret when the covariate is $d_x^*$-dimensional and thus cannot be improved. Our algorithm may serve as a general recipe to achieve dimension reduction via variable selection in nonparametric settings.
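The binning-and-voting idea from the abstract can be sketched as follows: partition the covariate space into local bins, run a LASSO regression within each bin (where a smooth reward function is approximately linear), and keep the variables whose coefficients are nonzero in a majority of bins. This is only an illustrative sketch, not the paper's BV-LASSO algorithm; the binning scheme, penalty level `alpha`, vote threshold `vote_frac`, and the synthetic reward below are all assumptions for demonstration.

```python
# Illustrative sketch of binning + voting with LASSO for nonparametric
# variable selection. All tuning choices here are assumptions, not the
# paper's actual BV-LASSO specification.
import numpy as np
from sklearn.linear_model import Lasso

def bv_lasso_select(X, r, n_bins=4, alpha=0.001, vote_frac=0.5):
    """Return indices of covariates whose LASSO coefficient is nonzero
    in at least a vote_frac fraction of local bins."""
    n, d = X.shape
    # Bin observations along the first covariate as a stand-in for a
    # partition of the covariate space into local regions.
    edges = np.quantile(X[:, 0], np.linspace(0.0, 1.0, n_bins + 1))
    votes = np.zeros(d)
    n_used = 0
    for b in range(n_bins):
        mask = (X[:, 0] >= edges[b]) & (X[:, 0] <= edges[b + 1])
        if mask.sum() < 2 * d:  # skip under-populated bins
            continue
        # Within a narrow bin, a smooth f is roughly linear, so LASSO
        # can pick out the variables that locally drive the reward.
        fit = Lasso(alpha=alpha, max_iter=10000).fit(X[mask], r[mask])
        votes += (np.abs(fit.coef_) > 1e-8)
        n_used += 1
    return np.flatnonzero(votes >= vote_frac * n_used)

# Synthetic example: reward depends only on covariates 0 and 2 of six,
# nonlinearly in covariate 0, so the effective dimension is 2.
rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 6))
r = np.sin(3 * X[:, 0]) + 2 * X[:, 2] + 0.01 * rng.normal(size=2000)
selected = bv_lasso_select(X, r)
print(selected)
```

The voting step is what makes the selection robust to any single bin where a relevant variable happens to have a locally flat effect: a variable only needs a nonzero local slope in a majority of bins to survive.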
