On the Self-Penalization Phenomenon in Feature Selection

12 October 2021 · arXiv:2110.05852
Michael I. Jordan, Keli Liu, Feng Ruan
Abstract

We describe an implicit sparsity-inducing mechanism based on minimization over a family of kernels:
\begin{equation*}
\min_{\beta, f}~\widehat{\mathbb{E}}[L(Y, f(\beta^{1/q} \odot X))] + \lambda_n \|f\|_{\mathcal{H}_q}^2~~\text{subject to}~~\beta \ge 0,
\end{equation*}
where $L$ is the loss, $\odot$ denotes coordinate-wise multiplication, and $\mathcal{H}_q$ is the reproducing kernel Hilbert space based on the kernel $k_q(x, x') = h(\|x - x'\|_q^q)$, with $\|\cdot\|_q$ the $\ell_q$ norm. Using gradient descent to optimize this objective with respect to $\beta$ leads to exactly sparse stationary points with high probability. The sparsity is achieved without any of the well-known explicit sparsification techniques such as penalization (e.g., $\ell_1$), early stopping, or post-processing (e.g., clipping). As an application, we use this sparsity-inducing mechanism to build algorithms that are consistent for feature selection.
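To make the mechanism concrete, here is a minimal sketch, not the paper's implementation, under several assumptions of our own: squared loss, $q = 1$, $h(t) = e^{-t}$ (a Laplacian-type kernel), and projected gradient descent to maintain $\beta \ge 0$. With squared loss, the inner minimization over $f$ has a closed form via the representer theorem, so the profiled objective is $\lambda\, y^\top (K_\beta + n\lambda I)^{-1} y$, which we differentiate with autodiff. All function names and hyperparameters are illustrative.

```python
# Sketch of the self-penalization mechanism; assumptions: squared loss,
# q = 1, h(t) = exp(-t), projected gradient descent. Hyperparameters are
# illustrative, not taken from the paper.
import jax
import jax.numpy as jnp

def kernel_matrix(beta, X, q=1.0):
    # Scaling features by beta^{1/q} inside the l_q norm gives
    # ||beta^{1/q} o (x - x')||_q^q = sum_k beta_k |x_k - x'_k|^q,
    # so K[i, j] = exp(-sum_k beta_k |X[i,k] - X[j,k]|^q).
    diffs = jnp.abs(X[:, None, :] - X[None, :, :]) ** q   # (n, n, d)
    return jnp.exp(-jnp.einsum('ijk,k->ij', diffs, beta))

def objective(beta, X, y, lam):
    # Profiled objective after eliminating f in closed form:
    # lam * y^T (K_beta + n*lam*I)^{-1} y.
    n = X.shape[0]
    K = kernel_matrix(beta, X)
    return lam * (y @ jnp.linalg.solve(K + n * lam * jnp.eye(n), y))

def fit_beta(X, y, lam=0.1, lr=0.1, steps=1000):
    # Projected gradient descent on beta >= 0. Note there is no l_1 penalty
    # and no clipping of small coordinates: any exact zeros arise from the
    # geometry of the objective itself (plus the nonnegativity projection).
    beta = jnp.ones(X.shape[1])
    grad = jax.jit(jax.grad(objective))
    for _ in range(steps):
        beta = jnp.maximum(beta - lr * grad(beta, X, y, lam), 0.0)
    return beta

kx, ky = jax.random.split(jax.random.PRNGKey(0))
X = jax.random.normal(kx, (60, 5))
y = jnp.sin(2.0 * X[:, 0]) + 0.1 * jax.random.normal(ky, (60,))  # only feature 0 matters
print(fit_beta(X, y))  # weights on irrelevant coordinates are driven to zero
```

On this toy problem the coordinates of $\beta$ for the four irrelevant features are driven toward exactly zero while the weight on the relevant feature stays positive, illustrating the implicit sparsification the abstract describes; reading off the support of $\beta$ then serves as the feature-selection step.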
