arXiv:2002.09434

Few-Shot Learning via Learning the Representation, Provably

21 February 2020
S. Du, Wei Hu, Sham Kakade, Jason D. Lee, Qi Lei

Abstract

This paper studies few-shot learning via representation learning, where one uses $T$ source tasks with $n_1$ data per task to learn a representation in order to reduce the sample complexity of a target task for which there are only $n_2 (\ll n_1)$ data. Specifically, we focus on the setting where there exists a good \emph{common representation} between source and target, and our goal is to understand how much of a sample size reduction is possible. First, we study the setting where this common representation is low-dimensional and provide a fast rate of $O\left(\frac{\mathcal{C}(\Phi)}{n_1 T} + \frac{k}{n_2}\right)$; here, $\Phi$ is the representation function class, $\mathcal{C}(\Phi)$ is its complexity measure, and $k$ is the dimension of the representation. When specialized to linear representation functions, this rate becomes $O\left(\frac{dk}{n_1 T} + \frac{k}{n_2}\right)$, where $d (\gg k)$ is the ambient input dimension; this is a substantial improvement over the $O\left(\frac{d}{n_2}\right)$ rate obtained without representation learning. This result bypasses the $\Omega\left(\frac{1}{T}\right)$ barrier under the i.i.d. task assumption, and captures the desired property that all $n_1 T$ samples from the source tasks can be \emph{pooled} together for representation learning. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained (say, in norm); here, we again demonstrate the advantage of representation learning in both high-dimensional linear regression and neural network learning. Our results demonstrate that representation learning can fully utilize all $n_1 T$ samples from the source tasks.
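To make the pooled two-stage idea concrete, here is a minimal sketch on synthetic data, assuming Gaussian inputs, a shared orthonormal linear representation, and a simple moment-plus-SVD estimator for stage one; the estimator and all names (`sample_task`, `B_hat`, `theta_rep`) are illustrative stand-ins rather than the paper's exact algorithm.

```python
# Minimal sketch (assumptions: Gaussian x ~ N(0, I_d), shared orthonormal
# linear representation B, moment-plus-SVD stage-one estimator; not the
# paper's exact algorithm).
import numpy as np

rng = np.random.default_rng(0)
d, k, T, n1, n2, noise = 100, 5, 50, 40, 10, 0.1

# Ground truth: shared representation B (d x k); each task's regressor
# is B @ w_t for a task-specific head w_t in R^k.
B, _ = np.linalg.qr(rng.normal(size=(d, k)))

def sample_task(w, n):
    """Draw n samples from a linear task whose regressor is B @ w."""
    X = rng.normal(size=(n, d))
    y = X @ (B @ w) + noise * rng.normal(size=n)
    return X, y

# Stage 1: pool all n1*T source samples. With x ~ N(0, I_d), E[x y] = B w_t,
# so each per-task moment vector X.T @ y / n1 lies approximately in col(B);
# the top-k left singular vectors of their stack estimate that subspace.
moments = []
for _ in range(T):
    X, y = sample_task(rng.normal(size=k), n1)
    moments.append(X.T @ y / n1)
M = np.column_stack(moments)                  # d x T moment matrix
U, _, _ = np.linalg.svd(M, full_matrices=False)
B_hat = U[:, :k]                              # estimated representation

# Stage 2: target task with only n2 << d samples. Fitting a k-dim head on
# the learned features is well-posed even though d-dim regression is not.
w_star = rng.normal(size=k)
X_t, y_t = sample_task(w_star, n2)
head, *_ = np.linalg.lstsq(X_t @ B_hat, y_t, rcond=None)
theta_rep = B_hat @ head

# Baseline: minimum-norm least squares directly in d dimensions.
theta_naive, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)

theta_star = B @ w_star
print("error with learned representation:", np.linalg.norm(theta_rep - theta_star))
print("error without representation     :", np.linalg.norm(theta_naive - theta_star))
```

With these (assumed) sizes, the abstract's linear rate reads, up to constants, $\frac{dk}{n_1 T} + \frac{k}{n_2} = \frac{500}{2000} + \frac{5}{10} = 0.75$, versus $\frac{d}{n_2} = \frac{100}{10} = 10$ without representation learning; the two printed errors should reflect that gap qualitatively.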
