
Deep Transfer Learning: Model Framework and Error Analysis

12 October 2024
Yuling Jiao
Huazhen Lin
Yuchen Luo
Jerry Zhijian Yang
arXiv:2410.09383
Abstract

This paper presents a framework for deep transfer learning that aims to leverage information from multi-domain upstream data with a large number of samples $n$ to a single-domain downstream task with a considerably smaller number of samples $m$, where $m \ll n$, in order to enhance performance on the downstream task. Our framework has several intriguing features. First, it allows both shared and domain-specific features to exist among the multi-domain data and provides a mechanism for identifying them automatically, achieving precise transfer and utilization of information. Second, the model framework explicitly indicates which upstream features contribute to the downstream task, establishing a relationship between upstream domains and downstream tasks and thereby enhancing interpretability. Error analysis demonstrates that transfer under our framework can significantly improve the convergence rate for learning Lipschitz functions in downstream supervised tasks, reducing it from $\tilde{O}(m^{-\frac{1}{2(d+2)}} + n^{-\frac{1}{2(d+2)}})$ ("no transfer") to $\tilde{O}(m^{-\frac{1}{2(d^*+3)}} + n^{-\frac{1}{2(d+2)}})$ ("partial transfer"), and even to $\tilde{O}(m^{-1/2} + n^{-\frac{1}{2(d+2)}})$ ("complete transfer"), where $d^* \ll d$ and $d$ is the dimension of the observed data. Our theoretical findings are substantiated by empirical experiments on image classification datasets and a regression dataset.
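
The following is a minimal, hypothetical sketch of the setup the abstract describes, not the authors' code: a shared low-dimensional representation is learned from the large multi-domain upstream sample ($n$), with a separate head per upstream domain so that shared and domain-specific structure can coexist, and only a small head is then fit on the $m \ll n$ downstream samples on top of the transferred features. All dimensions, layer sizes, and names below are assumptions for illustration.

```python
import torch
import torch.nn as nn

d = 64          # dimension of the observed data (hypothetical value)
d_star = 4      # dimension of the transferred feature space, d* << d (hypothetical)
num_domains = 3 # number of upstream domains (hypothetical)

# Shared encoder: low-dimensional features estimated from the large upstream
# sample and reused by the downstream task.
shared_encoder = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, d_star))

# Domain-specific upstream heads, one per upstream domain.
upstream_heads = nn.ModuleList([nn.Linear(d_star, 1) for _ in range(num_domains)])

def upstream_loss(xs, ys):
    """Sum of per-domain regression losses on the multi-domain upstream data."""
    total = torch.tensor(0.0)
    for head, x, y in zip(upstream_heads, xs, ys):
        total = total + nn.functional.mse_loss(head(shared_encoder(x)), y)
    return total

# Downstream: only a small head is fit on the m << n downstream samples,
# on top of the transferred (here: frozen) upstream features.
downstream_head = nn.Linear(d_star, 1)

def downstream_loss(x_small, y_small):
    with torch.no_grad():  # "complete transfer": upstream features reused as-is
        feats = shared_encoder(x_small)
    return nn.functional.mse_loss(downstream_head(feats), y_small)
```

On this reading, the downstream estimation problem depends on the low dimension $d^*$ (or, when the features are reused unchanged, on no nonparametric dimension at all), which is the intuition behind the $\tilde{O}(m^{-\frac{1}{2(d^*+3)}})$ and $\tilde{O}(m^{-1/2})$ downstream terms quoted above.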
