Domain Pre-training Impact on Representations

30 May 2025
César González-Gutiérrez
A. Quattoni
Main: 4 pages; Bibliography: 5 pages; Appendix: 5 pages; 8 figures; 7 tables
Abstract

This empirical study analyzes the effects of the pre-training corpus on the quality of learned transformer representations. We focus on the representation quality induced solely through pre-training. Our experiments show that pre-training on a small, specialized corpus can yield effective representations, and that the success of combining a generic and a specialized corpus depends on the distributional similarity between the target task and the specialized corpus.

@article{gonzalez-gutierrez2025_2505.24455,
  title={Domain Pre-training Impact on Representations},
  author={Cesar Gonzalez-Gutierrez and Ariadna Quattoni},
  journal={arXiv preprint arXiv:2505.24455},
  year={2025}
}