ZeroLM: Data-Free Transformer Architecture Search for Language Models

24 March 2025
Zhen-Song Chen, Hong-Wei Ding, Xian-Jia Wang, Witold Pedrycz
Abstract

Neural architecture search (NAS) provides a systematic framework for automating the design of neural network architectures, yet its widespread adoption is hindered by prohibitive computational requirements. Existing zero-cost proxy methods, while reducing search overhead, demonstrate inadequate performance in architecture ranking tasks, particularly for Transformer-based models where they often underperform simple parameter counting metrics. Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity. This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics computation while decomposing Transformer architectures into functionally distinct sub-modules, thereby optimizing the balance of their contributions to overall performance. Our comprehensive evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and Kendall's tau of 0.53 on the FlexiBERT benchmark. The proposed method exhibits exceptional computational efficiency while maintaining robust performance across diverse NAS benchmark tasks, offering a practical solution for large-scale architecture search.
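The abstract describes scoring candidate Transformers with training-free weight statistics computed per sub-module, then validating the proxy by its rank correlation (Spearman's rho, Kendall's tau) with trained accuracy. Below is a minimal illustrative sketch of that idea. The function names (`zero_cost_proxy`, `submodule_statistic`), the specific statistic (mean absolute weight at initialization), the attention/FFN grouping, and the 0.5/0.5 balancing coefficients are assumptions for illustration only; the paper's actual formulation may differ.

```python
# Sketch of a data-free, weight-statistics zero-cost proxy for ranking
# Transformer candidates, plus a rank-correlation check against ground truth.
import torch.nn as nn
from scipy.stats import spearmanr, kendalltau


def submodule_statistic(module: nn.Module) -> float:
    """Mean absolute weight value over all parameters of a sub-module (assumed statistic)."""
    total, count = 0.0, 0
    for p in module.parameters():
        total += p.detach().abs().sum().item()
        count += p.numel()
    return total / max(count, 1)


def zero_cost_proxy(model: nn.Module, attn_weight: float = 0.5, ffn_weight: float = 0.5) -> float:
    """Score a randomly initialized Transformer without any data or forward pass,
    combining per-sub-module statistics with assumed balancing coefficients."""
    attn_scores, ffn_scores = [], []
    for name, module in model.named_modules():
        if isinstance(module, nn.MultiheadAttention):
            attn_scores.append(submodule_statistic(module))
        elif isinstance(module, nn.Linear) and "linear" in name:
            # linear1/linear2 are the feed-forward layers inside nn.TransformerEncoderLayer
            ffn_scores.append(submodule_statistic(module))
    attn_term = sum(attn_scores) / max(len(attn_scores), 1)
    ffn_term = sum(ffn_scores) / max(len(ffn_scores), 1)
    return attn_weight * attn_term + ffn_weight * ffn_term


if __name__ == "__main__":
    # Three toy encoder candidates of increasing capacity.
    candidates = [
        nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=h, dim_feedforward=f, batch_first=True),
            num_layers=layers,
        )
        for d, h, f, layers in [(128, 2, 256, 2), (256, 4, 512, 4), (512, 8, 1024, 6)]
    ]
    proxy_scores = [zero_cost_proxy(m) for m in candidates]
    true_accuracy = [0.71, 0.78, 0.83]  # placeholder values, not from the paper
    print("Spearman rho:", spearmanr(proxy_scores, true_accuracy)[0])
    print("Kendall tau:", kendalltau(proxy_scores, true_accuracy)[0])
```

On a real benchmark such as FlexiBERT, the same correlation check would be run over the full set of candidate architectures against their reported accuracies, which is how the paper's Spearman's rho of 0.76 and Kendall's tau of 0.53 are measured.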

@article{chen2025_2503.18646,
  title={ZeroLM: Data-Free Transformer Architecture Search for Language Models},
  author={Zhen-Song Chen and Hong-Wei Ding and Xian-Jia Wang and Witold Pedrycz},
  journal={arXiv preprint arXiv:2503.18646},
  year={2025}
}