Numerical Error Analysis of Large Language Models

13 March 2025
Stanislav Budzinskiy, Wenyi Fang, Longbin Zeng, Philipp Petersen
Abstract

Large language models based on transformer architectures have become integral to state-of-the-art natural language processing applications. However, their training remains computationally expensive and exhibits instabilities, some of which are expected to be caused by finite-precision computations. We provide a theoretical analysis of the impact of round-off errors within the forward pass of a transformer architecture, which yields fundamental bounds for these effects. In addition, we conduct a series of numerical experiments that demonstrate the practical relevance of our bounds. Our results yield concrete guidelines for choosing hyperparameters that mitigate round-off errors, leading to more robust and stable inference.
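
As an illustration of the kind of finite-precision effect studied in the paper, the sketch below compares a single-head scaled dot-product attention forward pass computed in float16 against a float64 reference and reports the relative round-off error. This is not the paper's analysis or code: the dimensions, weight scaling, and error metric are illustrative assumptions.

import numpy as np

def attention(x, wq, wk, wv, dtype):
    # Single-head scaled dot-product attention, evaluated entirely in `dtype`.
    x, wq, wk, wv = (a.astype(dtype) for a in (x, wq, wk, wv))
    q, k, v = x @ wq, x @ wk, x @ wv
    d = np.array(q.shape[-1], dtype=dtype)
    scores = (q @ k.T) / np.sqrt(d)
    # Subtract the row-wise maximum before exponentiating (the standard
    # numerically stable softmax) to limit overflow in low precision.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d_model = 128, 64          # illustrative sizes, not from the paper
x = rng.standard_normal((seq_len, d_model))
wq, wk, wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
              for _ in range(3))

ref = attention(x, wq, wk, wv, np.float64)                     # high-precision reference
low = attention(x, wq, wk, wv, np.float16).astype(np.float64)  # low-precision run

rel_err = np.linalg.norm(low - ref) / np.linalg.norm(ref)
print(f"relative round-off error (float16 vs float64): {rel_err:.2e}")

Sweeping parameters such as the sequence length or the scale of the weights in this kind of experiment is one simple way to probe how round-off error grows with hyperparameter choices, which is the question the paper's bounds address theoretically.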

@article{budzinskiy2025_2503.10251,
  title={Numerical Error Analysis of Large Language Models},
  author={Stanislav Budzinskiy and Wenyi Fang and Longbin Zeng and Philipp Petersen},
  journal={arXiv preprint arXiv:2503.10251},
  year={2025}
}