Scaling Intelligence: Designing Data Centers for Next-Gen Language Models

17 June 2025
Jesmin Jahan Tithi
Hanjiang Wu
Avishaii Abuhatzera
Fabrizio Petrini
Communities: MoE, ALM
Main: 12 pages · 9 figures · 10 tables · Bibliography: 1 page · Appendix: 1 page
Abstract

The explosive growth of Large Language Models (LLMs), such as GPT-4 with 1.8 trillion parameters, demands a radical rethinking of data center architecture to ensure scalability, efficiency, and cost-effectiveness. Our work provides a comprehensive co-design framework that jointly explores FLOPS, HBM bandwidth and capacity, multiple network topologies (two-tier vs. FullFlat optical), the size of the scale-out domain, and popular parallelism/optimization strategies used in LLMs. We introduce and evaluate FullFlat network architectures, which provide uniform high-bandwidth, low-latency connectivity between all nodes, and demonstrate their transformative impact on performance and scalability. Through detailed sensitivity analyses, we quantify the benefits of overlapping compute and communication, leveraging hardware-accelerated collectives, wider scale-out domains, and larger memory capacity. Our study spans both sparse (mixture-of-experts) and dense transformer-based LLMs, revealing how system design choices affect Model FLOPS Utilization (MFU = model FLOPs per token × observed tokens per second / peak FLOPs of the hardware) and overall throughput. For the co-design study, we extended and validated a performance modeling tool capable of predicting LLM runtime within 10% of real-world measurements. Our findings offer actionable insights and a practical roadmap for designing AI data centers that can efficiently support trillion-parameter models, reduce optimization complexity, and sustain the rapid evolution of AI capabilities.
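A minimal sketch of the MFU definition quoted in the abstract, written in Python. The numbers below (active parameter count, throughput, peak compute) and the common "~2 FLOPs per active parameter per token" approximation are illustrative assumptions, not values from the paper.

# MFU = (model FLOPs per token * observed tokens per second) / peak FLOPs of the hardware

def model_flops_utilization(flops_per_token: float,
                            observed_tokens_per_sec: float,
                            peak_flops_per_sec: float) -> float:
    """Fraction of the hardware's peak FLOP rate actually used by the model."""
    achieved_flops_per_sec = flops_per_token * observed_tokens_per_sec
    return achieved_flops_per_sec / peak_flops_per_sec

if __name__ == "__main__":
    # Hypothetical example: a sparse (MoE) model with ~200B active parameters per token,
    # using the ~2 FLOPs per active parameter per token approximation (assumption).
    flops_per_token = 2 * 200e9          # ~4e11 FLOPs per token
    observed_tokens_per_sec = 1.0e6      # aggregate cluster throughput (assumption)
    peak_flops_per_sec = 1.0e18          # 1 EFLOP/s aggregate peak compute (assumption)
    mfu = model_flops_utilization(flops_per_token, observed_tokens_per_sec, peak_flops_per_sec)
    print(f"MFU = {mfu:.2%}")            # -> 40.00% under these assumed numbers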

@article{tithi2025_2506.15006,
  title={Scaling Intelligence: Designing Data Centers for Next-Gen Language Models},
  author={Jesmin Jahan Tithi and Hanjiang Wu and Avishaii Abuhatzera and Fabrizio Petrini},
  journal={arXiv preprint arXiv:2506.15006},
  year={2025}
}