ResearchTrend.AI
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

30 January 2025
Arthur Douillard
Yanislav Donchev
Keith Rush
Satyen Kale
Zachary Charles
Zachary Garrett
Gabriel Teston
Dave Lacey
Ross McIlroy
Jiajun Shen
Alexandre Ramé
Arthur Szlam
Marc'Aurelio Ranzato
P. Barham

Papers citing "Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch"

4 / 4 papers shown
Weight Factorization and Centralization for Continual Learning in Speech Recognition
Enes Yavuz Ugan
Ngoc-Quan Pham
Alexander Waibel
CLL, MoMe
27
0
0
19 Jun 2025
TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network
Guangxin He
Yuan Cao
Yutong He
Tianyi Bai
Kun Yuan
Binhang Yuan
MQ
61
0
0
02 Jun 2025
MuLoCo: Muon is a practical inner optimizer for DiLoCo
Benjamin Thérien
Xiaolong Huang
Irina Rish
Eugene Belilovsky
MoE
57
0
0
29 May 2025
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary B. Charles
Gabriel Teston
Lucio Dery
Keith Rush
Nova Fallen
Zachary Garrett
Arthur Szlam
Arthur Douillard
461
6
0
12 Mar 2025