2306.10598
Cited By
DropCompute: simple and more robust distributed synchronous training via compute variance reduction
18 June 2023
Niv Giladi
Shahar Gottlieb
Moran Shkolnik
A. Karnieli
Ron Banner
Elad Hoffer
Kfir Y. Levy
Daniel Soudry
Papers citing
"DropCompute: simple and more robust distributed synchronous training via compute variance reduction"
2 / 2 papers shown
Title
Accelerating AllReduce with a Persistent Straggler
Arjun Devraj
Eric Ding
Abhishek Vijaya Kumar
Robert Kleinberg
Rachee Singh
56
0
0
29 May 2025
Understanding Stragglers in Large Model Training Using What-if Analysis
Jinkun Lin
Ziheng Jiang
Zuquan Song
Sida Zhao
Menghan Yu
...
Shuguang Wang
Yanghua Peng
Xin Liu
Aurojit Panda
Jinyang Li
142
1
0
09 May 2025