Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1811.02084
Cited By
Mesh-TensorFlow: Deep Learning for Supercomputers
5 November 2018
Noam M. Shazeer
Youlong Cheng
Niki Parmar
Dustin Tran
Ashish Vaswani
Gregory J. Puleo
Peter Hawkins
HyoukJoong Lee
O. Milenkovic
C. Young
Ryan Sepassi
Blake Hechtman
GNN
MoE
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Mesh-TensorFlow: Deep Learning for Supercomputers"
6 / 6 papers shown
Title
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Ana Klimovic
Viktor Gsteiger
Ana Klimovic
MoE
132
0
0
27 Feb 2025
FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference
Bingzhe Zhao
Ke Cheng
Aomufei Yuan
Yuxuan Tian
Ruiguang Zhong
Chengchen Hu
Tong Yang
Lian Yu
71
0
0
19 Feb 2025
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Jianing Yang
Alexander Sax
Kevin J. Liang
Mikael Henaff
Hao Tang
Ang Cao
J. Chai
Franziska Meier
Matt Feiszli
3DGS
124
20
0
23 Jan 2025
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
418
3
0
20 Nov 2024
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
Hao Lin
Ke Wu
Jie Li
Jun Yu Li
Wu-Jun Li
58
2
0
31 Jul 2023
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang
Yilu Wu
Sheng Guo
Limin Wang
VGen
DiffM
108
30
0
26 Mar 2023
1