Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.06511
Cited By
TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training
9 October 2024
Wanchao Liang
Tianyu Liu
Less Wright
Will Constable
Andrew Gu
Chien-chin Huang
Iris Zhang
Wei Feng
Howard Huang
Junjie Wang
Sanket Purandare
Gokul Nadathur
Stratos Idreos
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training"
5 / 5 papers shown
Title
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
Yiran Chen
Hao Peng
Tong Zhang
Heng Ji
VLM
28
0
0
13 May 2025
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax
Zayd Muhammad Kawakibi Zuhri
Erland Hilman Fuadi
Alham Fikri Aji
33
0
0
29 Apr 2025
MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications
Aashaka Shah
Abhinav Jangda
Yangqiu Song
Caio Rocha
Changho Hwang
...
Peng Cheng
Qinghua Zhou
Roshan Dathathri
Saeed Maleki
Ziyue Yang
GNN
54
0
0
11 Apr 2025
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Ana Klimovic
Viktor Gsteiger
Ana Klimovic
MoE
101
0
0
27 Feb 2025
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
156
0
0
30 Dec 2024
1