InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding
arXiv: 2401.09149 · 17 January 2024
Qiaoling Chen, Diandian Gu, Guoteng Wang, Xun Chen, Yingtong Xiong, Ting Huang, Qi Hu, Xin Jin, Yonggang Wen, Tianwei Zhang, Peng Sun
Papers citing "InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding" (5 of 5 shown):
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models (14 Apr 2025)
Jinguo Zhu, Weiyun Wang, Zhe Chen, Z. Liu, Shenglong Ye, ..., Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang, Wei Wang
Tags: MLLM, VLM
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training (16 Jun 2023)
Guanhua Wang, Heyang Qin, S. A. Jacobs, Connor Holmes, Samyam Rajbhandari, Olatunji Ruwase, Feng Yan, Lei Yang, Yuxiong He
Tags: VLM
Varuna: Scalable, Low-cost Training of Massive Deep Learning Models (07 Nov 2021)
Sanjith Athlur, Nitika Saran, Muthian Sivathanu, Ramachandran Ramjee, Nipun Kwatra
Tags: GNN
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks (14 Oct 2021)
Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang
Tags: VLM
Survey: Transformer based Video-Language Pre-training (21 Sep 2021)
Ludan Ruan, Qin Jin
Tags: VLM, ViT