2002.05645
Training Large Neural Networks with Constant Memory using a New Execution Algorithm
13 February 2020
B. Pudipeddi
Maral Mesmakhosroshahi
Jinwen Xi
S. Bharadwaj
Papers citing "Training Large Neural Networks with Constant Memory using a New Execution Algorithm"
GPU Memory Usage Optimization for Backward Propagation in Deep Network Training
Ding-Yong Hong
Tzu-Hsien Tsai
Ning Wang
Pangfeng Liu
Jan-Jan Wu
105
0
0
18 Feb 2025
Merging Feed-Forward Sublayers for Compressed Transformers
Neha Verma
Kenton W. Murray
Kevin Duh
AI4CE
116
0
0
10 Jan 2025
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
257
999
0
01 Apr 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,196
0
20 Apr 2018
Revisiting Distributed Synchronous SGD
Jianmin Chen
Xinghao Pan
R. Monga
Samy Bengio
Rafal Jozefowicz
87
801
0
04 Apr 2016
One weird trick for parallelizing convolutional neural networks
A. Krizhevsky
GNN
93
1,303
0
23 Apr 2014