1904.10631
Cited By
Low-Memory Neural Network Training: A Technical Report
24 April 2019
N. Sohoni
Christopher R. Aberger
Megan Leszczynski
Jian Zhang
Christopher Ré
Papers citing
"Low-Memory Neural Network Training: A Technical Report"
21 / 21 papers shown
Title
GPU Memory Usage Optimization for Backward Propagation in Deep Network Training
Ding-Yong Hong
Tzu-Hsien Tsai
Ning Wang
Pangfeng Liu
Jan-Jan Wu
44
0
0
18 Feb 2025
Breaking the Memory Wall for Heterogeneous Federated Learning via Model Splitting
Chunlin Tian
Li Li
Kahou Tam
Yebo Wu
Chengzhong Xu
FedML
29
1
0
12 Oct 2024
AdaShadow: Responsive Test-time Model Adaptation in Non-stationary Mobile Environments
Cheng Fang
Sicong Liu
Zimu Zhou
Bin Guo
Jiaqi Tang
Ke Ma
Zhiwen Yu
TTA
39
1
0
10 Oct 2024
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Byung-Kwan Lee
Chae Won Kim
Beomchan Park
Yonghyun Ro
MLLM
LRM
41
18
0
24 May 2024
Breaking On-device Training Memory Wall: A Systematic Survey
Shitian Li
Chunlin Tian
Kahou Tam
Ruirui Ma
Li Li
23
2
0
17 Jun 2023
Systems for Parallel and Distributed Large-Model Deep Learning Training
Kabir Nagrecha
GNN
VLM
MoE
26
7
0
06 Jan 2023
Compressed Gastric Image Generation Based on Soft-Label Dataset Distillation for Medical Data Sharing
Guang Li
Ren Togo
Takahiro Ogawa
Miki Haseyama
DD
32
40
0
29 Sep 2022
On-device Synaptic Memory Consolidation using Fowler-Nordheim Quantum-tunneling
Mustafizur Rahman
Subhankar Bose
S. Chakrabartty
24
3
0
27 Jun 2022
FuncPipe: A Pipelined Serverless Framework for Fast and Cost-efficient Training of Deep Learning Models
Yunzhuo Liu
Bo Jiang
Tian Guo
Zimeng Huang
Wen-ping Ma
Xinbing Wang
Chenghu Zhou
24
9
0
28 Apr 2022
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
Joya Chen
Kai Xu
Yuhui Wang
Yifei Cheng
Angela Yao
19
7
0
28 Feb 2022
Enabling On-Device Smartphone GPU based Training: Lessons Learned
Anish Das
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
3DH
30
10
0
21 Feb 2022
BitTrain: Sparse Bitmap Compression for Memory-Efficient Training on the Edge
Abdelrahman I. Hosny
Marina Neseem
Sherief Reda
MQ
35
4
0
29 Oct 2021
Hydra: A System for Large Multi-Model Deep Learning
Kabir Nagrecha
Arun Kumar
MoE
AI4CE
38
5
0
16 Oct 2021
Feature Alignment as a Generative Process
T. S. Farias
Jonas Maziero
DiffM
BDL
21
1
0
23 Jun 2021
How Low Can We Go: Trading Memory for Error in Low-Precision Training
Chengrun Yang
Ziyang Wu
Jerry Chee
Christopher De Sa
Madeleine Udell
18
2
0
17 Jun 2021
Improving Formality Style Transfer with Context-Aware Rule Injection
Zonghai Yao
Hong-ye Yu
20
16
0
01 Jun 2021
Enabling Binary Neural Network Training on the Edge
Erwei Wang
James J. Davis
Daniele Moro
Piotr Zielinski
Jia Jie Lim
C. Coelho
S. Chatterjee
P. Cheung
George A. Constantinides
MQ
20
24
0
08 Feb 2021
Zero-shot Entity Linking with Efficient Long Range Sequence Modeling
Zonghai Yao
Liangliang Cao
Huapu Pan
VLM
15
21
0
12 Oct 2020
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
27
93
0
17 Jun 2020
Blockwise Self-Attention for Long Document Understanding
J. Qiu
Hao Ma
Omer Levy
Scott Yih
Sinong Wang
Jie Tang
11
251
0
07 Nov 2019
On improving deep learning generalization with adaptive sparse connectivity
Shiwei Liu
Decebal Constantin Mocanu
Mykola Pechenizkiy
ODL
20
7
0
27 Jun 2019