Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks
3 May 2017
Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, S. Keckler

Papers citing "Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks"

24 papers shown

Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Minsu Kim, Seongmin Hong, RyeoWook Ko, S. Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park
24 Mar 2025

Vision Transformers for Mobile Applications: A Short Survey
Nahid Alam, Steven Kolawole, S. Sethi, Nishant Bansali, Karina Nguyen
Tags: ViT
30 May 2023

SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving
Minjae Lee, Seongmin Park, Hyung-Se Kim, Minyong Yoon, Jangwhan Lee, Junwon Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi
Tags: 3DPC
12 May 2023

Improved Projection Learning for Lower Dimensional Feature Maps
Ilan Price, Jared Tanner
27 Oct 2022

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang
Tags: AI4CE
02 Jun 2022

SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures
Yunjae Lee, Jin-Won Chung, Minsoo Rhu
Tags: GNN
10 May 2022

Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards
Youngeun Kwon, Minsoo Rhu
10 May 2022

DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation
Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li
30 Mar 2022

GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks
Ranggi Hwang, M. Kang, Jiwon Lee, D. Kam, Youngjoo Lee, Minsoo Rhu
Tags: GNN
01 Mar 2022

COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression
Sian Jin, Chengming Zhang, Xintong Jiang, Yunhe Feng, Hui Guan, Guanpeng Li, Shuaiwen Leon Song, Dingwen Tao
18 Nov 2021

DynO: Dynamic Onloading of Deep Neural Networks from Cloud to Device
Mario Almeida, Stefanos Laskaridis, Stylianos I. Venieris, Ilias Leontiadis, Nicholas D. Lane
20 Apr 2021

Extending Sparse Tensor Accelerators to Support Multiple Compression Formats
Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon, Sudarshan Srinivasan, Dipankar Das, G. Moon, S. Rajamanickam, T. Krishna
18 Mar 2021

A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression
Sian Jin, Guanpeng Li, Shuaiwen Leon Song, Dingwen Tao
Tags: AI4CE
18 Nov 2020

LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference
Yujeong Choi, Yunseong Kim, Minsoo Rhu
25 Oct 2020

Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training
Youngeun Kwon, Yunjae Lee, Minsoo Rhu
25 Oct 2020

FPRaker: A Processing Element For Accelerating Neural Network Training
Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo Vivancos, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos
15 Oct 2020

SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
Stefanos Laskaridis, Stylianos I. Venieris, Mario Almeida, Ilias Leontiadis, Nicholas D. Lane
14 Aug 2020

PERMDNN: Efficient Compressed DNN Architecture with Permuted Diagonal Matrices
Chunhua Deng, Siyu Liao, Yi Xie, Keshab K. Parhi, Xuehai Qian, Bo Yuan
23 Apr 2020

Sparse Weight Activation Training
Md Aamir Raihan, Tor M. Aamodt
07 Jan 2020

DASNet: Dynamic Activation Sparsity for Neural Network Efficiency Improvement
Qing Yang, Jiachen Mao, Zuoguan Wang, H. Li
13 Sep 2019

Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints
Xing Hu, Ling Liang, Lei Deng, Shuangchen Li, Xinfeng Xie, Yu Ji, Yufei Ding, Chang Liu, T. Sherwood, Yuan Xie
Tags: AAML, MLAU
10 Mar 2019

Analyzing Machine Learning Workloads Using a Detailed GPU Simulator
Jonathan Lew, Deval Shah, Suchita Pati, Shaylin Cattell, Mengchi Zhang, ..., Christopher Ng, Negar Goli, Matthew D. Sinclair, Timothy G. Rogers, Tor M. Aamodt
18 Nov 2018

Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng, Abhishek Tiwari, Nandita Vijaykumar, Gennady Pekhimenko
22 May 2018

Deeply learned face representations are sparse, selective, and robust
Yi Sun, Xiaogang Wang, Xiaoou Tang
Tags: CVBM
03 Dec 2014