S4: a High-sparsity, High-performance AI Accelerator
16 July 2022 · arXiv:2207.08006
Ian En-Hsu Yen, Zhibin Xiao, Dongkuan Xu
Papers citing "S4: a High-sparsity, High-performance AI Accelerator" (8 of 8 papers shown)
- Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm. Shaoyi Huang, Dongkuan Xu, Ian En-Hsu Yen, Yijue Wang, Sung-En Chang, ..., Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding. 15 Oct 2021.
- I-BERT: Integer-only BERT Quantization. Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer. 05 Jan 2021.
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou. 25 Feb 2020.
- TinyBERT: Distilling BERT for Natural Language Understanding. Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu. 23 Sep 2019.
- The State of Sparsity in Deep Neural Networks. Trevor Gale, Erich Elsen, Sara Hooker. 25 Feb 2019.
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. Jonathan Frankle, Michael Carbin. 09 Mar 2018.
- To prune, or not to prune: exploring the efficacy of pruning for model compression. Michael Zhu, Suyog Gupta. 05 Oct 2017.
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. Song Han, Huizi Mao, W. Dally. 01 Oct 2015.