Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2002.11985
Cited By
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
27 February 2020
Prakhar Ganesh
Yao Chen
Xin Lou
Mohammad Ali Khan
Yifan Yang
Hassan Sajjad
Preslav Nakov
Deming Chen
Marianne Winslett
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Compressing Large-Scale Transformer-Based Models: A Case Study on BERT"
34 / 34 papers shown
Title
Efficient Split Learning LSTM Models for FPGA-based Edge IoT Devices
Romina Soledad Molina
Vukan Ninkovic
D. Vukobratović
Maria Liz Crespo
Marco Zennaro
45
0
0
12 Feb 2025
MAS-Attention: Memory-Aware Stream Processing for Attention Acceleration on Resource-Constrained Edge Devices
Mohammadali Shakerdargah
Shan Lu
Chao Gao
Di Niu
75
0
0
20 Nov 2024
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models
Mohammed Al-Maamari
Mehdi Ben Amor
Michael Granitzer
KELM
MoE
38
0
0
28 Jul 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
Zhanyue Qin
Han Wang
Zhao Yang
Zecheng Wang
...
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
24
2
0
24 Jun 2024
CANAL -- Cyber Activity News Alerting Language Model: Empirical Approach vs. Expensive LLM
Urjitkumar Patel
Fang-Chun Yeh
Chinmay Gondhalekar
29
3
0
10 May 2024
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Yun-Wei Chu
Dong-Jun Han
Christopher G. Brinton
28
4
0
15 Jan 2024
LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization
Muhammad Farid Adilazuarda
Samuel Cahyawijaya
Alham Fikri Aji
Genta Indra Winata
Ayu Purwarianti
21
5
0
11 Jan 2024
Parameter Efficient Audio Captioning With Faithful Guidance Using Audio-text Shared Latent Representation
A. Sridhar
Yinyi Guo
Erik M. Visser
Rehana Mahfuz
34
5
0
06 Sep 2023
Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers
Zahra Atashgahi
Mykola Pechenizkiy
Raymond N. J. Veldhuis
Decebal Constantin Mocanu
AI4TS
AI4CE
29
1
0
28 May 2023
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Alon Jacovi
Avi Caciularu
Omer Goldman
Yoav Goldberg
17
95
0
17 May 2023
LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM
Wenhui Hua
Brian Williams
Davood Shamsi
28
3
0
10 May 2023
Gradient-Free Structured Pruning with Unlabeled Data
Azade Nova
H. Dai
Dale Schuurmans
SyDa
40
20
0
07 Mar 2023
idT5: Indonesian Version of Multilingual T5 Transformer
Mukhlish Fuadi
A. Wibawa
S. Sumpeno
19
6
0
02 Feb 2023
Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
21
12
0
04 Nov 2022
Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models
Harshita Diddee
Sandipan Dandapat
Monojit Choudhury
T. Ganu
Kalika Bali
31
5
0
27 Oct 2022
MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers
Mohammadmahdi Nouriborji
Omid Rohanian
Samaneh Kouchaki
David A. Clifton
32
8
0
12 Oct 2022
Survey: Exploiting Data Redundancy for Optimization of Deep Learning
Jou-An Chen
Wei Niu
Bin Ren
Yanzhi Wang
Xipeng Shen
23
24
0
29 Aug 2022
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
9
177
0
01 Apr 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
26
15
0
11 Feb 2022
An Ensemble of Pre-trained Transformer Models For Imbalanced Multiclass Malware Classification
Ferhat Demirkiran
Aykut Çayır
U. Ünal
Hasan Dag
38
42
0
25 Dec 2021
Benchmark Static API Call Datasets for Malware Family Classification
Berkant Düzgün
Aykut Çayır
Ferhat Demirkiran
Ceyda Nur Kahya
Buket Gençaydın
Hasan Dag
21
5
0
30 Nov 2021
Few-Shot Self-Rationalization with Natural Language Prompts
Ana Marasović
Iz Beltagy
Doug Downey
Matthew E. Peters
LRM
26
106
0
16 Nov 2021
Structured Pattern Pruning Using Regularization
Dongju Park
Geunghee Lee
23
0
0
18 Sep 2021
Layer-wise Model Pruning based on Mutual Information
Chun Fan
Jiwei Li
Xiang Ao
Fei Wu
Yuxian Meng
Xiaofei Sun
46
19
0
28 Aug 2021
Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm
Dongkuan Xu
Ian En-Hsu Yen
Jinxi Zhao
Zhibin Xiao
VLM
AAML
31
56
0
18 Apr 2021
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
13
385
0
14 Dec 2020
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
156
377
0
23 Jul 2020
GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference
Ali Hadi Zadeh
Isak Edo
Omar Mohamed Awad
Andreas Moshovos
MQ
30
183
0
08 May 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
243
1,452
0
18 Mar 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
221
197
0
07 Feb 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,826
0
17 Sep 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
1