Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid, Edison Marrese-Taylor, Y. Matsuo
1 January 2021 · arXiv:2101.00234 [MoE]
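The paper above explores sharing one set of weights across transformer layers instead of keeping independent copies at every depth. As a rough sketch of why that saves parameters (illustrative sizes only; this is not the Subformer architecture itself), one can compare the feed-forward parameter count of a stack with and without cross-layer tying:

```python
# Illustrative sketch of cross-layer weight sharing: N independent
# feed-forward sublayers vs. one shared sublayer reused N times.
# Dimensions are made up for the example, not taken from the paper.
d_model, d_ff, num_layers = 64, 256, 6

def ffn_param_count(d_in: int, d_hidden: int) -> int:
    # Two weight matrices plus their biases: d_in->d_hidden->d_in.
    return d_in * d_hidden + d_hidden + d_hidden * d_in + d_in

per_layer = ffn_param_count(d_model, d_ff)
untied = num_layers * per_layer   # separate weights at every depth
tied = per_layer                  # one weight set, reused at every depth

print(untied // tied)  # → 6: tying cuts FFN parameters by a factor of num_layers
```

Quality typically degrades somewhat when layers are tied naively, which is why papers in this list (e.g. Basis Sharing, Dynamic Layer Tying, Head-wise Shareable Attention) study which weights can be shared and how.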
Papers citing "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers" (30 of 30 papers shown)
- Merging Feed-Forward Sublayers for Compressed Transformers
  Neha Verma, Kenton W. Murray, Kevin Duh (10 Jan 2025) [AI4CE]
- KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing
  Yifei Yang, Zouying Cao, Qiguang Chen, L. Qin, Dongjie Yang, Hai Zhao, Zhi Chen (24 Oct 2024)
- QSpec: Speculative Decoding with Complementary Quantization Schemes
  Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong, Chuan Wu (15 Oct 2024) [MQ]
- Chain and Causal Attention for Efficient Entity Tracking
  Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen (07 Oct 2024)
- Computational design of target-specific linear peptide binders with TransformerBeta
  Haowen Zhao, Francesco A. Aprile, Barbara Bravi (07 Oct 2024)
- Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
  Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin, Bing Li, Grace Li Zhang (02 Oct 2024)
- TroL: Traversal of Layers for Large Language and Vision Models
  Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro (18 Jun 2024)
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision
  Minglei Li, Peng Ye, Yongqi Huang, Lin Zhang, Tao Chen, Tong He, Jiayuan Fan, Wanli Ouyang (05 Jun 2024) [MoE]
- Layer-Condensed KV Cache for Efficient Inference of Large Language Models
  Haoyi Wu, Kewei Tu (17 May 2024) [MQ]
- SPA: Towards A Computational Friendly Cloud-Base and On-Devices Collaboration Seq2seq Personalized Generation
  Yanming Liu, Xinyue Peng, Jiannan Cao, Le Dai, Xingzu Liu, Mingbang Wang, Weihao Liu (11 Mar 2024) [SyDa]
- PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA
  Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu (24 Feb 2024)
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Zechun Liu, Changsheng Zhao, Forrest N. Iandola, Chen Lai, Yuandong Tian, ..., Ernie Chang, Yangyang Shi, Raghuraman Krishnamoorthi, Liangzhen Lai, Vikas Chandra (22 Feb 2024) [ALM]
- Head-wise Shareable Attention for Large Language Models
  Zouying Cao, Yifei Yang, Hai Zhao (19 Feb 2024)
- DE³-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
  Jianing He, Qi Zhang, Weiping Ding, Duoqian Miao, Jun Zhao, Liang Hu, LongBing Cao (03 Feb 2024)
- A Comprehensive Survey of Compression Algorithms for Language Models
  Seungcheol Park, Jaehyeon Choi, Sojin Lee, U. Kang (27 Jan 2024) [MQ]
- Dynamic Layer Tying for Parameter-Efficient Transformers
  Tamir David Hay, Lior Wolf (23 Jan 2024)
- Fixed Point Diffusion Models
  Xingjian Bai, Luke Melas-Kyriazi (16 Jan 2024)
- DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
  Rahul Chand, Yashoteja Prabhu, Pratyush Kumar (20 Dec 2023)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
  Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang (01 Dec 2023)
- PartialFormer: Modeling Part Instead of Whole for Machine Translation
  Tong Zheng, Bei Li, Huiwen Bao, Jiale Wang, Weiqiao Shan, Tong Xiao, Jingbo Zhu (23 Oct 2023) [MoE, AI4CE]
- Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models
  Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao (20 Oct 2023) [MQ, VLM]
- SPEED: Speculative Pipelined Execution for Efficient Decoding
  Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Hasan Genç, Kurt Keutzer, A. Gholami, Y. Shao (18 Oct 2023)
- One Wide Feedforward is All You Need
  Telmo Pires, António V. Lopes, Yannick Assogba, Hendra Setiawan (04 Sep 2023)
- Weight-Inherited Distillation for Task-Agnostic BERT Compression
  Taiqiang Wu, Cheng-An Hou, Shanshan Lao, Jiayi Li, Ngai Wong, Zhe Zhao, Yujiu Yang (16 May 2023)
- Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture
  Peiyu Liu, Ze-Feng Gao, Yushuo Chen, Wayne Xin Zhao, Ji-Rong Wen (27 Mar 2023) [MoE]
- Efficient Methods for Natural Language Processing: A Survey
  Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz (31 Aug 2022)
- EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
  Tao Ge, Si-Qing Chen, Furu Wei (16 Feb 2022) [MoE]
- A Survey on Model Compression and Acceleration for Pretrained Language Models
  Canwen Xu, Julian McAuley (15 Feb 2022)
- ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
  J. Tan, Y. Tan, C. Chan, Joon Huang Chuah (11 Feb 2022) [VLM, ViT]
- Text Summarization with Pretrained Encoders
  Yang Liu, Mirella Lapata (22 Aug 2019) [MILM]