Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1909.11556
Cited By
Reducing Transformer Depth on Demand with Structured Dropout
25 September 2019
Angela Fan
Edouard Grave
Armand Joulin
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Reducing Transformer Depth on Demand with Structured Dropout"
50 / 406 papers shown
Title
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
184
1
0
13 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
95
3
0
08 Jun 2024
VTrans: Accelerating Transformer Compression with Variational Information Bottleneck based Pruning
Oshin Dutta
Ritvik Gupta
Sumeet Agarwal
93
2
0
07 Jun 2024
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
Naibin Gu
Peng Fu
Xiyu Liu
Bowen Shen
Zheng Lin
Weiping Wang
69
10
0
06 Jun 2024
Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
Xinyin Ma
Gongfan Fang
Michael Bi Mi
Xinchao Wang
114
44
0
03 Jun 2024
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM
Quandong Wang
Yuxuan Yuan
Xiaoyu Yang
Ruike Zhang
Kang Zhao
Wei Liu
Jian Luan
Daniel Povey
Bin Wang
71
0
0
03 Jun 2024
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs
Wei Zhong
Manasa Bharadwaj
115
7
0
30 May 2024
FlexiDrop: Theoretical Insights and Practical Advances in Random Dropout Method on GNNs
Zhiheng Zhou
Sihao Liu
Weichen Zhao
84
0
0
30 May 2024
STAT: Shrinking Transformers After Training
Megan Flynn
Alexander Wang
Dean Edward Alvarez
Christopher De Sa
Anil Damle
79
2
0
29 May 2024
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models
Yang Zhang
Yawei Li
Xinpeng Wang
Qianli Shen
Barbara Plank
Bernd Bischl
Mina Rezaei
Kenji Kawaguchi
110
12
0
28 May 2024
CEEBERT: Cross-Domain Inference in Early Exit BERT
Divya J. Bajpai
M. Hanawal
LRM
79
5
0
23 May 2024
A Survey on Transformers in NLP with Focus on Efficiency
Wazib Ansar
Saptarsi Goswami
Amlan Chakrabarti
MedIm
93
2
0
15 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
60
0
0
09 May 2024
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang
Korawat Tanwisuth
Chengyue Gong
Pengcheng He
Mi Zhou
BDL
77
0
0
07 May 2024
SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models
Samir Arora
Liangliang Wang
34
0
0
30 Apr 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi
Akshat Shrivastava
Diana Liskovich
Basil Hosmer
Bram Wasti
...
Saurabh Agarwal
Ahmed Roman
Ahmed Aly
Beidi Chen
Carole-Jean Wu
LRM
112
110
0
25 Apr 2024
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
Yijin Liu
Fandong Meng
Jie Zhou
AI4CE
81
9
0
10 Apr 2024
F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation
Junhong Wu
Yuchen Liu
Chengqing Zong
CLL
84
1
0
07 Apr 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
155
106
0
26 Mar 2024
Hierarchical Skip Decoding for Efficient Autoregressive Text Generation
Yunqi Zhu
Xuebing Yang
Yuanyuan Wu
Wensheng Zhang
124
3
0
22 Mar 2024
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi
Yuxiang Chen
Kang Zhao
Kaijun Zheng
Jianfei Chen
Jun Zhu
MQ
99
23
0
19 Mar 2024
FBPT: A Fully Binary Point Transformer
Zhixing Hou
Yuzhang Shang
Yan Yan
MQ
62
1
0
15 Mar 2024
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal
Bilge Acun
Basil Homer
Mostafa Elhoushi
Yejin Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
Carole-Jean Wu
107
11
0
12 Mar 2024
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
Haokun Lin
Haoli Bai
Zhili Liu
Lu Hou
Muyi Sun
Linqi Song
Ying Wei
Zhenan Sun
CLIP
VLM
94
17
0
12 Mar 2024
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Hoang Phan
Andrew Gordon Wilson
Qi Lei
99
7
0
05 Mar 2024
ImgTrojan: Jailbreaking Vision-Language Models with ONE Image
Xijia Tao
Shuai Zhong
Lei Li
Qi Liu
Lingpeng Kong
133
30
0
05 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
104
8
0
02 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
84
3
0
28 Feb 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo
Baeseong Park
Byeongwook Kim
Minjung Jo
S. Kwon
Dongsuk Jeon
Dongsoo Lee
132
3
0
27 Feb 2024
Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
Weilin Zhao
Yuxiang Huang
Xu Han
Wang Xu
Chaojun Xiao
Xinrong Zhang
Yewei Fang
Kaihuo Zhang
Zhiyuan Liu
Maosong Sun
125
12
0
21 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Farber
110
3
0
18 Feb 2024
LaCo: Large Language Model Pruning via Layer Collapse
Yifei Yang
Zouying Cao
Hai Zhao
90
63
0
17 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
116
58
0
15 Feb 2024
Efficient Stagewise Pretraining via Progressive Subnetworks
Abhishek Panigrahi
Nikunj Saunshi
Kaifeng Lyu
Sobhan Miryoosefi
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
65
6
0
08 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
152
35
0
05 Feb 2024
Shortened LLaMA: Depth Pruning for Large Language Models with Comparison of Retraining Methods
Bo-Kyeong Kim
Geonmin Kim
Tae-Ho Kim
Thibault Castells
Shinkook Choi
Junho Shin
Hyoung-Kyu Song
112
40
0
05 Feb 2024
DE
3
^3
3
-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Qi Zhang
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
91
3
0
03 Feb 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
54
0
0
26 Jan 2024
Dynamic Layer Tying for Parameter-Efficient Transformers
Tamir David Hay
Lior Wolf
72
3
0
23 Jan 2024
CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition
Junfeng Hou
Peiyao Wang
Jincheng Zhang
Meng Yang
Minwei Feng
Jingcheng Yin
60
1
0
04 Jan 2024
Adaptive Depth Networks with Skippable Sub-Paths
Woochul Kang
67
1
0
27 Dec 2023
Fairness-Aware Structured Pruning in Transformers
A. Zayed
Gonçalo Mordido
Samira Shabanian
Ioana Baldini
Sarath Chandar
65
19
0
24 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
138
86
0
23 Dec 2023
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Huiping Zhuang
Yihuai Hong
Hongliang Dai
Huiping Zhuang
Cen Chen
91
10
0
19 Dec 2023
Exploring Sparsity in Graph Transformers
Chuang Liu
Yibing Zhan
Xueqi Ma
Liang Ding
Dapeng Tao
Hongzhi Zhang
Wenbin Hu
Bo Du
94
7
0
09 Dec 2023
LayerCollapse: Adaptive compression of neural networks
Soheil Zibakhsh Shabgahi
Mohammad Soheil Shariff
F. Koushanfar
AI4CE
61
1
0
29 Nov 2023
Learning to Skip for Language Modeling
Dewen Zeng
Nan Du
Tao Wang
Yuanzhong Xu
Tao Lei
Zhifeng Chen
Claire Cui
68
12
0
26 Nov 2023
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
Chia-Hsuan Lee
Hao Cheng
Mari Ostendorf
132
6
0
16 Nov 2023
Transfer Learning for Structured Pruning under Limited Task Data
Lucio Dery
David Grangier
Awni Y. Hannun
70
0
0
10 Nov 2023
Near-Linear Scaling Data Parallel Training with Overlapping-Aware Gradient Compression
Lin Meng
Yuzhong Sun
Weimin Li
77
1
0
08 Nov 2023
Previous
1
2
3
4
5
6
7
8
9
Next