ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Reducing Transformer Depth on Demand with Structured Dropout

25 September 2019
Angela Fan
Edouard Grave
Armand Joulin

Papers citing "Reducing Transformer Depth on Demand with Structured Dropout"

50 / 400 papers shown
FlexiDrop: Theoretical Insights and Practical Advances in Random Dropout Method on GNNs
Zhiheng Zhou
Sihao Liu
Weichen Zhao
24
0
0
30 May 2024
STAT: Shrinking Transformers After Training
Megan Flynn
Alexander Wang
Dean Edward Alvarez
Christopher De Sa
Anil Damle
36
2
0
29 May 2024
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models
Yang Zhang
Yawei Li
Xinpeng Wang
Qianli Shen
Barbara Plank
Bernd Bischl
Mina Rezaei
Kenji Kawaguchi
60
7
0
28 May 2024
CEEBERT: Cross-Domain Inference in Early Exit BERT
Divya J. Bajpai
M. Hanawal
LRM
45
4
0
23 May 2024
A Survey on Transformers in NLP with Focus on Efficiency
Wazib Ansar
Saptarsi Goswami
Amlan Chakrabarti
MedIm
40
2
0
15 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
35
0
0
09 May 2024
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang
Korawat Tanwisuth
Chengyue Gong
Pengcheng He
Mi Zhou
BDL
38
0
0
07 May 2024
SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models
Samir Arora
Liangliang Wang
22
0
0
30 Apr 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi
Akshat Shrivastava
Diana Liskovich
Basil Hosmer
Bram Wasti
...
Saurabh Agarwal
Ahmed Roman
Ahmed Aly
Beidi Chen
Carole-Jean Wu
LRM
33
85
0
25 Apr 2024
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
Yijin Liu
Fandong Meng
Jie Zhou
AI4CE
27
7
0
10 Apr 2024
F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation
Junhong Wu
Yuchen Liu
Chengqing Zong
CLL
44
1
0
07 Apr 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
52
79
0
26 Mar 2024
Hierarchical Skip Decoding for Efficient Autoregressive Text Generation
Yunqi Zhu
Xuebing Yang
Yuanyuan Wu
Wensheng Zhang
33
3
0
22 Mar 2024
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi
Yuxiang Chen
Kang Zhao
Kaijun Zheng
Jianfei Chen
Jun Zhu
MQ
39
20
0
19 Mar 2024
FBPT: A Fully Binary Point Transformer
Zhixing Hou
Yuzhang Shang
Yan Yan
MQ
25
1
0
15 Mar 2024
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal
Bilge Acun
Basil Hosmer
Mostafa Elhoushi
Yejin Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
Carole-Jean Wu
55
8
0
12 Mar 2024
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
Haokun Lin
Haoli Bai
Zhili Liu
Lu Hou
Muyi Sun
Linqi Song
Ying Wei
Zhenan Sun
CLIP
VLM
63
14
0
12 Mar 2024
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Hoang Phan
Andrew Gordon Wilson
Qi Lei
43
5
0
05 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
49
5
0
02 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
22
3
0
28 Feb 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo
Baeseong Park
Byeongwook Kim
Minjung Jo
S. Kwon
Dongsuk Jeon
Dongsoo Lee
65
2
0
27 Feb 2024
Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
Weilin Zhao
Yuxiang Huang
Xu Han
Wang Xu
Chaojun Xiao
Xinrong Zhang
Yewei Fang
Kaihuo Zhang
Zhiyuan Liu
Maosong Sun
37
11
0
21 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Farber
42
3
0
18 Feb 2024
LaCo: Large Language Model Pruning via Layer Collapse
Yifei Yang
Zouying Cao
Hai Zhao
19
52
0
17 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
41
48
0
15 Feb 2024
Efficient Stagewise Pretraining via Progressive Subnetworks
Abhishek Panigrahi
Nikunj Saunshi
Kaifeng Lyu
Sobhan Miryoosefi
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
38
5
0
08 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
37
28
0
05 Feb 2024
Shortened LLaMA: Depth Pruning for Large Language Models with Comparison of Retraining Methods
Bo-Kyeong Kim
Geonmin Kim
Tae-Ho Kim
Thibault Castells
Shinkook Choi
Junho Shin
Hyoung-Kyu Song
62
30
0
05 Feb 2024
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Qi Zhang
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
38
3
0
03 Feb 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
18
0
0
26 Jan 2024
Dynamic Layer Tying for Parameter-Efficient Transformers
Tamir David Hay
Lior Wolf
27
3
0
23 Jan 2024
CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition
Junfeng Hou
Peiyao Wang
Jincheng Zhang
Meng-Da Yang
Minwei Feng
Jingcheng Yin
29
1
0
04 Jan 2024
Adaptive Depth Networks with Skippable Sub-Paths
Woochul Kang
33
1
0
27 Dec 2023
Fairness-Aware Structured Pruning in Transformers
A. Zayed
Gonçalo Mordido
Samira Shabanian
Ioana Baldini
Sarath Chandar
33
15
0
24 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
65
76
0
23 Dec 2023
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng
Yihuai Hong
Hongliang Dai
Huiping Zhuang
Cen Chen
24
10
0
19 Dec 2023
Exploring Sparsity in Graph Transformers
Chuang Liu
Yibing Zhan
Xueqi Ma
Liang Ding
Dapeng Tao
Jia Wu
Wenbin Hu
Bo Du
34
6
0
09 Dec 2023
LayerCollapse: Adaptive compression of neural networks
Soheil Zibakhsh Shabgahi
Mohammad Soheil Shariff
F. Koushanfar
AI4CE
20
1
0
29 Nov 2023
Learning to Skip for Language Modeling
Dewen Zeng
Nan Du
Tao Wang
Yuanzhong Xu
Tao Lei
Zhifeng Chen
Claire Cui
25
11
0
26 Nov 2023
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
Chia-Hsuan Lee
Hao Cheng
Mari Ostendorf
47
4
0
16 Nov 2023
Transfer Learning for Structured Pruning under Limited Task Data
Lucio Dery
David Grangier
Awni Y. Hannun
22
0
0
10 Nov 2023
Near-Linear Scaling Data Parallel Training with Overlapping-Aware Gradient Compression
Lin Meng
Yuzhong Sun
Weimin Li
34
1
0
08 Nov 2023
FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer
Chi-Chih Chang
Yuan-Yao Sung
Shixing Yu
N. Huang
Diana Marculescu
Kai-Chiang Wu
ViT
28
1
0
07 Nov 2023
Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding
Jiali Zeng
Fandong Meng
Yongjing Yin
Jie Zhou
29
10
0
06 Nov 2023
TLM: Token-Level Masking for Transformers
Yangjun Wu
Kebin Fang
Dongxian Zhang
Han Wang
Hao Zhang
Gang Chen
23
1
0
28 Oct 2023
Switching Temporary Teachers for Semi-Supervised Semantic Segmentation
Jaemin Na
Jung-Woo Ha
HyungJin Chang
Dongyoon Han
Wonjun Hwang
28
29
0
28 Oct 2023
Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
Chaojun Xiao
Yuqi Luo
Wenbin Zhang
Pengle Zhang
Xu Han
...
Zhengyan Zhang
Ruobing Xie
Zhiyuan Liu
Maosong Sun
Jie Zhou
30
0
0
24 Oct 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Kaiyan Zhang
Ning Ding
Biqing Qi
Xuekai Zhu
Xinwei Long
Bowen Zhou
43
4
0
24 Oct 2023
Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks
Zixuan Ke
Bing Liu
Wenhan Xiong
Asli Celikyilmaz
Haoran Li
CLL
32
5
0
13 Oct 2023
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models
Takuma Udagawa
Aashka Trivedi
Michele Merler
Bishwaranjan Bhattacharjee
44
7
0
13 Oct 2023