Automatic Channel Pruning for Multi-Head Attention
Eunho Lee, Youngbae Hwang
31 May 2024 · arXiv:2405.20867 · ViT
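
This page does not describe the paper's method, so the sketch below is only a rough, generic illustration of what channel pruning inside multi-head attention can look like: each output channel of the Q/K/V projections is scored by weight magnitude, the lowest-scoring channels are dropped, and the output projection is trimmed to match. All names here (channel_mask, prune_qkv, keep_ratio) are hypothetical and do not come from the paper.

```python
# Generic magnitude-based channel pruning for a multi-head attention
# block. Illustrative sketch only -- NOT the algorithm of "Automatic
# Channel Pruning for Multi-Head Attention"; all names are made up.
import torch


def channel_mask(w_q: torch.Tensor, w_k: torch.Tensor, w_v: torch.Tensor,
                 keep_ratio: float = 0.75) -> torch.Tensor:
    """Score each projection output channel by its summed L1 weight norm
    across Q, K and V, and return the indices of the top keep_ratio fraction."""
    scores = w_q.abs().sum(dim=1) + w_k.abs().sum(dim=1) + w_v.abs().sum(dim=1)
    k = max(1, int(keep_ratio * scores.numel()))
    return torch.topk(scores, k).indices.sort().values


def prune_qkv(w_q, w_k, w_v, w_out, keep: torch.Tensor):
    """Drop pruned rows from the Q/K/V projections and the matching columns
    from the output projection so shapes stay consistent. Using one shared
    mask is a simplification: Q and K must share a mask for their dot
    product, while V's mask pairs with the output projection's columns."""
    return w_q[keep], w_k[keep], w_v[keep], w_out[:, keep]


if __name__ == "__main__":
    d = 512  # embedding width of a toy attention block
    w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
    w_out = torch.randn(d, d)
    keep = channel_mask(w_q, w_k, w_v, keep_ratio=0.5)
    w_q, w_k, w_v, w_out = prune_qkv(w_q, w_k, w_v, w_out, keep)
    print(w_q.shape, w_out.shape)  # torch.Size([256, 512]) torch.Size([512, 256])
```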

Papers citing "Automatic Channel Pruning for Multi-Head Attention" (37 / 37 papers shown)

Each entry lists the title, the authors, and a final line with community tag(s) if any, three numeric metrics, and the publication date.
Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation
Quan Tang, Bowen Zhang, Jiajun Liu, Fagui Liu, Yifan Liu
ViT · 85 · 30 · 0 · 02 Aug 2023

FLatten Transformer: Vision Transformer using Focused Linear Attention
Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang
87 · 177 · 0 · 01 Aug 2023

Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers
Siyuan Wei, Tianzhu Ye, Shen Zhang, Yao Tang, Jiajun Liang
ViT · 59 · 71 · 0 · 21 Apr 2023

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker, Muhammad Maaz, H. Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan
ViT · 120 · 95 · 0 · 27 Mar 2023

A Unified Framework for Soft Threshold Pruning
Yanqing Chen, Zhengyu Ma, Wei Fang, Xiawu Zheng, Zhaofei Yu, Yonghong Tian
121 · 21 · 0 · 25 Feb 2023

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin
96 · 34 · 0 · 18 Nov 2022

Token Merging: Your ViT But Faster
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman
MoMe · 121 · 470 · 0 · 17 Oct 2022

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
Muhammad Maaz, Abdelrahman M. Shaker, Hisham Cholakkal, Salman Khan, Syed Waqas Zamir, Rao Muhammad Anwer, Fahad Shahbaz Khan
ViT · 107 · 199 · 0 · 21 Jun 2022

Neighborhood Attention Transformer
Ali Hassani, Steven Walton, Jiacheng Li, Shengjia Li, Humphrey Shi
ViT, AI4TS · 92 · 274 · 0 · 14 Apr 2022

CHEX: CHannel EXploration for CNN Model Compression
Zejiang Hou, Minghai Qin, Fei Sun, Xiaolong Ma, Kun Yuan, Yi Xu, Yen-kuang Chen, Rong Jin, Yuan Xie, S. Kung
60 · 74 · 0 · 29 Mar 2022

Unified Visual Transformer Compression
Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang
ViT · 57 · 94 · 0 · 15 Mar 2022

Auto-scaling Vision Transformers without Training
Wuyang Chen, Wei-Ping Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou
ViT · 66 · 25 · 0 · 24 Feb 2022

cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Huicai Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong
80 · 222 · 0 · 17 Feb 2022

Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations
Youwei Liang, Chongjian Ge, Zhan Tong, Yibing Song, Jue Wang, P. Xie
ViT · 61 · 255 · 0 · 16 Feb 2022

Vision Transformer with Deformable Attention
Zhuofan Xia, Xuran Pan, S. Song, Li Erran Li, Gao Huang
ViT · 93 · 484 · 0 · 03 Jan 2022

Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He, Jianfei Cai, Jing Liu, Zizheng Pan, Jing Zhang, Dacheng Tao, Bohan Zhuang
ViT · 69 · 40 · 0 · 23 Nov 2021

SOFT: Softmax-free Transformer with Linear Complexity
Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schön, Li Zhang
68 · 166 · 0 · 22 Oct 2021

Global Vision Transformer Pruning with Hessian-Aware Saliency
Huanrui Yang, Hongxu Yin, Maying Shen, Pavlo Molchanov, Hai Helen Li, Jan Kautz
ViT · 69 · 45 · 0 · 10 Oct 2021

Differentiable Subset Pruning of Transformer Heads
Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan
104 · 57 · 0 · 10 Aug 2021

GLiT: Neural Architecture Search for Global and Local Image Transformer
Boyu Chen, Peixia Li, Chuming Li, Baopu Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang
ViT · 78 · 86 · 0 · 07 Jul 2021

Chasing Sparsity in Vision Transformers: An End-to-End Exploration
Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
ViT · 68 · 222 · 0 · 08 Jun 2021

Vision Transformer Pruning
Mingjian Zhu, Yehui Tang, Kai Han
ViT · 77 · 92 · 0 · 17 Apr 2021

Going deeper with Image Transformers
Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou
ViT · 160 · 1,021 · 0 · 31 Mar 2021

CvT: Introducing Convolutions to Vision Transformers
Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang
ViT · 154 · 1,917 · 0 · 29 Mar 2021

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, B. Guo
ViT · 467 · 21,603 · 0 · 25 Mar 2021

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli, Hugo Touvron, Matthew L. Leavitt, Ari S. Morcos, Giulio Biroli, Levent Sagun
ViT · 136 · 834 · 0 · 19 Mar 2021

Transformer in Transformer
Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang
ViT · 391 · 1,574 · 0 · 27 Feb 2021

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
ViT · 535 · 3,740 · 0 · 24 Feb 2021

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan
ViT · 146 · 1,942 · 0 · 28 Jan 2021

Training data-efficient image transformers & distillation through attention
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
ViT · 389 · 6,805 · 0 · 23 Dec 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby
ViT · 676 · 41,483 · 0 · 22 Oct 2020

Linformer: Self-Attention with Linear Complexity
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
216 · 1,716 · 0 · 08 Jun 2020

HRank: Filter Pruning using High-Rank Feature Map
Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao
79 · 728 · 0 · 24 Feb 2020

Reformer: The Efficient Transformer
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
VLM · 204 · 2,333 · 0 · 13 Jan 2020

MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning
Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, K. Cheng, Jian Sun
83 · 563 · 0 · 25 Mar 2019

Squeeze-and-Excitation Networks
Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu
427 · 26,568 · 0 · 05 Sep 2017

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
3DV · 795 · 132,454 · 0 · 12 Jun 2017