Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.17261
Cited By
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
24 July 2024
Hyunwoo Yu
Yubin Cho
Beoungwoo Kang
Seunghun Moon
Kyeongbo Kong
Suk-Ju Kang
Re-assign community
ArXiv (abs)
PDF
HTML
Github (30★)
Papers citing
"Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation"
48 / 48 papers shown
Title
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
294
1
0
16 Dec 2024
Cross-aware Early Fusion with Stage-divided Vision and Language Transformer Encoders for Referring Image Segmentation
Yubin Cho
Hyunwoo Yu
Suk-Ju Kang
112
21
0
14 Aug 2024
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
106
74
0
17 Jul 2023
FasterViT: Fast Vision Transformers with Hierarchical Attention
Ali Hatamizadeh
Greg Heinrich
Hongxu Yin
Andrew Tao
J. Álvarez
Jan Kautz
Pavlo Molchanov
ViT
106
72
0
09 Jun 2023
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
Chenyang Lu
Daan de Geus
Gijs Dubbelman
ViT
110
20
0
03 Jun 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
195
48
0
30 Mar 2023
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
115
470
0
17 Oct 2022
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
Meng-Hao Guo
Chenggang Lu
Qibin Hou
Zheng Liu
Ming-Ming Cheng
Shiyong Hu
SSeg
ViT
VLM
76
652
0
18 Sep 2022
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
69
127
0
20 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
89
369
0
02 Jun 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Wenqiang Zhang
Zilong Huang
Guozhong Luo
Tao Chen
Xinggang Wang
Wenyu Liu
Gang Yu
Chunhua Shen
ViT
106
208
0
12 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
127
250
0
07 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
56
109
0
06 Apr 2022
Rethinking Semantic Segmentation: A Prototype View
Tianfei Zhou
Wenguan Wang
E. Konukoglu
Luc Van Gool
SSeg
109
274
0
28 Mar 2022
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Qing Du Xiaodan Liang Xiaojun Chang
ViT
70
35
0
24 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
123
56
0
21 Mar 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
178
5,213
0
10 Jan 2022
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
90
482
0
03 Jan 2022
MPViT: Multi-Path Vision Transformer for Dense Prediction
Youngwan Lee
Jonghee Kim
Jeffrey Willette
Sung Ju Hwang
ViT
87
251
0
21 Dec 2021
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
255
2,379
0
02 Dec 2021
Shunted Self-Attention via Multi-Scale Token Aggregation
Sucheng Ren
Daquan Zhou
Shengfeng He
Jiashi Feng
Xinchao Wang
ViT
82
228
0
30 Nov 2021
Efficient Self-Ensemble for Semantic Segmentation
Walid Bousselham
Guillaume Thibault
Lucas Pagano
Archana Machireddy
Joe W. Gray
Y. Chang
Xubo B. Song
ViT
63
27
0
26 Nov 2021
NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition
Hao Liu
Xinghua Jiang
Xin Li
Zhimin Bao
Deqiang Jiang
Bo Ren
ViT
54
16
0
25 Nov 2021
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
173
914
0
22 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
219
1,825
0
18 Nov 2021
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
288
1,281
0
05 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
263
498
0
01 Oct 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
258
491
0
12 Aug 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
152
985
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
80
435
0
01 Jul 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
109
1,676
0
25 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
314
5,065
0
31 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
77
667
0
07 May 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
84
1,026
0
28 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
135
1,265
0
22 Apr 2021
EfficientNetV2: Smaller Models and Faster Training
Mingxing Tan
Quoc V. Le
EgoV
125
2,720
0
01 Apr 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
154
1,917
0
29 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
463
21,564
0
25 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
138
1,746
0
24 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
533
3,734
0
24 Feb 2021
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
Wenguan Wang
Tianfei Zhou
Feng Yu
Jifeng Dai
E. Konukoglu
Luc Van Gool
196
481
0
28 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
194
2,911
0
31 Dec 2020
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
389
6,802
0
23 Dec 2020
COCO-Stuff: Thing and Stuff Classes in Context
Holger Caesar
J. Uijlings
V. Ferrari
139
1,394
0
12 Dec 2016
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
267
18,267
0
02 Jun 2016
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer
Jonathan Long
Trevor Darrell
VOS
SSeg
747
37,890
0
20 May 2016
The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts
Mohamed Omran
Sebastian Ramos
Timo Rehfeld
Markus Enzweiler
Rodrigo Benenson
Uwe Franke
Stefan Roth
Bernt Schiele
1.1K
11,644
0
06 Apr 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.9K
77,378
0
18 May 2015
1