Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.09854
Cited By
v1
v2
v3 (latest)
Transformer-Based Visual Segmentation: A Survey
19 April 2023
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
Re-assign community
ArXiv (abs)
PDF
HTML
Github (741★)
Papers citing
"Transformer-Based Visual Segmentation: A Survey"
50 / 216 papers shown
Title
Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers
Wen Wang
Yang Cao
Jing Zhang
Fengxiang He
Zhengjun Zha
Yonggang Wen
Dacheng Tao
ViT
65
96
0
27 Jul 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
127
233
0
21 Jul 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM
ViT
210
1,548
0
13 Jul 2021
K-Net: Towards Unified Image Segmentation
Wenwei Zhang
Jiangmiao Pang
Kai-xiang Chen
Chen Change Loy
ISeg
82
369
0
28 Jun 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
106
1,487
0
24 Jun 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
144
511
0
17 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
281
2,826
0
15 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
159
1,128
0
08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise Convolution
Qi Han
Zejia Fan
Qi Dai
Lei-huan Sun
Ming-Ming Cheng
Jiaying Liu
Jingdong Wang
ViT
74
109
0
08 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
148
338
0
07 Jun 2021
Associating Objects with Transformers for Video Object Segmentation
Zongxin Yang
Yunchao Wei
Yi Yang
85
292
0
04 Jun 2021
SOLQ: Segmenting Objects by Learning Queries
Bin Dong
Fangao Zeng
Tiancai Wang
Xinming Zhang
Yichen Wei
ISeg
76
119
0
04 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
303
5,051
0
31 May 2021
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
212
1,470
0
12 May 2021
Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation
Hu Cao
Yueyue Wang
Jieneng Chen
Dongsheng Jiang
Xiaopeng Zhang
Qi Tian
Manning Wang
ViT
MedIm
136
2,914
0
12 May 2021
MOTR: End-to-End Multiple-Object Tracking with Transformer
Fangao Zeng
Bin Dong
Cheng Chen
Tiancai Wang
Xinming Zhang
Yichen Wei
VOT
71
518
0
07 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
423
2,682
0
04 May 2021
ISTR: End-to-End Instance Segmentation with Transformers
Jie Hu
Liujuan Cao
Yao Lu
Shengchuan Zhang
Yan Wang
Ke Li
Feiyue Huang
Ling Shao
Rongrong Ji
ISeg
57
95
0
03 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
694
6,121
0
29 Apr 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Nayeon Lee
Weicheng Kuo
Huayu Chen
VLM
ObjD
283
920
0
28 Apr 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
82
1,026
0
28 Apr 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
179
883
0
26 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
132
1,259
0
22 Apr 2021
Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation
Chufeng Tang
Hang Chen
Xiao-Li Li
Jianmin Li
Zhaoxiang Zhang
Xiaolin Hu
ISeg
71
80
0
12 Apr 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
67
127
0
10 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
157
1,868
0
05 Apr 2021
Efficient DETR: Improving End-to-End Object Detector with Dense Prior
Z. Yao
Jiangbo Ai
Boxun Li
Chi Zhang
ViT
103
222
0
03 Apr 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
152
1,915
0
29 Mar 2021
Enhanced Boundary Learning for Glass-like Object Segmentation
Hao He
Xiangtai Li
Guangliang Cheng
Jianping Shi
Yunhai Tong
Gaofeng Meng
V. Prinet
Lubin Weng
74
75
0
29 Mar 2021
RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening
Sungha Choi
Sanghun Jung
Huiwon Yun
J. Kim
Seungryong Kim
Jaegul Choo
113
288
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
71
1,482
0
27 Mar 2021
UNETR: Transformers for 3D Medical Image Segmentation
Ali Hatamizadeh
Yucheng Tang
Vishwesh Nath
Dong Yang
Andriy Myronenko
Bennett Landman
H. Roth
Daguang Xu
ViT
MedIm
180
1,612
0
18 Mar 2021
Simple multi-dataset detection
Xingyi Zhou
V. Koltun
Philipp Krahenbuhl
ObjD
272
118
0
25 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
530
3,724
0
24 Feb 2021
TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation
Yundong Zhang
Huiye Liu
Qiang Hu
ViT
MedIm
260
922
0
16 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
445
3,887
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
389
2,061
0
09 Feb 2021
(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
Ran Cheng
Ryan Razani
E. Taghavi
Enxu Li
Bingbing Liu
3DPC
197
246
0
08 Feb 2021
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Jieneng Chen
Yongyi Lu
Qihang Yu
Xiangde Luo
Ehsan Adeli
Yan Wang
Le Lu
Alan Yuille
Yuyin Zhou
ViT
MedIm
98
3,492
0
08 Feb 2021
Occluded Video Instance Segmentation: A Benchmark
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOS
VLM
64
140
0
02 Feb 2021
TrackFormer: Multi-Object Tracking with Transformers
Tim Meinhardt
A. Kirillov
Laura Leal-Taixe
Christoph Feichtenhofer
VOT
269
774
0
07 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
305
2,525
0
04 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
194
2,908
0
31 Dec 2020
TransTrack: Multiple Object Tracking with Transformer
Pei Sun
Jinkun Cao
Yi Jiang
Rufeng Zhang
Enze Xie
Zehuan Yuan
Changhu Wang
Ping Luo
ViT
VOT
312
583
0
31 Dec 2020
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
200
2,232
0
23 Dec 2020
PCT: Point cloud transformer
Meng-Hao Guo
Junxiong Cai
Zheng-Ning Liu
Tai-Jiang Mu
Ralph Robert Martin
Shimin Hu
ViT
3DPC
148
1,624
0
17 Dec 2020
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi
Huayu Chen
A. Srinivas
Rui Qian
Nayeon Lee
E. D. Cubuk
Quoc V. Le
Barret Zoph
ISeg
299
992
0
13 Dec 2020
ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation
Siyuan Qiao
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
MDE
89
145
0
09 Dec 2020
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation
Yang Fu
Linjie Yang
Ding Liu
Thomas S. Huang
Humphrey Shi
VOS
66
72
0
07 Dec 2020
BoxInst: High-Performance Instance Segmentation with Box Annotations
Zhi Tian
Chunhua Shen
Xinlong Wang
Hao Chen
ISeg
88
240
0
03 Dec 2020
Previous
1
2
3
4
5
Next