ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.09854
  4. Cited By
Transformer-Based Visual Segmentation: A Survey
v1v2v3 (latest)

Transformer-Based Visual Segmentation: A Survey

19 April 2023
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
    ViTMedIm
ArXiv (abs)PDFHTMLGithub (741★)

Papers citing "Transformer-Based Visual Segmentation: A Survey"

50 / 216 papers shown
Title
Exploring Sequence Feature Alignment for Domain Adaptive Detection
  Transformers
Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers
Wen Wang
Yang Cao
Jing Zhang
Fengxiang He
Zhengjun Zha
Yonggang Wen
Dacheng Tao
ViT
65
96
0
27 Jul 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
CycleMLP: A MLP-like Architecture for Dense Prediction
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
127
233
0
21 Jul 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLMViT
210
1,548
0
13 Jul 2021
K-Net: Towards Unified Image Segmentation
K-Net: Towards Unified Image Segmentation
Wenwei Zhang
Jiangmiao Pang
Kai-xiang Chen
Chen Change Loy
ISeg
82
369
0
28 Jun 2021
Video Swin Transformer
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
106
1,487
0
24 Jun 2021
XCiT: Cross-Covariance Image Transformers
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
144
511
0
17 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
281
2,826
0
15 Jun 2021
A Survey of Transformers
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
159
1,128
0
08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise
  Convolution
On the Connection between Local Attention and Dynamic Depth-wise Convolution
Qi Han
Zejia Fan
Qi Dai
Lei-huan Sun
Ming-Ming Cheng
Jiaying Liu
Jingdong Wang
ViT
74
109
0
08 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
148
338
0
07 Jun 2021
Associating Objects with Transformers for Video Object Segmentation
Associating Objects with Transformers for Video Object Segmentation
Zongxin Yang
Yunchao Wei
Yi Yang
85
292
0
04 Jun 2021
SOLQ: Segmenting Objects by Learning Queries
SOLQ: Segmenting Objects by Learning Queries
Bin Dong
Fangao Zeng
Tiancai Wang
Xinming Zhang
Yichen Wei
ISeg
76
119
0
04 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with
  Transformers
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
303
5,051
0
31 May 2021
Segmenter: Transformer for Semantic Segmentation
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
212
1,470
0
12 May 2021
Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation
Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation
Hu Cao
Yueyue Wang
Jieneng Chen
Dongsheng Jiang
Xiaopeng Zhang
Qi Tian
Manning Wang
ViTMedIm
136
2,914
0
12 May 2021
MOTR: End-to-End Multiple-Object Tracking with Transformer
MOTR: End-to-End Multiple-Object Tracking with Transformer
Fangao Zeng
Bin Dong
Cheng Chen
Tiancai Wang
Xinming Zhang
Yichen Wei
VOT
71
518
0
07 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
423
2,682
0
04 May 2021
ISTR: End-to-End Instance Segmentation with Transformers
ISTR: End-to-End Instance Segmentation with Transformers
Jie Hu
Liujuan Cao
Yao Lu
Shengchuan Zhang
Yan Wang
Ke Li
Feiyue Huang
Ling Shao
Rongrong Ji
ISeg
57
95
0
03 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
694
6,121
0
29 Apr 2021
Open-vocabulary Object Detection via Vision and Language Knowledge
  Distillation
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Nayeon Lee
Weicheng Kuo
Huayu Chen
VLMObjD
283
920
0
28 Apr 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
82
1,026
0
28 Apr 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjDVLM
179
883
0
26 Apr 2021
Multiscale Vision Transformers
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
132
1,259
0
22 Apr 2021
Look Closer to Segment Better: Boundary Patch Refinement for Instance
  Segmentation
Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation
Chufeng Tang
Hang Chen
Xiao-Li Li
Jianmin Li
Zhaoxiang Zhang
Xiaolin Hu
ISeg
71
80
0
12 Apr 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World
  Segmentation
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
67
127
0
10 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
157
1,868
0
05 Apr 2021
Efficient DETR: Improving End-to-End Object Detector with Dense Prior
Efficient DETR: Improving End-to-End Object Detector with Dense Prior
Z. Yao
Jiangbo Ai
Boxun Li
Chi Zhang
ViT
103
222
0
03 Apr 2021
CvT: Introducing Convolutions to Vision Transformers
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
152
1,915
0
29 Mar 2021
Enhanced Boundary Learning for Glass-like Object Segmentation
Enhanced Boundary Learning for Glass-like Object Segmentation
Hao He
Xiangtai Li
Guangliang Cheng
Jianping Shi
Yunhai Tong
Gaofeng Meng
V. Prinet
Lubin Weng
74
75
0
29 Mar 2021
RobustNet: Improving Domain Generalization in Urban-Scene Segmentation
  via Instance Selective Whitening
RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening
Sungha Choi
Sanghun Jung
Huiwon Yun
J. Kim
Seungryong Kim
Jaegul Choo
113
288
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image
  Classification
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
71
1,482
0
27 Mar 2021
UNETR: Transformers for 3D Medical Image Segmentation
UNETR: Transformers for 3D Medical Image Segmentation
Ali Hatamizadeh
Yucheng Tang
Vishwesh Nath
Dong Yang
Andriy Myronenko
Bennett Landman
H. Roth
Daguang Xu
ViTMedIm
180
1,612
0
18 Mar 2021
Simple multi-dataset detection
Simple multi-dataset detection
Xingyi Zhou
V. Koltun
Philipp Krahenbuhl
ObjD
272
118
0
25 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction
  without Convolutions
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
530
3,724
0
24 Feb 2021
TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation
TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation
Yundong Zhang
Huiye Liu
Qiang Hu
ViTMedIm
260
922
0
16 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLMCLIP
445
3,887
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
389
2,061
0
09 Feb 2021
(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection
  for Sparse Semantic Segmentation Network
(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
Ran Cheng
Ryan Razani
E. Taghavi
Enxu Li
Bingbing Liu
3DPC
197
246
0
08 Feb 2021
TransUNet: Transformers Make Strong Encoders for Medical Image
  Segmentation
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Jieneng Chen
Yongyi Lu
Qihang Yu
Xiangde Luo
Ehsan Adeli
Yan Wang
Le Lu
Alan Yuille
Yuyin Zhou
ViTMedIm
98
3,492
0
08 Feb 2021
Occluded Video Instance Segmentation: A Benchmark
Occluded Video Instance Segmentation: A Benchmark
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOSVLM
64
140
0
02 Feb 2021
TrackFormer: Multi-Object Tracking with Transformers
TrackFormer: Multi-Object Tracking with Transformers
Tim Meinhardt
A. Kirillov
Laura Leal-Taixe
Christoph Feichtenhofer
VOT
269
774
0
07 Jan 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
305
2,525
0
04 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective
  with Transformers
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
194
2,908
0
31 Dec 2020
TransTrack: Multiple Object Tracking with Transformer
TransTrack: Multiple Object Tracking with Transformer
Pei Sun
Jinkun Cao
Yi Jiang
Rufeng Zhang
Enze Xie
Zehuan Yuan
Changhu Wang
Ping Luo
ViTVOT
312
583
0
31 Dec 2020
A Survey on Visual Transformer
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
200
2,232
0
23 Dec 2020
PCT: Point cloud transformer
PCT: Point cloud transformer
Meng-Hao Guo
Junxiong Cai
Zheng-Ning Liu
Tai-Jiang Mu
Ralph Robert Martin
Shimin Hu
ViT3DPC
148
1,624
0
17 Dec 2020
Simple Copy-Paste is a Strong Data Augmentation Method for Instance
  Segmentation
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi
Huayu Chen
A. Srinivas
Rui Qian
Nayeon Lee
E. D. Cubuk
Quoc V. Le
Barret Zoph
ISeg
299
992
0
13 Dec 2020
ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic
  Segmentation
ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation
Siyuan Qiao
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
MDE
89
145
0
09 Dec 2020
CompFeat: Comprehensive Feature Aggregation for Video Instance
  Segmentation
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation
Yang Fu
Linjie Yang
Ding Liu
Thomas S. Huang
Humphrey Shi
VOS
66
72
0
07 Dec 2020
BoxInst: High-Performance Instance Segmentation with Box Annotations
BoxInst: High-Performance Instance Segmentation with Box Annotations
Zhi Tian
Chunhua Shen
Xinlong Wang
Hao Chen
ISeg
88
240
0
03 Dec 2020
Previous
12345
Next