Transformer-Based Visual Segmentation: A Survey

19 April 2023
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT, MedIm
ArXiv (abs) | PDF | HTML | GitHub (741★)

Papers citing "Transformer-Based Visual Segmentation: A Survey"

50 / 216 papers shown
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
Xuelong Li
Guangliang Cheng
Mamba
255
0
0
01 May 2025
Quantum Complex-Valued Self-Attention Model
Fu Chen
Qinglin Zhao
Li Feng
Longfei Tang
Yangbin Lin
Haitao Huang
MQ
122
0
0
24 Mar 2025
Simpler Fast Vision Transformers with a Jumbo CLS Token
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
162
0
0
20 Feb 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
Hao Zhang
Tat-Seng Chua
Shuicheng Yan
164
42
0
31 Dec 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
119
37
0
07 Jun 2024
Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation
Weize Li
Zhicheng Zhao
Haochen Bai
Fei Su
98
0
0
24 May 2024
MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection
Haoyang He
Yuhu Bai
Jiangning Zhang
Qingdong He
Hongxu Chen
Zhenye Gan
Chengjie Wang
Xiangtai Li
Guanzhong Tian
Lei Xie
Mamba
121
44
0
09 Apr 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
132
7
0
14 Mar 2024
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
Jiahao Xie
Wei Li
Xiangtai Li
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
DiffM, VLM
121
37
0
22 Sep 2023
GRES: Generalized Referring Expression Segmentation
Chang Liu
Henghui Ding
Xudong Jiang
89
158
0
01 Jun 2023
CLUSTSEG: Clustering for Universal Segmentation
James Liang
Tianfei Zhou
Dongfang Liu
Wenguan Wang
VLM
102
49
0
03 May 2023
Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation
Lukas Hoyer
Dengxin Dai
Luc Van Gool
AI4CE, OOD
96
25
0
26 Apr 2023
Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation
Inkyu Shin
Dahun Kim
Qihang Yu
Jun Xie
Hong-Seok Kim
Bradley Green
In So Kweon
Kuk-Jin Yoon
Liang-Chieh Chen
VLM
114
18
0
10 Apr 2023
FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
Jie Qin
Jie Wu
Pengxiang Yan
Ming Li
Ren Yuxi
...
Yitong Wang
Rui Wang
Shilei Wen
X. Pan
Xingang Wang
SSeg, VLM
75
94
0
30 Mar 2023
LMSeg: Language-guided Multi-dataset Segmentation
Qiang-feng Zhou
Yuang Liu
Chaohui Yu
Jingliang Li
Zhibin Wang
Fan Wang
VLM
76
19
0
27 Feb 2023
Cut and Learn for Unsupervised Object Detection and Instance Segmentation
Xudong Wang
Rohit Girdhar
Stella X. Yu
Ishan Misra
VLM
105
168
0
26 Jan 2023
Vision Transformers Are Good Mask Auto-Labelers
Shiyi Lan
Xitong Yang
Zhiding Yu
Zuxuan Wu
J. Álvarez
Anima Anandkumar
ISeg, ViT, MedIm
58
19
0
10 Jan 2023
TarViS: A Unified Approach for Target-based Video Segmentation
A. Athar
Alexander Hermans
Jonathon Luiten
Deva Ramanan
Bastian Leibe
VOS
77
29
0
06 Jan 2023
PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation
Xiangtai Li
Shilin Xu
Yibo Yang
Haobo Yuan
Guangliang Cheng
Yu Tong
Zhouchen Lin
Ming-Hsuan Yang
Dacheng Tao
ViT
101
21
0
03 Jan 2023
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou
Zi-Yi Dou
Jianwei Yang
Zhe Gan
Linjie Li
...
Lu Yuan
Nanyun Peng
Lijuan Wang
Yong Jae Lee
Jianfeng Gao
VLM, MLLM, ObjD
95
259
0
21 Dec 2022
Style-Hallucinated Dual Consistency Learning: A Unified Framework for Visual Domain Generalization
Yuyang Zhao
Zhun Zhong
Na Zhao
N. Sebe
G. Lee
87
31
0
18 Dec 2022
Look Before You Match: Instance Understanding Matters in Video Object Segmentation
Junke Wang
Dongdong Chen
Zuxuan Wu
Chong Luo
Chuanxin Tang
Xiyang Dai
Yucheng Zhao
Yujia Xie
Lu Yuan
Yu-Gang Jiang
VOS
91
41
0
13 Dec 2022
Mask Matching Transformer for Few-Shot Segmentation
Siyu Jiao
Gengwei Zhang
Shant Navasardyan
Ling-Hao Chen
Yao-Min Zhao
Yunchao Wei
Humphrey Shi
66
29
0
05 Dec 2022
MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
Lukas Hoyer
Dengxin Dai
Haoran Wang
Luc Van Gool
120
227
0
02 Dec 2022
Superpoint Transformer for 3D Scene Instance Segmentation
Jiahao Sun
Chunmei Qing
Junpeng Tan
Xiangmin Xu
3DPC
93
109
0
28 Nov 2022
Prototype as Query for Few Shot Semantic Segmentation
Leilei Cao
Yibo Guo
Ye Yuan
Qiangguo Jin
ViT
84
12
0
27 Nov 2022
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Hao Li
Jinguo Zhu
Xiaohu Jiang
Xizhou Zhu
Hongsheng Li
...
Xiaohua Wang
Yu Qiao
Xiaogang Wang
Wenhai Wang
Jifeng Dai
MLLM
70
57
0
17 Nov 2022
Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation
Simone Rossetti
Damiano Zappia
Marta Sanzari
M. Schaerf
F. Pirri
ViT
93
59
0
31 Oct 2022
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers
Qin Liu
Zhenlin Xu
Gedas Bertasius
Marc Niethammer
66
115
0
20 Oct 2022
Exploring Long-Sequence Masked Autoencoders
Ronghang Hu
Shoubhik Debnath
Saining Xie
Xinlei Chen
45
18
0
13 Oct 2022
A Generalist Framework for Panoptic Segmentation of Images and Videos
Ting-Li Chen
Lala Li
Saurabh Saxena
Geoffrey E. Hinton
David J. Fleet
VGen, MLLM
62
103
0
12 Oct 2022
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling
Xiaoyang Wu
Yixing Lao
Li Jiang
Xihui Liu
Hengshuang Zhao
3DPC, ViT
99
397
0
11 Oct 2022
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
CLIP, VLM
102
457
0
09 Oct 2022
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Weicheng Kuo
Huayu Chen
Xiuye Gu
A. Piergiovanni
A. Angelova
MLLM, VLM, ObjD
131
137
0
30 Sep 2022
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
Meng-Hao Guo
Chenggang Lu
Qibin Hou
Zheng Liu
Ming-Ming Cheng
Shiyong Hu
SSeg, ViT, VLM
76
651
0
18 Sep 2022
Test-Time Training with Masked Autoencoders
Yossi Gandelsman
Yu Sun
Xinlei Chen
Alexei A. Efros
OOD
101
177
0
15 Sep 2022
MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training
De-An Huang
Zhiding Yu
Anima Anandkumar
VLM
96
81
0
03 Aug 2022
Per-Clip Video Object Segmentation
Kwanyong Park
Sanghyun Woo
Seoung Wug Oh
In So Kweon
Joon-Young Lee
VLM, VOS
78
51
0
03 Aug 2022
Video Mask Transfiner for High-Quality Video Instance Segmentation
Lei Ke
Henghui Ding
Martin Danelljan
Yu-Wing Tai
Chi-Keung Tang
Feng Yu
70
30
0
28 Jul 2022
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Qiang Chen
Xiaokang Chen
Jian Wang
Shan Zhang
Kun Yao
Haocheng Feng
Junyu Han
Errui Ding
Gang Zeng
Jingdong Wang
ViT
109
130
0
26 Jul 2022
In Defense of Online Models for Video Instance Segmentation
Junfeng Wu
Qihao Liu
Yi Jiang
S. Bai
Alan Yuille
Xiang Bai
76
111
0
21 Jul 2022
Conditional DETR V2: Efficient Detection Transformer with Box Queries
Xiaokang Chen
Fangyun Wei
Gang Zeng
Jingdong Wang
ViT
70
33
0
18 Jul 2022
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
Muhammad Maaz
Abdelrahman M. Shaker
Hisham Cholakkal
Salman Khan
Syed Waqas Zamir
Rao Muhammad Anwer
Fahad Shahbaz Khan
ViT
98
199
0
21 Jun 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Qihang Yu
Huiyu Wang
Dahun Kim
Siyuan Qiao
Maxwell D. Collins
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT, MedIm
97
92
0
17 Jun 2022
ReCo: Retrieve and Co-segment for Zero-shot Transfer
Gyungin Shin
Weidi Xie
Samuel Albanie
VLM
101
92
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
168
567
0
13 Jun 2022
Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation
Wouter Van Gansbeke
Simon Vandenhende
Luc Van Gool
90
55
0
13 Jun 2022
VITA: Video Instance Segmentation via Object Token Association
Miran Heo
Sukjun Hwang
Seoung Wug Oh
Joon-Young Lee
Seon Joo Kim
VOS
68
92
0
09 Jun 2022
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
Lingchen Meng
Xiyang Dai
Yinpeng Chen
Pengchuan Zhang
Dongdong Chen
Mengchen Liu
Jianfeng Wang
Zuxuan Wu
Lu Yuan
Yu-Gang Jiang
ObjD
91
24
0
07 Jun 2022
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
Feng Li
Hao Zhang
Hu-Sheng Xu
Siyi Liu
Lei Zhang
L. Ni
H. Shum
ISeg
124
384
0
06 Jun 2022