2012.00759
Cited By
v1
v2
v3 (latest)
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
1 December 2020
Huiyu Wang
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
Github (1023★)
Papers citing
"MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers"
50 / 323 papers shown
Title
Pyramid Fusion Transformer for Semantic Segmentation
Zipeng Qin
Jianbo Liu
Xiaoling Zhang
Maoqing Tian
Aojun Zhou
Shuai Yi
Hongsheng Li
ViT
86
16
0
11 Jan 2022
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi
Xiuye Gu
Yin Cui
Tsung-Yi Lin
VLM
191
387
0
22 Dec 2021
MPViT: Multi-Path Vision Transformer for Dense Prediction
Youngwan Lee
Jonghee Kim
Jeffrey Willette
Sung Ju Hwang
ViT
118
254
0
21 Dec 2021
Lite Vision Transformer with Enhanced Self-Attention
Chenglin Yang
Yilin Wang
Jianming Zhang
He Zhang
Zijun Wei
Zhe Lin
Alan Yuille
ViT
84
119
0
20 Dec 2021
Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation
Yi Zhou
Hui Zhang
Hana Lee
Shuyang Sun
Pingjun Li
Yangguang Zhu
ByungIn Yoo
Xiaojuan Qi
Jae-Joon Han
VOS
72
28
0
16 Dec 2021
QAHOI: Query-Based Anchors for Human-Object Interaction Detection
Junwen Chen
Keiji Yanai
60
41
0
16 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
84
21
0
09 Dec 2021
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation
Haobo Yuan
Xiangtai Li
Yibo Yang
Guangliang Cheng
Jing Zhang
Yunhai Tong
Lefei Zhang
Dacheng Tao
MDE
148
44
0
05 Dec 2021
Hybrid Instance-aware Temporal Fusion for Online Video Instance Segmentation
Xiang Li
Jinglu Wang
Xiao Li
Yan Lu
84
19
0
03 Dec 2021
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
427
2,407
0
02 Dec 2021
End-to-End Referring Video Object Segmentation with Multimodal Transformers
Adam Botach
Evgenii Zheltonozhskii
Chaim Baskin
VOS
115
150
0
29 Nov 2021
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
Xumin Yu
Lulu Tang
Yongming Rao
Tiejun Huang
Jie Zhou
Jiwen Lu
3DPC
194
692
0
29 Nov 2021
Efficient Self-Ensemble for Semantic Segmentation
Walid Bousselham
Guillaume Thibault
Lucas Pagano
Archana Machireddy
Joe W. Gray
Y. Chang
Xubo B. Song
ViT
80
27
0
26 Nov 2021
PTQ4ViT: Post-training quantization for vision transformers with twin uniform quantization
Zhihang Yuan
Chenhao Xue
Yiqi Chen
Qiang Wu
Guangyu Sun
ViT
MQ
93
142
0
24 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
95
40
0
23 Nov 2021
DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion
Renrui Zhang
Ziyao Zeng
Ziyu Guo
Xing Gao
Kexue Fu
Jianbo Shi
3DPC
144
26
0
19 Nov 2021
TransMix: Attend to Mix for Vision Transformers
Jieneng Chen
Shuyang Sun
Ju He
Philip Torr
Alan Yuille
S. Bai
ViT
128
110
0
18 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
207
356
0
11 Nov 2021
Sampling Equivariant Self-attention Networks for Object Detection in Aerial Images
Guo-Ye Yang
Xiang-Li Li
Ralph Robert Martin
Shimin Hu
3DPC
58
14
0
05 Nov 2021
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction
Hao Feng
Yuechen Wang
Wen-gang Zhou
Jiajun Deng
Houqiang Li
ViT
113
60
0
25 Oct 2021
Video Instance Segmentation by Instance Flow Assembly
Xiang Li
Jinglu Wang
Xiao Li
Yan Lu
VOS
106
15
0
20 Oct 2021
ASFormer: Transformer for Action Segmentation
Fangqiu Yi
Hongyu Wen
Tingting Jiang
ViT
139
177
0
16 Oct 2021
The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation
Guillem Brasó
Nikita Kister
Laura Leal-Taixé
3DPC
88
40
0
11 Oct 2021
ProTo: Program-Guided Transformer for Program-Guided Tasks
Zelin Zhao
Karan Samel
Binghong Chen
Le Song
ViT
LM&Ro
98
30
0
02 Oct 2021
Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers
Zhiqi Li
Wenhai Wang
Enze Xie
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
Tong Lu
ViT
188
140
0
08 Sep 2021
Voxel Transformer for 3D Object Detection
Jiageng Mao
Yujing Xue
Minzhe Niu
Haoyue Bai
Jiashi Feng
Xiaodan Liang
Hang Xu
Chunjing Xu
3DPC
ViT
112
416
0
06 Sep 2021
Searching for Efficient Multi-Stage Vision Transformers
Yi-Lun Liao
S. Karaman
Vivienne Sze
ViT
78
19
0
01 Sep 2021
Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance
Jiaming Zhang
Kailun Yang
Angela Constantinescu
Kunyu Peng
Karin Muller
Rainer Stiefelhagen
ViT
88
69
0
20 Aug 2021
Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision
Yanwei Li
Hengshuang Zhao
Xiaojuan Qi
Yukang Chen
Lu Qi
Liwei Wang
Zeming Li
Jian Sun
Jiaya Jia
106
51
0
17 Aug 2021
PSViT: Better Vision Transformer via Token Pooling and Attention Sharing
Boyu Chen
Peixia Li
Baopu Li
Chuming Li
Lei Bai
Chen Lin
Ming Sun
Junjie Yan
Wanli Ouyang
ViT
129
35
0
07 Aug 2021
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle
Sebastian Borgeaud
Jean-Baptiste Alayrac
Carl Doersch
Catalin Ionescu
...
Olivier J. Hénaff
M. Botvinick
Andrew Zisserman
Oriol Vinyals
João Carreira
MLLM
VLM
GNN
171
585
0
30 Jul 2021
PlaneTR: Structure-Guided Transformers for 3D Plane Recovery
Bin Tan
Nan Xue
S. Bai
Tianfu Wu
Guisong Xia
ViT
117
40
0
27 Jul 2021
Image Fusion Transformer
VS Vibashan
Jeya Maria Jose Valanarasu
Poojan Oza
Vishal M. Patel
ViT
87
123
0
19 Jul 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM
ViT
226
1,559
0
13 Jul 2021
Locally Enhanced Self-Attention: Combining Self-Attention and Convolution as Local and Context Terms
Chenglin Yang
Siyuan Qiao
Adam Kortylewski
Alan Yuille
144
4
0
12 Jul 2021
Local-to-Global Self-Attention in Vision Transformers
Jinpeng Li
Yichao Yan
Tianran Ouyang
Xiaokang Yang
Ling Shao
ViT
64
29
0
10 Jul 2021
Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World
Jiaming Zhang
Kailun Yang
Angela Constantinescu
Kunyu Peng
Karin Muller
Rainer Stiefelhagen
ViT
96
62
0
07 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
106
437
0
01 Jul 2021
Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images
L. Ding
Dong Lin
Shaofu Lin
Jing Zhang
Xiaojie Cui
Yuebin Wang
Hao Tang
Lorenzo Bruzzone
ViT
154
101
0
29 Jun 2021
K-Net: Towards Unified Image Segmentation
Wenwei Zhang
Jiangmiao Pang
Kai-xiang Chen
Chen Change Loy
ISeg
130
372
0
28 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Ming-Ming Cheng
ViT
188
234
0
22 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
112
214
0
17 Jun 2021
DeepLab2: A TensorFlow Library for Deep Labeling
Mark Weber
Huiyu Wang
Siyuan Qiao
Jun Xie
Maxwell D. Collins
...
Laura Leal-Taixé
Alan Yuille
Florian Schroff
Hartwig Adam
Liang-Chieh Chen
VLM
105
49
0
17 Jun 2021
Improved Transformer for High-Resolution GANs
Long Zhao
Zizhao Zhang
Ting Chen
Dimitris N. Metaxas
Han Zhang
ViT
133
96
0
14 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
65
158
0
10 Jun 2021
Salient Object Ranking with Position-Preserved Attention
Haoyang Fang
Daoxin Zhang
Yi Zhang
Minghao Chen
Jiawei Li
Yao Hu
Deng Cai
Xiaofei He
71
21
0
09 Jun 2021
Chasing Sparsity in Vision Transformers: An End-to-End Exploration
Tianlong Chen
Yu Cheng
Zhe Gan
Lu Yuan
Lei Zhang
Zhangyang Wang
ViT
70
224
0
08 Jun 2021
SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition
Rishabh Kabra
Daniel Zoran
Goker Erdogan
Loic Matthey
Antonia Creswell
M. Botvinick
Alexander Lerchner
Christopher P. Burgess
OCL
125
79
0
07 Jun 2021
Video Instance Segmentation using Inter-Frame Communication Transformers
Sukjun Hwang
Miran Heo
Seoung Wug Oh
Seon Joo Kim
ViT
134
139
0
07 Jun 2021
Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach
Ahmed Abbas
Paul Swoboda
66
14
0
06 Jun 2021