2103.15808
CvT: Introducing Convolutions to Vision Transformers
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Papers citing
"CvT: Introducing Convolutions to Vision Transformers"
50 / 819 papers shown
Title
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
30
65
0
29 Mar 2022
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
26
40
0
29 Mar 2022
MAT: Mask-Aware Transformer for Large Hole Image Inpainting
Wenbo Li
Zhe-nan Lin
Kun Zhou
Lu Qi
Yi Wang
Jiaya Jia
38
309
0
29 Mar 2022
Affine Medical Image Registration with Coarse-to-Fine Vision Transformer
Tony C. W. Mok
Albert C. S. Chung
ViT
MedIm
34
61
0
29 Mar 2022
Brain-inspired Multilayer Perceptron with Spiking Neurons
Wenshuo Li
Hanting Chen
Jianyuan Guo
Ziyang Zhang
Yunhe Wang
30
35
0
28 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
F. Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT
MedIm
27
28
0
24 Mar 2022
Contrastive Transformer-based Multiple Instance Learning for Weakly Supervised Polyp Frame Detection
Yu Tian
Guansong Pang
Fengbei Liu
Yuyuan Liu
Chong Wang
Yuanhong Chen
Johan W. Verjans
G. Carneiro
ViT
MedIm
37
25
0
23 Mar 2022
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger
Thomas Paniagua
Xi Song
Naresh P. Cuntoor
Mun Wai Lee
Tianfu Wu
ViT
15
7
0
22 Mar 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
33
263
0
22 Mar 2022
MixFormer: End-to-End Tracking with Iterative Mixed Attention
Yutao Cui
Jiang Cheng
Limin Wang
Gangshan Wu
VOT
34
454
0
21 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
19
53
0
21 Mar 2022
HIPA: Hierarchical Patch Transformer for Single Image Super Resolution
Qing Cai
Yiming Qian
Jinxing Li
Junjie Lv
Yee-Hong Yang
Feng Wu
Dafan Zhang
25
28
0
19 Mar 2022
SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition
Mingxin Huang
Yuliang Liu
Zhenghao Peng
Chongyu Liu
Dahua Lin
Shenggao Zhu
N. Yuan
Kai Ding
Lianwen Jin
ViT
21
98
0
19 Mar 2022
CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance
Tianchen Zhao
Niansong Zhang
Xuefei Ning
He Wang
Li Yi
Yu Wang
3DPC
ViT
22
8
0
18 Mar 2022
Three things everyone should know about Vision Transformers
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Hervé Jégou
ViT
24
119
0
18 Mar 2022
SepTr: Separable Transformer for Audio Spectrogram Processing
Nicolae-Cătălin Ristea
Radu Tudor Ionescu
Fahad Shahbaz Khan
ViT
23
30
0
17 Mar 2022
PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation
Zhijie Shen
Chunyu Lin
K. Liao
Lang Nie
Zishuo Zheng
Yao Zhao
ViT
MDE
27
85
0
17 Mar 2022
Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning
Yang He
Weihan Liang
Dongyang Zhao
Hong-Yu Zhou
Weifeng Ge
Yizhou Yu
Wenqiang Zhang
ViT
35
45
0
17 Mar 2022
Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?
Y. Fu
Shunyao Zhang
Shan-Hung Wu
Cheng Wan
Yingyan Lin
AAML
23
64
0
16 Mar 2022
InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding
Hanrong Ye
Dan Xu
ViT
21
84
0
15 Mar 2022
Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution
Jinsu Yoo
Taehoon Kim
Sihaeng Lee
Seunghyeon Kim
Hankook Lee
Tae Hyun Kim
SupR
ViT
31
51
0
15 Mar 2022
TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation
Ruiwen Li
Zheda Mai
C. Trabelsi
Zhibo Zhang
Jongseong Jang
Scott Sanner
ViT
31
61
0
14 Mar 2022
Deep Transformers Thirst for Comprehensive-Frequency Data
R. Xia
Chao Xue
Boyu Deng
Fang Wang
Jingchao Wang
ViT
25
0
0
14 Mar 2022
Self-Promoted Supervision for Few-Shot Transformer
Bowen Dong
Pan Zhou
Shuicheng Yan
W. Zuo
ViT
22
28
0
14 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
Xinming Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian Sun
VLM
49
528
0
13 Mar 2022
The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy
Tianlong Chen
Zhenyu (Allen) Zhang
Yu Cheng
Ahmed Hassan Awadallah
Zhangyang Wang
ViT
41
37
0
12 Mar 2022
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
Peihao Wang
Wenqing Zheng
Tianlong Chen
Zhangyang Wang
ViT
33
127
0
09 Mar 2022
ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer
Haokui Zhang
Wenze Hu
Xiaoyu Wang
ViT
41
59
0
08 Mar 2022
Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention
Kai Liu
Tianyi Wu
Cong Liu
Guodong Guo
ViT
41
17
0
08 Mar 2022
WaveMix: Resource-efficient Token Mixing for Images
Pranav Jeevan
A. Sethi
17
10
0
07 Mar 2022
Stepwise Feature Fusion: Local Guides Global
Jinfeng Wang
Qiming Huang
Feilong Tang
Jia Meng
Jionglong Su
Sifan Song
ViT
MedIm
27
181
0
07 Mar 2022
Knowledge Amalgamation for Object Detection with Transformers
Haofei Zhang
Feng Mao
Mengqi Xue
Gongfan Fang
Zunlei Feng
Jie Song
Mingli Song
ViT
111
12
0
07 Mar 2022
Multi-Tailed Vision Transformer for Efficient Inference
Yunke Wang
Bo Du
Wenyuan Wang
Chang Xu
ViT
220
6
0
03 Mar 2022
ViTransPAD: Video Transformer using convolution and self-attention for Face Presentation Attack Detection
Zuheng Ming
Zitong Yu
M. Al-Ghadi
M. Visani
M. Luqman
J. Burie
ViT
CVBM
11
19
0
03 Mar 2022
Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions
Ruikang Ju
Ting-Yu Lin
Jen-Shiun Chiang
Jia-Hao Jian
Yu-Shian Lin
Liu-Rui-Yi Huang
ViT
16
1
0
02 Mar 2022
3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification
Dening Lu
Qian Xie
Linlin Xu
Jonathan Li
3DV
19
68
0
02 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark
Yunhe Gao
Mu Zhou
Ding Liu
Zhennan Yan
Shaoting Zhang
Dimitris N. Metaxas
ViT
MedIm
26
68
0
28 Feb 2022
CTformer: Convolution-free Token2Token Dilated Vision Transformer for Low-dose CT Denoising
Dayang Wang
Fenglei Fan
Zhan Wu
R. Liu
Fei Wang
Hengyong Yu
ViT
MedIm
35
122
0
28 Feb 2022
Factorizer: A Scalable Interpretable Approach to Context Modeling for Medical Image Segmentation
Pooya Ashtari
Diana Sima
L. De Lathauwer
D. Sappey-Marinier
F. Maes
Sabine Van Huffel
ViT
MedIm
28
35
0
24 Feb 2022
Auto-scaling Vision Transformers without Training
Wuyang Chen
Wei Huang
Xianzhi Du
Xiaodan Song
Zhangyang Wang
Denny Zhou
ViT
32
23
0
24 Feb 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xiaolong Wang
ViT
VLM
192
501
0
22 Feb 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
33
229
0
21 Feb 2022
Visual Attention Network
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
24
637
0
20 Feb 2022
Discriminability-enforcing loss to improve representation learning
Florinel-Alin Croitoru
Diana-Nicoleta Grigore
Radu Tudor Ionescu
FaML
27
1
0
14 Feb 2022
CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho
Sunghwan Hong
Seung Wook Kim
ViT
27
34
0
14 Feb 2022
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs
Huangjie Zheng
Pengcheng He
Weizhu Chen
Mingyuan Zhou
22
14
0
14 Feb 2022
BViT: Broad Attention based Vision Transformer
Nannan Li
Yaran Chen
Weifan Li
Zixiang Ding
Dong Zhao
ViT
38
23
0
13 Feb 2022
Feature-level augmentation to improve robustness of deep neural networks to affine transformations
A. Sandru
Mariana-Iuliana Georgescu
Radu Tudor Ionescu
OOD
16
3
0
10 Feb 2022
LwPosr: Lightweight Efficient Fine-Grained Head Pose Estimation
Naina Dhingra
29
16
0
07 Feb 2022
Towards an Analytical Definition of Sufficient Data
Adam Byerly
T. Kalganova
27
4
0
07 Feb 2022