Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.15808
Cited By
CvT: Introducing Convolutions to Vision Transformers
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CvT: Introducing Convolutions to Vision Transformers"
50 / 819 papers shown
Title
DnSwin: Toward Real-World Denoising via Continuous Wavelet Sliding-Transformer
Hao Li
Zhijing Yang
Xiaobin Hong
Ziying Zhao
Junyang Chen
Yukai Shi
Jin-shan Pan
DiffM
ViT
43
11
0
28 Jul 2022
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
Cong Wang
Hongmin Xu
Xiong Zhang
Li Wang
Zhitong Zheng
Haifeng Liu
ViT
20
20
0
27 Jul 2022
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
Haoxuan You
Luowei Zhou
Bin Xiao
Noel Codella
Yu Cheng
Ruochen Xu
Shih-Fu Chang
Lu Yuan
CLIP
VLM
27
47
0
26 Jul 2022
Self-Distilled Vision Transformer for Domain Generalization
M. Sultana
Muzammal Naseer
Muhammad Haris Khan
Salman Khan
Fahad Shahbaz Khan
ViT
10
29
0
25 Jul 2022
D3Former: Debiased Dual Distilled Transformer for Incremental Learning
Abdel-rahman Mohamed
Rushali Grandhe
KJ Joseph
Salman Khan
Fahad Shahbaz Khan
CLL
28
9
0
25 Jul 2022
Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer
Yingyi Chen
Xiaoke Shen
Yahui Liu
Qinghua Tao
Johan A. K. Suykens
AAML
ViT
33
22
0
25 Jul 2022
Online Continual Learning with Contrastive Vision Transformer
Zhen Wang
Liu Liu
Yajing Kong
Jiaxian Guo
Dacheng Tao
CLL
21
36
0
24 Jul 2022
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection
Yuetian Weng
Zizheng Pan
Mingfei Han
Xiaojun Chang
Bohan Zhuang
ViT
19
25
0
21 Jul 2022
Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li
Runyi Yu
Zhennan Wang
Li-ming Yuan
Guoli Song
Jie Chen
ViT
32
44
0
20 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
27
7
0
19 Jul 2022
Defect Transformer: An Efficient Hybrid Transformer Architecture for Surface Defect Detection
Junpu Wang
Guili Xu
Fuju Yan
Jinjin Wang
Zhengsheng Wang
ViT
MedIm
28
66
0
17 Jul 2022
SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection
Antonio Bărbălău
Radu Tudor Ionescu
Mariana-Iuliana Georgescu
J. Dueholm
B. Ramachandra
Kamal Nasrollahi
Fahad Shahbaz Khan
T. Moeslund
M. Shah
ViT
30
70
0
16 Jul 2022
Convolutional Bypasses Are Better Vision Transformer Adapters
Shibo Jie
Zhi-Hong Deng
VPVLM
21
131
0
14 Jul 2022
N-Grammer: Augmenting Transformers with latent n-grams
Aurko Roy
Rohan Anil
Guangda Lai
Benjamin Lee
Jeffrey Zhao
...
Yu
Phuong Dao
Christopher Fifty
Z. Chen
Yonghui Wu
21
7
0
13 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
21
35
0
13 Jul 2022
MSP-Former: Multi-Scale Projection Transformer for Single Image Desnowing
Sixiang Chen
Tian-Chun Ye
Yun-Peng Liu
Taodong Liao
Y. Ye
Erkang Chen
Peng Chen
ViT
28
51
0
12 Jul 2022
Long-term Leap Attention, Short-term Periodic Shift for Video Classification
Huatian Zhang
Lechao Cheng
Y. Hao
Chong-Wah Ngo
ViT
36
10
0
12 Jul 2022
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
Jiashi Li
Xin Xia
W. Li
Huixia Li
Xing Wang
Xuefeng Xiao
Rui Wang
Min Zheng
Xin Pan
ViT
17
149
0
12 Jul 2022
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao
Yingwei Pan
Yehao Li
Chong-Wah Ngo
Tao Mei
ViT
154
137
0
11 Jul 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
154
75
0
11 Jul 2022
Self-attention on Multi-Shifted Windows for Scene Segmentation
Litao Yu
Zhibin Li
Jian Zhang
Qiang Wu
SSeg
19
1
0
10 Jul 2022
Horizontal and Vertical Attention in Transformers
Litao Yu
Jing Zhang
ViT
17
1
0
10 Jul 2022
QKVA grid: Attention in Image Perspective and Stacked DETR
Wenyuan Sheng
ViT
MU
19
0
0
09 Jul 2022
FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis
Yongqiang Wang
Zhou Zhao
19
10
0
08 Jul 2022
MaiT: Leverage Attention Masks for More Efficient Image Transformers
Ling Li
Ali Shafiee Ardestani
Joseph Hassoun
14
1
0
06 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers
Jialun Pei
Tianyang Cheng
Deng-Ping Fan
He Tang
Chuanbo Chen
Luc Van Gool
ViT
18
55
0
05 Jul 2022
Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention
Gary Leung
Jun Gao
Fangyin Wei
Sanja Fidler
21
3
0
05 Jul 2022
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
Yongming Rao
Zuyan Liu
Wenliang Zhao
Jie Zhou
Jiwen Lu
ViT
44
36
0
04 Jul 2022
Masked World Models for Visual Control
Younggyo Seo
Danijar Hafner
Hao Liu
Fangchen Liu
Stephen James
Kimin Lee
Pieter Abbeel
OffRL
93
147
0
28 Jun 2022
BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping
Gasser Elbanna
Neil Scheidwasser
M. Kegler
P. Beckmann
Karl El Hajal
Milos Cernak
SSL
33
21
0
24 Jun 2022
Vicinity Vision Transformer
Weixuan Sun
Zhen Qin
Huiyuan Deng
Jianyuan Wang
Yi Zhang
Kaihao Zhang
Nick Barnes
Stan Birchfield
Lingpeng Kong
Yiran Zhong
ViT
42
31
0
21 Jun 2022
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
25
120
0
20 Jun 2022
Learning Multiscale Transformer Models for Sequence Generation
Bei Li
Tong Zheng
Yi Jing
Chengbo Jiao
Tong Xiao
Jingbo Zhu
32
9
0
19 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
24
25
0
17 Jun 2022
Patch-level Representation Learning for Self-supervised Vision Transformers
Sukmin Yun
Hankook Lee
Jaehyung Kim
Jinwoo Shin
ViT
27
64
0
16 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Lei Zhang
M. Keuper
Xia Hua
ViT
37
15
0
15 Jun 2022
Efficient Adaptive Ensembling for Image Classification
A. Bruno
Davide Moroni
M. Martinelli
34
18
0
15 Jun 2022
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
32
30
0
14 Jun 2022
MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Tao Mei
ViT
37
15
0
13 Jun 2022
Spatial Entropy as an Inductive Bias for Vision Transformers
E. Peruzzo
E. Sangineto
Yahui Liu
Marco De Nadai
Wei Bi
Bruno Lepri
N. Sebe
ViT
MDE
36
1
0
09 Jun 2022
MobileOne: An Improved One millisecond Mobile Backbone
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
33
154
0
08 Jun 2022
Separable Self-attention for Mobile Vision Transformers
Sachin Mehta
Mohammad Rastegari
ViT
MQ
26
252
0
06 Jun 2022
Federated Adversarial Training with Transformers
Ahmed Aldahdooh
W. Hamidouche
Olivier Déforges
FedML
ViT
25
2
0
05 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
26
347
0
02 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
23
22
0
02 Jun 2022
The Fully Convolutional Transformer for Medical Image Segmentation
Athanasios Tragakis
Chaitanya Kaul
Roderick Murray-Smith
D. Husmeier
ViT
MedIm
25
56
0
01 Jun 2022
Vision GNN: An Image is Worth Graph of Nodes
Kai Han
Yunhe Wang
Jianyuan Guo
Yehui Tang
Enhua Wu
GNN
3DH
19
352
0
01 Jun 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
64
26
0
30 May 2022
Previous
1
2
3
...
10
11
12
...
15
16
17
Next