2103.15808
Cited By
CvT: Introducing Convolutions to Vision Transformers
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Papers citing "CvT: Introducing Convolutions to Vision Transformers"
50 / 818 papers shown
Title
Convolutional Xformers for Vision
Pranav Jeevan
Amit Sethi
ViT
55
12
0
25 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
162
360
0
24 Jan 2022
Improving Chest X-Ray Report Generation by Leveraging Warm Starting
Aaron Nicolson
Jason Dowling
Bevan Koopman
ViT
LM&MA
MedIm
30
90
0
24 Jan 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
47
238
0
12 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
42
4,983
0
10 Jan 2022
Spatio-Temporal Tuples Transformer for Skeleton-Based Action Recognition
Helei Qiu
B. Hou
Bo Ren
Xiaohua Zhang
ViT
24
47
0
08 Jan 2022
QuadTree Attention for Vision Transformers
Shitao Tang
Jiahui Zhang
Siyu Zhu
Ping Tan
ViT
169
156
0
08 Jan 2022
Lumbar Bone Mineral Density Estimation from Chest X-ray Images: Anatomy-aware Attentive Multi-ROI Modeling
Fakai Wang
K. Zheng
Le Lu
Jing Xiao
Min Wu
C. Kuo
S. Miao
6
11
0
05 Jan 2022
Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention
Haotian Yan
Chuang Zhang
Ming Wu
ViT
30
63
0
05 Jan 2022
PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture
Kai Han
Jianyuan Guo
Yehui Tang
Yunhe Wang
ViT
34
22
0
04 Jan 2022
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
33
456
0
03 Jan 2022
HPRN: Holistic Prior-embedded Relation Network for Spectral Super-Resolution
Chaoxiong Wu
Jiaojiao Li
Rui Song
Yunsong Li
Qian Du
30
15
0
29 Dec 2021
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Sitong Wu
Tianyi Wu
Haoru Tan
G. Guo
ViT
31
70
0
28 Dec 2021
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning
Zhenglun Kong
Peiyan Dong
Xiaolong Ma
Xin Meng
Mengshu Sun
...
Geng Yuan
Bin Ren
Minghai Qin
H. Tang
Yanzhi Wang
ViT
34
144
0
27 Dec 2021
Vision Transformer for Small-Size Datasets
Seung Hoon Lee
Seunghyun Lee
B. Song
ViT
22
222
0
27 Dec 2021
Learned Queries for Efficient Local Attention
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
41
29
0
21 Dec 2021
MPViT: Multi-Path Vision Transformer for Dense Prediction
Youngwan Lee
Jonghee Kim
Jeffrey Willette
Sung Ju Hwang
ViT
29
244
0
21 Dec 2021
Lite Vision Transformer with Enhanced Self-Attention
Chenglin Yang
Yilin Wang
Jianming Zhang
He Zhang
Zijun Wei
Zhe Lin
Alan Yuille
ViT
21
112
0
20 Dec 2021
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Bowen Zhang
Shuyang Gu
Bo Zhang
Jianmin Bao
Dong Chen
Fang Wen
Yong Wang
B. Guo
ViT
38
223
0
20 Dec 2021
Towards End-to-End Image Compression and Analysis with Transformers
Yuanchao Bai
Xu Yang
Xianming Liu
Junjun Jiang
Yaowei Wang
Xiangyang Ji
Wen Gao
ViT
31
51
0
17 Dec 2021
Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan
Xihao Wang
Xian Wei
ViT
31
3
0
10 Dec 2021
Locally Shifted Attention With Early Global Integration
Shelly Sheynin
Sagie Benaim
Adam Polyak
Lior Wolf
ViT
11
0
0
09 Dec 2021
3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis
Jianhui Yu
Chaoyi Zhang
Heng Wang
Dingxin Zhang
Yang Song
Tiange Xiang
Dongnan Liu
Weidong (Tom) Cai
ViT
MedIm
21
32
0
09 Dec 2021
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
Rui Dai
Srijan Das
Kumara Kahatapitiya
Michael S. Ryoo
F. Brémond
ViT
42
73
0
07 Dec 2021
Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning
DeepMind Interactive Agents Team
Josh Abramson
Arun Ahuja
Arthur Brussee
Federico Carnevale
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
LM&Ro
40
46
0
07 Dec 2021
Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training
Haofei Zhang
Jiarui Duan
Mengqi Xue
Mingli Song
Li Sun
Xiuming Zhang
ViT
AI4CE
30
16
0
07 Dec 2021
GETAM: Gradient-weighted Element-wise Transformer Attention Map for Weakly-supervised Semantic segmentation
Weixuan Sun
Jing Zhang
Zheyuan Liu
Yiran Zhong
Nick Barnes
ViT
63
14
0
06 Dec 2021
Dynamic Token Normalization Improves Vision Transformers
Wenqi Shao
Yixiao Ge
Zhaoyang Zhang
Xuyuan Xu
Xiaogang Wang
Ying Shan
Ping Luo
ViT
121
11
0
05 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
39
203
0
02 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
75
678
0
02 Dec 2021
Vision Pair Learning: An Efficient Training Framework for Image Classification
Bei Tong
Xiaoyuan Yu
ViT
20
0
0
02 Dec 2021
AdaViT: Adaptive Vision Transformers for Efficient Image Recognition
Lingchen Meng
Hengduo Li
Bor-Chun Chen
Shiyi Lan
Zuxuan Wu
Yu-Gang Jiang
Ser-Nam Lim
ViT
28
221
0
30 Nov 2021
Adaptive Token Sampling For Efficient Vision Transformers
Mohsen Fayyaz
Soroush Abbasi Koohpayegani
F. Jafari
Sunando Sengupta
Hamid Reza Vaezi Joze
Eric Sommerlade
Hamed Pirsiavash
Juergen Gall
ViT
16
146
0
30 Nov 2021
TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions
Jeya Maria Jose Valanarasu
R. Yasarla
Vishal M. Patel
ViT
54
276
0
29 Nov 2021
On the Integration of Self-Attention and Convolution
Xuran Pan
Chunjiang Ge
Rui Lu
S. Song
Guanfu Chen
Zeyi Huang
Gao Huang
SSL
41
288
0
29 Nov 2021
SWAT: Spatial Structure Within and Among Tokens
Kumara Kahatapitiya
Michael S. Ryoo
25
6
0
26 Nov 2021
NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition
Hao Liu
Xinghua Jiang
Xin Li
Zhimin Bao
Deqiang Jiang
Bo Ren
ViT
32
16
0
25 Nov 2021
Self-slimmed Vision Transformer
Zhuofan Zong
Kunchang Li
Guanglu Song
Yali Wang
Yu Qiao
B. Leng
Yu Liu
ViT
21
30
0
24 Nov 2021
Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Moritz Ibing
Gregor Kobsik
Leif Kobbelt
33
37
0
24 Nov 2021
Florence: A New Foundation Model for Computer Vision
Lu Yuan
Dongdong Chen
Yi-Ling Chen
Noel Codella
Xiyang Dai
...
Zhen Xiao
Jianwei Yang
Michael Zeng
Luowei Zhou
Pengchuan Zhang
VLM
31
879
0
22 Nov 2021
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
37
874
0
22 Nov 2021
Semi-Supervised Vision Transformers
Zejia Weng
Xitong Yang
Ang Li
Zuxuan Wu
Yu-Gang Jiang
ViT
17
40
0
22 Nov 2021
CpT: Convolutional Point Transformer for 3D Point Cloud Processing
Chaitanya Kaul
Joshua Mitton
H. Dai
Roderick Murray-Smith
3DPC
27
6
0
21 Nov 2021
Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
Jaesin Ahn
Jiuk Hong
Jeongwoo Ju
Heechul Jung
ViT
32
3
0
19 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
24
34
0
16 Nov 2021
Attention Mechanisms in Computer Vision: A Survey
Meng-Hao Guo
Tianhan Xu
Jiangjiang Liu
Zheng-Ning Liu
Peng-Tao Jiang
Tai-Jiang Mu
Song-Hai Zhang
Ralph Robert Martin
Ming-Ming Cheng
Shimin Hu
19
1,636
0
15 Nov 2021
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
27
3
0
15 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
77
330
0
11 Nov 2021
Sliced Recursive Transformer
Zhiqiang Shen
Zechun Liu
Eric P. Xing
ViT
22
27
0
09 Nov 2021
Convolutional Gated MLP: Combining Convolutions & gMLP
A. Rajagopal
V. Nirmala
31
14
0
06 Nov 2021