ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.15808
  4. Cited By
CvT: Introducing Convolutions to Vision Transformers

CvT: Introducing Convolutions to Vision Transformers

29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT
ArXivPDFHTML

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 818 papers shown
Title
Blending Anti-Aliasing into Vision Transformer
Blending Anti-Aliasing into Vision Transformer
Shengju Qian
Hao Shao
Yi Zhu
Mu Li
Jiaya Jia
26
20
0
28 Oct 2021
MVT: Multi-view Vision Transformer for 3D Object Recognition
MVT: Multi-view Vision Transformer for 3D Object Recognition
Shuo Chen
Tan Yu
Ping Li
ViT
37
43
0
25 Oct 2021
CvT-ASSD: Convolutional vision-Transformer Based Attentive Single Shot
  MultiBox Detector
CvT-ASSD: Convolutional vision-Transformer Based Attentive Single Shot MultiBox Detector
Weiqiang Jin
Hang Yu
Xiangfeng Luo
ViT
19
14
0
24 Oct 2021
HRFormer: High-Resolution Transformer for Dense Prediction
HRFormer: High-Resolution Transformer for Dense Prediction
Yuhui Yuan
Rao Fu
Lang Huang
Weihong Lin
Chao Zhang
Xilin Chen
Jingdong Wang
ViT
38
227
0
18 Oct 2021
CAE-Transformer: Transformer-based Model to Predict Invasiveness of Lung
  Adenocarcinoma Subsolid Nodules from Non-thin Section 3D CT Scans
CAE-Transformer: Transformer-based Model to Predict Invasiveness of Lung Adenocarcinoma Subsolid Nodules from Non-thin Section 3D CT Scans
Shahin Heidarian
Parnian Afshar
A. Oikonomou
Konstantinos N. Plataniotis
Arash Mohammadi
ViT
MedIm
18
3
0
17 Oct 2021
CyTran: A Cycle-Consistent Transformer with Multi-Level Consistency for
  Non-Contrast to Contrast CT Translation
CyTran: A Cycle-Consistent Transformer with Multi-Level Consistency for Non-Contrast to Contrast CT Translation
Nicolae-Cătălin Ristea
A. Miron
O. Savencu
Mariana-Iuliana Georgescu
N. Verga
Fahad Shahbaz Khan
Radu Tudor Ionescu
ViT
MedIm
43
20
0
12 Oct 2021
Global Vision Transformer Pruning with Hessian-Aware Saliency
Global Vision Transformer Pruning with Hessian-Aware Saliency
Huanrui Yang
Hongxu Yin
Maying Shen
Pavlo Molchanov
Hai Helen Li
Jan Kautz
ViT
30
39
0
10 Oct 2021
Adversarial Token Attacks on Vision Transformers
Adversarial Token Attacks on Vision Transformers
Ameya Joshi
Gauri Jagatap
C. Hegde
ViT
30
19
0
08 Oct 2021
PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex
  Convolutions
PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions
Eleonora Grassucci
Aston Zhang
Danilo Comminiello
28
38
0
08 Oct 2021
UniNet: Unified Architecture Search with Convolution, Transformer, and
  MLP
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu
Hongsheng Li
Guanglu Song
Xin Huang
Yu Liu
ViT
37
35
0
08 Oct 2021
SERAB: A multi-lingual benchmark for speech emotion recognition
SERAB: A multi-lingual benchmark for speech emotion recognition
Neil Scheidwasser
M. Kegler
P. Beckmann
Milos Cernak
32
44
0
07 Oct 2021
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to
  CNNs
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs
Philipp Benz
Soomin Ham
Chaoning Zhang
Adil Karjauv
In So Kweon
AAML
ViT
47
78
0
06 Oct 2021
3rd Place Solution to Google Landmark Recognition Competition 2021
3rd Place Solution to Google Landmark Recognition Competition 2021
Chengfeng Xu
Weimin Wang
Shuai Liu
Yong Wang
Yuxiang Tang
Tianling Bian
Yanyu Yan
Qi She
Cheng Yang
3DPC
3DV
33
6
0
06 Oct 2021
Ripple Attention for Visual Perception with Sub-quadratic Complexity
Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng
Huijie Pan
Lingpeng Kong
28
3
0
06 Oct 2021
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision
  Transformer
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
218
1,213
0
05 Oct 2021
UFO-ViT: High Performance Linear Vision Transformer without Softmax
UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
ViT
114
20
0
29 Sep 2021
Fine-tuning Vision Transformers for the Prediction of State Variables in
  Ising Models
Fine-tuning Vision Transformers for the Prediction of State Variables in Ising Models
Onur Kara
Arijit Sehanobish
H. Corzo
20
4
0
28 Sep 2021
BiTr-Unet: a CNN-Transformer Combined Network for MRI Brain Tumor
  Segmentation
BiTr-Unet: a CNN-Transformer Combined Network for MRI Brain Tumor Segmentation
Qiran Jia
Hai Shu
ViT
MedIm
98
69
0
25 Sep 2021
Audiomer: A Convolutional Transformer For Keyword Spotting
Surya Kant Sahu
Sai Mitheran
Juhi Kamdar
Meet Gandhi
40
8
0
21 Sep 2021
SDTP: Semantic-aware Decoupled Transformer Pyramid for Dense Image
  Prediction
SDTP: Semantic-aware Decoupled Transformer Pyramid for Dense Image Prediction
Zekun Li
Yufan Liu
Bing Li
Weiming Hu
Kebin Wu
Chengwei Peng
ViT
32
22
0
18 Sep 2021
Primer: Searching for Efficient Transformers for Language Modeling
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mañke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
91
152
0
17 Sep 2021
Complementary Feature Enhanced Network with Vision Transformer for Image
  Dehazing
Complementary Feature Enhanced Network with Vision Transformer for Image Dehazing
Dong Zhao
Jia Li
Hongyu Li
Longhao Xu
ViT
21
16
0
15 Sep 2021
LibFewShot: A Comprehensive Library for Few-shot Learning
LibFewShot: A Comprehensive Library for Few-shot Learning
Wenbin Li
Ziyi
Ziyi Wang
Xuesong Yang
C. Dong
...
Jing Huo
Yinghuan Shi
Lei Wang
Yang Gao
Jiebo Luo
VLM
113
66
0
10 Sep 2021
Towards Transferable Adversarial Attacks on Vision Transformers
Towards Transferable Adversarial Attacks on Vision Transformers
Zhipeng Wei
Jingjing Chen
Micah Goldblum
Zuxuan Wu
Tom Goldstein
Yu-Gang Jiang
ViT
AAML
24
111
0
09 Sep 2021
Scaled ReLU Matters for Training Vision Transformers
Scaled ReLU Matters for Training Vision Transformers
Pichao Wang
Xue Wang
Haowen Luo
Jingkai Zhou
Zhipeng Zhou
Fan Wang
Hao Li
R. L. Jin
19
41
0
08 Sep 2021
Searching for Efficient Multi-Stage Vision Transformers
Searching for Efficient Multi-Stage Vision Transformers
Yi-Lun Liao
S. Karaman
Vivienne Sze
ViT
16
19
0
01 Sep 2021
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Jianyuan Guo
Yehui Tang
Kai Han
Xinghao Chen
Han Wu
Chao Xu
Chang Xu
Yunhe Wang
46
105
0
30 Aug 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer,
  and MLP
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Yucheng Zhao
Guangting Wang
Chuanxin Tang
Chong Luo
Wenjun Zeng
Zhengjun Zha
35
69
0
30 Aug 2021
Reiterative Domain Aware Multi-Target Adaptation
Reiterative Domain Aware Multi-Target Adaptation
Sudipan Saha
Shan Zhao
Nasrullah Sheikh
Xiao Xiang Zhu
24
1
0
26 Aug 2021
Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Xuefan Zha
Wentao Zhu
Tingxun Lv
Sen Yang
Ji Liu
AI4TS
ViT
33
27
0
26 Aug 2021
Transformers predicting the future. Applying attention in next-frame and
  time series forecasting
Transformers predicting the future. Applying attention in next-frame and time series forecasting
Radostin Cholakov
T. Kolev
AI4TS
22
16
0
18 Aug 2021
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers
B. Dong
Wenhai Wang
Deng-Ping Fan
Jinpeng Li
Huazhu Fu
Ling Shao
ViT
MedIm
31
314
0
16 Aug 2021
Mobile-Former: Bridging MobileNet and Transformer
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
183
476
0
12 Aug 2021
ICAF: Iterative Contrastive Alignment Framework for Multimodal
  Abstractive Summarization
ICAF: Iterative Contrastive Alignment Framework for Multimodal Abstractive Summarization
Zijian Zhang
Chang Shu
Youxin Chen
Jing Xiao
Qian Zhang
Lu Zheng
23
5
0
11 Aug 2021
TriTransNet: RGB-D Salient Object Detection with a Triplet Transformer
  Embedding Network
TriTransNet: RGB-D Salient Object Detection with a Triplet Transformer Embedding Network
Zhengyi Liu
Yuan Wang
Zhengzheng Tu
Yun Xiao
Bin Tang
ViT
32
142
0
09 Aug 2021
Armour: Generalizable Compact Self-Attention for Vision Transformers
Armour: Generalizable Compact Self-Attention for Vision Transformers
Lingchuan Meng
ViT
21
3
0
03 Aug 2021
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Yifan Xu
Zhijie Zhang
Mengdan Zhang
Kekai Sheng
Ke Li
Weiming Dong
Liqing Zhang
Changsheng Xu
Xing Sun
ViT
32
201
0
03 Aug 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale
  Attention
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
32
258
0
31 Jul 2021
Query2Label: A Simple Transformer Way to Multi-Label Classification
Query2Label: A Simple Transformer Way to Multi-Label Classification
Shilong Liu
Lei Zhang
Xiao Yang
Hang Su
Jun Zhu
24
187
0
22 Jul 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
CycleMLP: A MLP-like Architecture for Dense Prediction
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
33
231
0
21 Jul 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao
Suvinay Subramanian
Gaurav Agrawal
Amir Yazdanbakhsh
T. Krishna
38
57
0
13 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Philip Torr
50
27
0
13 Jul 2021
Locally Enhanced Self-Attention: Combining Self-Attention and
  Convolution as Local and Context Terms
Locally Enhanced Self-Attention: Combining Self-Attention and Convolution as Local and Context Terms
Chenglin Yang
Siyuan Qiao
Adam Kortylewski
Alan Yuille
25
4
0
12 Jul 2021
Local-to-Global Self-Attention in Vision Transformers
Local-to-Global Self-Attention in Vision Transformers
Jinpeng Li
Yichao Yan
Tianran Ouyang
Xiaokang Yang
Ling Shao
ViT
25
29
0
10 Jul 2021
ViTGAN: Training GANs with Vision Transformers
ViTGAN: Training GANs with Vision Transformers
Kwonjoon Lee
Huiwen Chang
Lu Jiang
Han Zhang
Z. Tu
Ce Liu
ViT
28
183
0
09 Jul 2021
Vision Xformers: Efficient Attention for Image Classification
Vision Xformers: Efficient Attention for Image Classification
Pranav Jeevan
Amit Sethi
ViT
25
13
0
05 Jul 2021
Long-Short Transformer: Efficient Transformers for Language and Vision
Long-Short Transformer: Efficient Transformers for Language and Vision
Chen Zhu
Ming-Yu Liu
Chaowei Xiao
M. Shoeybi
Tom Goldstein
Anima Anandkumar
Bryan Catanzaro
ViT
VLM
32
131
0
05 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with
  Cross-Shaped Windows
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
25
956
0
01 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
36
259
0
01 Jul 2021
Global Filter Networks for Image Classification
Global Filter Networks for Image Classification
Yongming Rao
Wenliang Zhao
Zheng Zhu
Jiwen Lu
Jie Zhou
ViT
28
450
0
01 Jul 2021
Previous
123...14151617
Next