CvT: Introducing Convolutions to Vision Transformers

29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 818 papers shown
TCFormer: Visual Recognition via Token Clustering Transformer
Wang Zeng
Sheng Jin
Lumin Xu
Wentao Liu
Chao Qian
Wanli Ouyang
Ping Luo
Xiaogang Wang
33
3
0
16 Jul 2024
Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction
Yumin Kim
Gayoon Choi
Seong Jae Hwang
39
0
0
10 Jul 2024
HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation
Guoan Xu
Wenjing Jia
Tao Wu
Ligeng Chen
Guangwei Gao
ViT
38
9
0
10 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
46
4
0
10 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
41
4
0
10 Jul 2024
CBM: Curriculum by Masking
Andrei Jarca
Florinel-Alin Croitoru
Radu Tudor Ionescu
35
0
0
06 Jul 2024
Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies
Ivan Drokin
53
19
0
01 Jul 2024
Query-Efficient Hard-Label Black-Box Attack against Vision Transformers
Chao Zhou
Xiaowen Shi
Yuan-Gen Wang
ViT
AAML
29
0
0
29 Jun 2024
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads
Ali Khaleghi Rahimian
Manish Kumar Govind
Subhajit Maity
Dominick Reilly
Christian Kummerle
Srijan Das
A. Dutta
43
1
0
27 Jun 2024
Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes
Qi Ma
Danda Pani Paudel
E. Konukoglu
Luc Van Gool
40
6
0
25 Jun 2024
A Primal-Dual Framework for Transformers and Neural Networks
Tan M. Nguyen
Tam Nguyen
Nhat Ho
Andrea L. Bertozzi
Richard G. Baraniuk
Stanley J. Osher
ViT
29
13
0
19 Jun 2024
Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis
Bowen Zhang
Ying Chen
Long Bai
Yan Zhao
Yuxiang Sun
Yixuan Yuan
Jianhua Zhang
Hongliang Ren
40
4
0
15 Jun 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
47
0
0
12 Jun 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
59
3
0
12 Jun 2024
You Only Need Less Attention at Each Stage in Vision Transformers
Shuoxi Zhang
Hanpeng Liu
Stephen Lin
Kun He
53
5
0
01 Jun 2024
Automatic Channel Pruning for Multi-Head Attention
Eunho Lee
Youngbae Hwang
ViT
40
1
0
31 May 2024
Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform
Viviane Potocnik
Luca Colagrande
Tim Fischer
L. Bertaccini
Daniele Jahier Pagliari
Alessio Burrello
Luca Benini
23
3
0
29 May 2024
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao
Xinggang Wang
Lianghui Zhu
Qian Zhang
Chang Huang
57
4
0
28 May 2024
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser
Xianfu Cheng
Hang Zhang
Jian Yang
Xiang Li
Weixiao Zhou
...
Fei Liu
Wei Zhang
Tao Sun
Tongliang Li
Zhoujun Li
52
2
0
27 May 2024
ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking
Xudong Han
Nobuyuki Oishi
Yueying Tian
Elif Ucurum
R. Young
C. Chatwin
Philip Birch
40
3
0
24 May 2024
YOLOv10: Real-Time End-to-End Object Detection
Ao Wang
Hui Chen
Lihao Liu
Kai Chen
Zijia Lin
Jungong Han
Guiguang Ding
3DH
43
916
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
42
0
23 May 2024
CSTA: CNN-based Spatiotemporal Attention for Video Summarization
Jaewon Son
Jaehun Park
Kwangsu Kim
AI4TS
ViT
39
8
0
20 May 2024
Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction
Aryan Garg
Raghav Mallampali
Akshat Joshi
Shrisudhan Govindarajan
Kaushik Mitra
39
0
0
20 May 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
ViT
44
2
0
18 May 2024
All in One Framework for Multimodal Re-identification in the Wild
He Li
Mang Ye
Ming Zhang
Bo Du
35
9
0
08 May 2024
Examining Changes in Internal Representations of Continual Learning Models Through Tensor Decomposition
Nishant Suresh Aswani
Amira Guesmi
Muhammad Abdullah Hanif
Muhammad Shafique
CLL
30
1
0
06 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
37
3
0
02 May 2024
Fusing Depthwise and Pointwise Convolutions for Efficient Inference on GPUs
Fareed Qararyah
M. Azhar
Mohammad Ali Maleki
Pedro Trancoso
29
1
0
30 Apr 2024
ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal
Zhuohao Li
Guoyang Xie
Guannan Jiang
Zhichao Lu
36
3
0
29 Apr 2024
GLIMS: Attention-Guided Lightweight Multi-Scale Hybrid Network for Volumetric Semantic Segmentation
Z. A. Yazici
Ilkay Oksuz
H. K. Ekenel
MedIm
38
7
0
27 Apr 2024
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
Bingchen Li
Xin Li
Yiting Lu
Ruoyu Feng
Mengxi Guo
Shijie Zhao
Li Zhang
Zhibo Chen
39
13
0
26 Apr 2024
MathNet: A Data-Centric Approach for Printed Mathematical Expression Recognition
Felix M. Schmitt-Koopmann
Elaine M. Huang
Hans-Peter Hutter
Thilo Stadelmann
Alireza Darvishy
32
4
0
21 Apr 2024
Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing
Yuang Liu
Zhiheng Qiu
Xiaokai Qin
ViT
31
0
0
20 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
47
1
0
18 Apr 2024
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Cansu Korkmaz
A. Murat Tekalp
ViT
44
6
0
17 Apr 2024
Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
Diana-Nicoleta Grigore
Mariana-Iuliana Georgescu
J. A. Justo
T. Johansen
Andreea-Iuliana Ionescu
Radu Tudor Ionescu
36
0
0
14 Apr 2024
TSLANet: Rethinking Transformers for Time Series Representation Learning
Emadeldeen Eldele
Mohamed Ragab
Zhenghua Chen
Min-man Wu
Xiaoli Li
AI4TS
AIFin
36
37
0
12 Apr 2024
Robust feature knowledge distillation for enhanced performance of lightweight crack segmentation models
Zhaohui Chen
Elyas Asadi Shamsabadi
Sheng Jiang
Luming Shen
Daniel Dias-da-Costa
29
2
0
09 Apr 2024
Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures
Ching-Kai Lin
Di-Chun Wei
Yun-Chien Cheng
37
0
0
09 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
58
48
0
08 Apr 2024
HSViT: Horizontally Scalable Vision Transformer
Chenhao Xu
Chang-Tsun Li
Chee Peng Lim
Douglas Creighton
ViT
34
2
0
08 Apr 2024
GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets
Dongjing Shan
Guiqiang Chen
ViT
45
0
0
07 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
40
7
0
05 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan L. Yuille
Liang-Chieh Chen
3DV
VLM
36
24
0
02 Apr 2024
Structured Initialization for Attention in Vision Transformers
Jianqiao Zheng
Xueqian Li
Simon Lucey
ViT
26
1
0
01 Apr 2024
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping
Hyeongjun Kwon
Jinhyun Jang
Jin-Hwa Kim
Kwonyoung Kim
Kwanghoon Sohn
43
1
0
01 Apr 2024
IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions
Zhijun Tu
Kunpeng Du
Hanting Chen
Hai-lin Wang
Wei Li
Jie Hu
Yunhe Wang
ViT
44
4
0
31 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
51
7
0
28 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
43
2
0
26 Mar 2024