2103.11816
Incorporating Convolution Designs into Visual Transformers
22 March 2021
Kun Yuan
Shaopeng Guo
Ziwei Liu
Aojun Zhou
F. Yu
Wei Wu
ViT
Papers citing
"Incorporating Convolution Designs into Visual Transformers"
50 / 218 papers shown
Title
A 2D Semantic-Aware Position Encoding for Vision Transformers
Xi Chen
Shiyang Zhou
Muqi Huang
Jiaxu Feng
Yun Xiong
...
Y. Zhang
Huishuai Bao
Sijia Peng
C. Li
Feng Shi
ViT
31
0
0
14 May 2025
FreCT: Frequency-augmented Convolutional Transformer for Robust Time Series Anomaly Detection
Wenxin Zhang
Ding Xu
Guangzhen Yao
Xiaojian Lin
Renxiang Guan
Chengze Du
Renda Han
Xi Xuan
Cuicui Luo
AI4TS
54
0
0
02 May 2025
ECViT: Efficient Convolutional Vision Transformer with Local-Attention and Multi-scale Stages
Zhoujie Qian
ViT
29
0
0
21 Apr 2025
Exploring the Collaborative Advantage of Low-level Information on Generalizable AI-Generated Image Detection
Ziyin Zhou
Ke Sun
Zhongxi Chen
Xianming Lin
Yunpeng Luo
Ke Yan
Shouhong Ding
Xiaoshuai Sun
31
0
0
01 Apr 2025
Iterative Optimal Attention and Local Model for Single Image Rain Streak Removal
Xiangyu Li
Wanshu Fan
Yue Shen
C. Wang
Wei-Qun Wang
X. Yang
Qiang Zhang
D. Zhou
55
0
0
20 Mar 2025
Escaping The Big Data Paradigm in Self-Supervised Representation Learning
Carlos Vélez García
Miguel Cazorla
Jorge Pomares
54
0
0
25 Feb 2025
Fully Exploiting Vision Foundation Model's Profound Prior Knowledge for Generalizable RGB-Depth Driving Scene Parsing
Sicen Guo
Tianyou Wen
Chuang-Wei Liu
Qijun Chen
Rui Fan
57
0
0
10 Feb 2025
Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration
Yuzhen Du
Teng Hu
J. Zhang
Ran Yi
Chengming Xu
Xiaobin Hu
Kai Wu
Donghao Luo
Y. Wang
Lizhuang Ma
85
1
0
05 Dec 2024
GCI-ViTAL: Gradual Confidence Improvement with Vision Transformers for Active Learning on Label Noise
Moseli Motsóehli
Kyungim Baek
34
1
0
08 Nov 2024
ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing
Zhihui Zhang
Jinhui Pang
Jianan Li
Xiaoshuai Hao
27
0
0
07 Nov 2024
Multi-task Geometric Estimation of Depth and Surface Normal from Monocular 360° Images
Kun Huang
Fang-Lue Zhang
Fangfang Zhang
Yu-Kun Lai
Paul L. Rosin
N. Dodgson
32
0
0
04 Nov 2024
UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration
Runshi Zhang
Hao Mo
Junchen Wang
Bimeng Jie
Yang He
Nenghao Jin
Liang Zhu
ViT
MedIm
30
3
0
27 Oct 2024
ED-ViT: Splitting Vision Transformer for Distributed Inference on Edge Devices
Xiang Liu
Yijun Song
Xia Li
Yifei Sun
Huiying Lan
Zemin Liu
Linshan Jiang
Jialin Li
17
1
0
15 Oct 2024
Positional Attention: Expressivity and Learnability of Algorithmic Computation
Artur Back de Luca
George Giapitzakis
Shenghao Yang
Petar Veličković
K. Fountoulakis
44
0
0
02 Oct 2024
ASSNet: Adaptive Semantic Segmentation Network for Microtumors and Multi-Organ Segmentation
Fuchen Zheng
Xinyi Chen
Xuhang Chen
Haolun Li
Xiaojiao Guo
Guoheng Huang
Chi-Man Pun
Shoujun Zhou
ViT
MedIm
27
0
0
12 Sep 2024
MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
ViT
30
1
0
05 Sep 2024
A Hybrid Transformer-Mamba Network for Single Image Deraining
Shangquan Sun
Wenqi Ren
Juxiang Zhou
Jianhou Gan
Rui Wang
Xiaochun Cao
Mamba
46
5
0
31 Aug 2024
SMAFormer: Synergistic Multi-Attention Transformer for Medical Image Segmentation
Fuchen Zheng
Xuhang Chen
Weihuang Liu
Haolun Li
Yingtie Lei
Jiahui He
Chi-Man Pun
Shoujun Zhou
MedIm
29
11
0
31 Aug 2024
DuoFormer: Leveraging Hierarchical Visual Representations by Local and Global Attention
Xiaoya Tang
Bodong Zhang
Beatrice S. Knudsen
Tolga Tasdizen
ViT
MedIm
47
1
0
18 Jul 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
42
0
0
12 Jun 2024
You Only Need Less Attention at Each Stage in Vision Transformers
Shuoxi Zhang
Hanpeng Liu
Stephen Lin
Kun He
53
5
0
01 Jun 2024
PanoNormal: Monocular Indoor 360° Surface Normal Estimation
Kun Huang
Fang-Lue Zhang
N. Dodgson
MDE
29
0
0
29 May 2024
Multi-View Attentive Contextualization for Multi-View 3D Object Detection
Xianpeng Liu
Ce Zheng
Ming Qian
Nan Xue
C. L. P. Chen
Zhebin Zhang
Chen Li
Tianfu Wu
33
2
0
20 May 2024
CSTA: CNN-based Spatiotemporal Attention for Video Summarization
Jaewon Son
Jaehun Park
Kwangsu Kim
AI4TS
ViT
37
8
0
20 May 2024
Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction
Aryan Garg
Raghav Mallampali
Akshat Joshi
Shrisudhan Govindarajan
Kaushik Mitra
31
0
0
20 May 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
ViT
36
2
0
18 May 2024
Fusing Depthwise and Pointwise Convolutions for Efficient Inference on GPUs
Fareed Qararyah
M. Azhar
Mohammad Ali Maleki
Pedro Trancoso
21
1
0
30 Apr 2024
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
Bingchen Li
Xin Li
Yiting Lu
Ruoyu Feng
Mengxi Guo
Shijie Zhao
Li Zhang
Zhibo Chen
36
13
0
26 Apr 2024
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Cansu Korkmaz
A. Murat Tekalp
ViT
44
6
0
17 Apr 2024
WiTUnet: A U-Shaped Architecture Integrating CNN and Transformer for Improved Feature Alignment and Local Information Fusion
Bin Wang
Fei Deng
Peifan Jiang
Shuang Wang
Xiao Han
Zhixuan Zhang
MedIm
23
6
0
15 Apr 2024
Structured Initialization for Attention in Vision Transformers
Jianqiao Zheng
Xueqian Li
Simon Lucey
ViT
21
1
0
01 Apr 2024
IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions
Zhijun Tu
Kunpeng Du
Hanting Chen
Hai-lin Wang
Wei Li
Jie Hu
Yunhe Wang
ViT
39
4
0
31 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
25
15
0
18 Mar 2024
Activating Wider Areas in Image Super-Resolution
Cheng Cheng
Hang Wang
Hongbin Sun
34
10
0
13 Mar 2024
LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation
Jinhong Wang
Jintai Chen
D. Z. Chen
Jian Wu
Mamba
48
20
0
12 Mar 2024
Simulation of Graph Algorithms with Looped Transformers
Artur Back de Luca
K. Fountoulakis
50
14
0
02 Feb 2024
Convolutional Initialization for Data-Efficient Vision Transformers
Jianqiao Zheng
Xueqian Li
Simon Lucey
43
2
0
23 Jan 2024
CATFace: Cross-Attribute-Guided Transformer with Self-Attention Distillation for Low-Quality Face Recognition
Niloufar Alipour Talemi
Hossein Kashiani
Nasser M. Nasrabadi
ViT
CVBM
17
4
0
05 Jan 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Alan L. Yuille
Cihang Xie
ViT
MDE
21
4
0
05 Jan 2024
A Cost-Efficient FPGA Implementation of Tiny Transformer Model using Neural ODE
Ikumi Okubo
Keisuke Sugiura
Hiroki Matsutani
30
2
0
05 Jan 2024
Class-Discriminative Attention Maps for Vision Transformers
L. Brocki
Jakub Binda
N. C. Chung
MedIm
30
3
0
04 Dec 2023
Universal Deoxidation of Semiconductor Substrates Assisted by Machine-Learning and Real-Time-Feedback-Control
Chaorong Shen
Wenkang Zhan
Jian Tang
Zhaofeng Wu
Bo Xu
Chao Zhao
Zhanguo Wang
29
0
0
04 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
33
0
0
01 Dec 2023
Bridging The Gaps Between Token Pruning and Full Pre-training via Masked Fine-tuning
Fengyuan Shi
Limin Wang
ViT
32
0
0
26 Oct 2023
Camera-LiDAR Fusion with Latent Contact for Place Recognition in Challenging Cross-Scenes
Yan Pan
Jiapeng Xie
Jiajie Wu
Bo Zhou
28
0
0
16 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
31
4
0
10 Oct 2023
Understanding Masked Autoencoders From a Local Contrastive Perspective
Xiaoyu Yue
Lei Bai
Meng Wei
Jiangmiao Pang
Xihui Liu
Luping Zhou
Wanli Ouyang
SSL
61
4
0
03 Oct 2023
RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias
Hao Cheng
Jinhao Duan
Hui Li
Lyutianyang Zhang
Jiahang Cao
Ping Wang
Jize Zhang
Kaidi Xu
Renjing Xu
AAML
32
3
0
23 Sep 2023
Trading-off Mutual Information on Feature Aggregation for Face Recognition
Mohammad Akyash
Ali Zafari
Nasser M. Nasrabadi
ViT
25
1
0
22 Sep 2023
Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism
Chengcheng Wang
Wei He
Ying Nie
Jianyuan Guo
Chuanjian Liu
Kai Han
Yunhe Wang
ObjD
27
206
0
20 Sep 2023