Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.06717
Cited By
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
13 March 2022
Xiaohan Ding
Xinming Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian Sun
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs"
50 / 115 papers shown
Title
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
198
1
0
03 Apr 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Mingg-Ming Cheng
Ming-Ming Cheng
ObjD
243
52
0
21 Feb 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
122
3
0
14 Jan 2025
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
78
37
0
30 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
138
3
0
08 Oct 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
85
28
0
01 Jun 2023
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
250
405
0
24 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
100
5,102
0
10 Jan 2022
RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
Xiaohan Ding
Honghao Chen
Xinming Zhang
Jungong Han
Guiguang Ding
40
71
0
21 Dec 2021
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
146
896
0
22 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
191
1,783
0
18 Nov 2021
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
David W. Romero
Robert-Jan Bruintjes
Jakub M. Tomczak
Erik J. Bekkers
Mark Hoogendoorn
Jan van Gemert
93
83
0
15 Oct 2021
Dead Pixel Test Using Effective Receptive Field
Bum Jun Kim
Hyeyeon Choi
Hyeonah Jang
Dong Gu Lee
Wonseok Jeong
Sang Woo Kim
47
26
0
31 Aug 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Yucheng Zhao
Guangting Wang
Chuanxin Tang
Chong Luo
Wenjun Zeng
Zhengjun Zha
57
69
0
30 Aug 2021
Multi-Task Self-Training for Learning General Representations
Golnaz Ghiasi
Barret Zoph
E. D. Cubuk
Quoc V. Le
Nayeon Lee
SSL
51
100
0
25 Aug 2021
SwinIR: Image Restoration Using Swin Transformer
Christos Sakaridis
Jie Cao
Guolei Sun
Peng Sun
Luc Van Gool
Radu Timofte
ViT
173
2,862
0
23 Aug 2021
Do Vision Transformers See Like Convolutional Neural Networks?
M. Raghu
Thomas Unterthiner
Simon Kornblith
Chiyuan Zhang
Alexey Dosovitskiy
ViT
104
938
0
19 Aug 2021
Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs
Yikang Zhang
Zhuo Chen
Zhaobai Zhong
MoE
52
8
0
08 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
127
969
0
01 Jul 2021
Global Filter Networks for Image Classification
Yongming Rao
Wenliang Zhao
Zheng Zhu
Jiwen Lu
Jie Zhou
ViT
56
458
0
01 Jul 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
90
322
0
24 Jun 2021
Dynamic Head: Unifying Object Detection Heads with Attentions
Xiyang Dai
Yinpeng Chen
Bin Xiao
Dongdong Chen
Mengchen Liu
Lu Yuan
Lei Zhang
49
566
0
15 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
195
2,790
0
15 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
193
4,934
0
31 May 2021
Vision Transformers are Robust Learners
Sayak Paul
Pin-Yu Chen
ViT
57
310
0
17 May 2021
Are Convolutional Neural Networks or Transformers more like human vision?
Shikhar Tuli
Ishita Dasgupta
Erin Grant
Thomas Griffiths
ViT
FaML
46
185
0
15 May 2021
Self-Supervised Learning with Swin Transformers
Zhenda Xie
Yutong Lin
Zhuliang Yao
Zheng Zhang
Qi Dai
Yue Cao
Han Hu
ViT
60
179
0
10 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
73
657
0
07 May 2021
BasisNet: Two-stage Model Synthesis for Efficient Inference
Ruotong Wang
Chun-Te Chu
A. Zhmoginov
Andrew G. Howard
Brendan Jou
Yukun Zhu
Li Zhang
R. Hwa
Adriana Kovashka
3DH
41
7
0
07 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
394
2,638
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
595
5,920
0
29 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
135
1,837
0
05 Apr 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
126
1,891
0
29 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
371
21,175
0
25 Mar 2021
Diverse Branch Block: Building a Convolution as an Inception-like Unit
Xiaohan Ding
Xinming Zhang
Jungong Han
Guiguang Ding
AI4CE
54
282
0
24 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
122
1,696
0
24 Mar 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
79
398
0
23 Mar 2021
Fast and Accurate Model Scaling
Piotr Dollár
Mannat Singh
Ross B. Girshick
51
98
0
11 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
Yihe Dong
Jean-Baptiste Cordonnier
Andreas Loukas
85
376
0
05 Mar 2021
How to represent part-whole hierarchies in a neural network
Geoffrey E. Hinton
OCL
MoE
65
200
0
25 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
469
3,678
0
24 Feb 2021
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu
Zhi Tian
Bo Zhang
Xinlong Wang
Chunhua Shen
ViT
64
610
0
22 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
407
3,778
0
11 Feb 2021
CKConv: Continuous Kernel Convolution For Sequential Data
David W. Romero
Anna Kuzina
Erik J. Bekkers
Jakub M. Tomczak
Mark Hoogendoorn
47
125
0
04 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
334
986
0
27 Jan 2021
RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding
Xinming Zhang
Ningning Ma
Jungong Han
Guiguang Ding
Jian Sun
239
1,574
0
11 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
156
2,872
0
31 Dec 2020
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
333
6,657
0
23 Dec 2020
Pre-Trained Image Processing Transformer
Hanting Chen
Yunhe Wang
Tianyu Guo
Chang Xu
Yiping Deng
Zhenhua Liu
Siwei Ma
Chunjing Xu
Chao Xu
Wen Gao
VLM
ViT
125
1,659
0
01 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
464
40,217
0
22 Oct 2020
1
2
3
Next