arXiv: 2103.10697

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

19 March 2021
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
    ViT

Papers citing "ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases"

50 / 399 papers shown
Harmonizing the object recognition strategies of deep neural networks with humans
Thomas Fel
Ivan Felipe
Drew Linsley
Thomas Serre
33
71
0
08 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
26
56
0
07 Nov 2022
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Yunhao Chen
Yunjie Zhu
Zihui Yan
Yifan Huang
Zhen Ren
Jianlu Shen
Lifang Chen
20
9
0
05 Nov 2022
Boosting Binary Neural Networks via Dynamic Thresholds Learning
Jiehua Zhang
Xueyang Zhang
Z. Su
Zitong Yu
Yanghe Feng
Xin Lu
M. Pietikäinen
Li Liu
MQ
30
0
0
04 Nov 2022
Studying inductive biases in image classification task
N. Arizumi
26
1
0
31 Oct 2022
LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers
Zhuo Huang
Zhiyou Zhao
Banghuai Li
Jungong Han
3DPC
ViT
29
55
0
23 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
29
31
0
21 Oct 2022
Similarity of Neural Architectures using Adversarial Attack Transferability
Jaehui Hwang
Dongyoon Han
Byeongho Heo
Song Park
Sanghyuk Chun
Jong-Seok Lee
AAML
26
1
0
20 Oct 2022
Rethinking Bias Mitigation: Fairer Architectures Make for Fairer Face Recognition
Samuel Dooley
R. Sukthanker
John P. Dickerson
Colin White
Frank Hutter
Micah Goldblum
CVBM
24
21
0
18 Oct 2022
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
28
35
0
14 Oct 2022
When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture
Yi Mo
Dongxian Wu
Yifei Wang
Yiwen Guo
Yisen Wang
ViT
39
52
0
14 Oct 2022
Vision Transformers provably learn spatial structure
Samy Jelassi
Michael E. Sander
Yuanzhi Li
ViT
MLT
32
73
0
13 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
15
57
0
12 Oct 2022
Fast-ParC: Capturing Position Aware Global Feature for ConvNets and ViTs
Taojiannan Yang
Haokui Zhang
Wenze Hu
C. L. P. Chen
Xiaoyu Wang
ViT
11
0
0
08 Oct 2022
The Lie Derivative for Measuring Learned Equivariance
Nate Gruver
Marc Finzi
Micah Goldblum
A. Wilson
16
34
0
06 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
33
58
0
04 Oct 2022
Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling
Yunsung Lee
Gyuseong Lee
Kwang-seok Ryoo
Hyojun Go
Jihye Park
Seung Wook Kim
24
5
0
04 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
25
25
0
03 Oct 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
Pretraining the Vision Transformer using self-supervised methods for vision based Deep Reinforcement Learning
Manuel Goulão
Arlindo L. Oliveira
ViT
33
6
0
22 Sep 2022
Medical Image Segmentation using LeViT-UNet++: A Case Study on GI Tract Data
Praneeth Nemani
Satyanarayana Vollala
ViT
MedIm
19
15
0
15 Sep 2022
DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with Transformer
Dafeng Zhang
Xiaobing Wang
ViT
24
11
0
13 Sep 2022
Back-to-Bones: Rediscovering the Role of Backbones in Domain Generalization
Simone Angarano
Mauro Martini
Francesco Salvetti
Vittorio Mazzia
Marcello Chiaberge
OOD
30
12
0
02 Sep 2022
MRL: Learning to Mix with Attention and Convolutions
Shlok Mohta
Hisahiro Suganuma
Yoshiki Tanaka
22
2
0
30 Aug 2022
ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers
Yutong Xie
Jianpeng Zhang
Yong-quan Xia
A. Hengel
Qi Wu
25
6
0
28 Aug 2022
From WSI-level to Patch-level: Structure Prior Guided Binuclear Cell Fine-grained Detection
Baomin Wang
G. Hu
Dan Chen
Lihua Hu
Cheng Li
Yu An
G. Hu
Guangyu Jia
16
1
0
26 Aug 2022
gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window
Mocho Go
Hideyuki Tachibana
ViT
29
9
0
24 Aug 2022
FocusFormer: Focusing on What We Need via Architecture Sampler
Jing Liu
Jianfei Cai
Bohan Zhuang
29
7
0
23 Aug 2022
Conviformers: Convolutionally guided Vision Transformer
Mohit Vaishnav
Thomas Fel
I. F. Rodriguez
Thomas Serre
ViT
32
1
0
17 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT
MedIm
20
2
0
11 Aug 2022
Memorizing Complementation Network for Few-Shot Class-Incremental Learning
Zhong Ji
Zhi Hou
Xiyao Liu
Yanwei Pang
Xuelong Li
CLL
19
45
0
11 Aug 2022
Calibrate the inter-observer segmentation uncertainty via diagnosis-first principle
Junde Wu
Huihui Fang
Haoyi Xiong
Lixin Duan
Mingkui Tan
Weihua Yang
Huiying Liu
Yanwu Xu
MedIm
41
1
0
05 Aug 2022
MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth
Chenjie Cao
Xinlin Ren
Yanwei Fu
26
46
0
04 Aug 2022
Two-Stream Transformer Architecture for Long Video Understanding
Edward Fish
Jon Weinbren
Andrew Gilbert
ViT
25
6
0
02 Aug 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
20
252
0
28 Jul 2022
DnSwin: Toward Real-World Denoising via Continuous Wavelet Sliding-Transformer
Hao Li
Zhijing Yang
Xiaobin Hong
Ziying Zhao
Junyang Chen
Yukai Shi
Jin-shan Pan
DiffM
ViT
33
11
0
28 Jul 2022
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
Cong Wang
Hongmin Xu
Xiong Zhang
Li Wang
Zhitong Zheng
Haifeng Liu
ViT
14
20
0
27 Jul 2022
S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning
Yabin Wang
Zhiwu Huang
Xiaopeng Hong
CLL
VLM
27
208
0
26 Jul 2022
Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation
Sunghwan Hong
Seokju Cho
Jisu Nam
Stephen Lin
Seungryong Kim
ViT
19
122
0
22 Jul 2022
Multi Resolution Analysis (MRA) for Approximate Self-Attention
Zhanpeng Zeng
Sourav Pal
Jeffery Kline
G. Fung
Vikas Singh
15
6
0
21 Jul 2022
SplitMixer: Fat Trimmed From MLP-like Models
Ali Borji
Sikun Lin
21
3
0
21 Jul 2022
Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li
Runyi Yu
Zhennan Wang
Li-ming Yuan
Guoli Song
Jie Chen
ViT
24
43
0
20 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
24
7
0
19 Jul 2022
Multi-manifold Attention for Vision Transformers
D. Konstantinidis
Ilias Papastratis
K. Dimitropoulos
P. Daras
ViT
14
16
0
18 Jul 2022
Progress and limitations of deep networks to recognize objects in unusual poses
Amro Abbas
Stéphane Deny
OOD
AAML
13
17
0
16 Jul 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP
Zhicai Wang
Y. Hao
Xingyu Gao
Hao Zhang
Shuo Wang
Tingting Mu
Xiangnan He
16
8
0
15 Jul 2022
Lightweight Vision Transformer with Cross Feature Attention
Youpeng Zhao
Huadong Tang
Yingying Jiang
A. Yong
Qiang Wu
ViT
17
10
0
15 Jul 2022
Data Augmentation for Low-Resource Quechua ASR Improvement
Rodolfo Zevallos
Núria Bel
Guillermo Cámbara
Mireia Farrús
Jordi Luque
VLM
SyDa
11
6
0
14 Jul 2022
Rethinking Attention Mechanism in Time Series Classification
Bowen Zhao
Huanlai Xing
Xinhan Wang
Fuhong Song
Zhiwen Xiao
AI4TS
28
30
0
14 Jul 2022
Outpainting by Queries
Kai Yao
Penglei Gao
Xi Yang
Kaizhu Huang
Jie Sun
Rui Zhang
ViT
28
13
0
12 Jul 2022