ResearchTrend.AI
arXiv: 2210.13452
MetaFormer Baselines for Vision

24 October 2022
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
    MoE

Papers citing "MetaFormer Baselines for Vision"

38 / 88 papers shown
Title
ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding
Novendra Setyawan
Ghufron Wahyu Kurniawan
Chi-Chia Sun
Jun-Wei Hsieh
Hui-Kai Su
W. Kuo
ViT
MoE
39
0
0
22 Mar 2024
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements
Yu Liu
Wenlin Zhang
Shaochu Wang
Fangyu Zuo
Peiguang Jing
Yong Ji
27
0
0
15 Mar 2024
The NeRFect Match: Exploring NeRF Features for Visual Localization
Qunjie Zhou
Maxim Maximov
Or Litany
Laura Leal-Taixé
41
15
0
14 Mar 2024
HyenaPixel: Global Image Context with Convolutions
Julian Spravil
Sebastian Houben
Sven Behnke
31
1
0
29 Feb 2024
Windowed-FourierMixer: Enhancing Clutter-Free Room Modeling with Fourier Transform
Bruno Henriques
Benjamin Allaert
Jean-Philippe Vandeborre
3DV
29
0
0
28 Feb 2024
Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips
Man Yao
Jiakui Hu
Tianxiang Hu
Yifan Xu
Zhaokun Zhou
Yonghong Tian
Boxing Xu
Guoqi Li
34
56
0
15 Feb 2024
Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly
Qizhang Li
Yiwen Guo
Wangmeng Zuo
Hao Chen
ELM
AAML
33
2
0
02 Nov 2023
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng
J. Zico Kolter
VLM
56
12
0
28 Oct 2023
Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition
Sara Pieri
Jose Renato Restom
Samuel Horvath
Hisham Cholakkal
FedML
19
8
0
23 Oct 2023
Audio classification with Dilated Convolution with Learnable Spacings
Ismail Khalfaoui-Hassani
T. Masquelier
Thomas Pellegrini
20
1
0
25 Sep 2023
Priority-Centric Human Motion Generation in Discrete Latent Space
Hanyang Kong
Kehong Gong
Dongze Lian
Michael Bi Mi
Xinchao Wang
DiffM
20
52
0
28 Aug 2023
SG-Former: Self-guided Transformer with Evolving Token Reallocation
Sucheng Ren
Xingyi Yang
Songhua Liu
Xinchao Wang
ViT
27
41
0
23 Aug 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
34
177
0
18 Jul 2023
Feature Mixing for Writer Retrieval and Identification on Papyri Fragments
Marco Peer
Robert Sablatnig
15
4
0
22 Jun 2023
Adapting a ConvNeXt model to audio classification on AudioSet
Thomas Pellegrini
Ismail Khalfaoui-Hassani
Etienne Labbé
T. Masquelier
6
21
0
01 Jun 2023
Dilated Convolution with Learnable Spacings: beyond bilinear interpolation
Ismail Khalfaoui-Hassani
Thomas Pellegrini
T. Masquelier
16
3
0
01 Jun 2023
DiffRate : Differentiable Compression Rate for Efficient Vision Transformers
Mengzhao Chen
Wenqi Shao
Peng Xu
Mingbao Lin
Kaipeng Zhang
Fei Chao
Rongrong Ji
Yu Qiao
Ping Luo
ViT
44
43
0
29 May 2023
Meta-Polyp: a baseline for efficient Polyp segmentation
Quoc-Huy Trinh
MedIm
21
18
0
13 May 2023
iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer
Toshihiro Ota
Masato Taki
29
2
0
25 Apr 2023
RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer
Jiahao Wang
Songyang Zhang
Yong Liu
Taiqiang Wu
Yujiu Yang
Xihui Liu
Kai-xiang Chen
Ping Luo
Dahua Lin
32
20
0
12 Apr 2023
Towards an Effective and Efficient Transformer for Rain-by-snow Weather Removal
Tao Gao
Yuanbo Wen
Kaihao Zhang
Peng Cheng
Ting Chen
ViT
33
5
0
06 Apr 2023
TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration
Kehong Gong
Dongze Lian
Heng Chang
Chuan Guo
Zihang Jiang
Wei Ji
Michael Bi Mi
Xinchao Wang
11
61
0
05 Apr 2023
Dual Cross-Attention for Medical Image Segmentation
Gorkem Can Ates
P. Mohan
Emrah Çelik
9
74
0
30 Mar 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
48
118
0
29 Mar 2023
FFT-based Dynamic Token Mixer for Vision
Yuki Tatsunami
Masato Taki
45
19
0
07 Mar 2023
CECT: Controllable Ensemble CNN and Transformer for COVID-19 Image Classification
Zhao Liu
Leizhao Shen
ViT
29
7
0
05 Feb 2023
Does progress on ImageNet transfer to real-world datasets?
Alex Fang
Simon Kornblith
Ludwig Schmidt
VLM
21
34
0
11 Jan 2023
Demystify Transformers & Convolutions in Modern Image Deep Networks
Jifeng Dai
Min Shi
Weiyun Wang
Sitong Wu
Linjie Xing
...
Lewei Lu
Jie Zhou
Xiaogang Wang
Yu Qiao
Xiao-hua Hu
ViT
26
11
0
10 Nov 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
141
75
0
11 Jul 2022
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou
Yichen Zhou
Chenyang Si
Weihao Yu
Teck Khim Ng
Shuicheng Yan
VLM
34
60
0
27 Mar 2022
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
212
487
0
01 Oct 2021
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mańke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
85
152
0
17 Sep 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
271
2,603
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
284
1,524
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
277
3,623
0
24 Feb 2021
Dynamic ReLU
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Dongdong Chen
Lu Yuan
Zicheng Liu
177
162
0
22 Mar 2020
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE
BDL
PINN
206
14,368
0
07 Oct 2016
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,198
0
01 Sep 2014