2111.11418
MetaFormer Is Actually What You Need for Vision
22 November 2021
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
Papers citing
"MetaFormer Is Actually What You Need for Vision"
50 / 123 papers shown
Title
SSH-Net: A Self-Supervised and Hybrid Network for Noisy Image Watermark Removal
Wenyang Liu
Jianjun Gao
Kim-Hui Yap
40
0
0
08 May 2025
Image Restoration via Multi-domain Learning
Xingyu Jiang
Ning Gao
Xiuhui Zhang
Hongkun Dou
Shaowen Fu
Xiaoqing Zhong
Hao Li
Yue Deng
ViT
43
0
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
148
0
0
06 May 2025
Comparison of Different Deep Neural Network Models in the Cultural Heritage Domain
Teodor Boyadzhiev
Gabriele Lagani
Luca Ciampi
Giuseppe Amato
Krassimira Ivanova
VLM
54
0
0
30 Apr 2025
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation
Lvmin Zhang
Maneesh Agrawala
DiffM
VGen
75
0
0
17 Apr 2025
Depth-Aware Range Image-Based Model for Point Cloud Segmentation
Bike Chen
Antti Tikanmäki
Juha Roning
3DPC
3DV
55
0
0
19 Mar 2025
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali
Nikolos Gurney
David Pynadath
AI4TS
116
0
0
05 Mar 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
72
0
0
26 Jan 2025
Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
Wall Kim
Mamba
55
0
0
10 Jan 2025
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
112
3
0
22 Nov 2024
Harnessing Scale and Physics: A Multi-Graph Neural Operator Framework for PDEs on Arbitrary Geometries
Zhihao Li
H. Song
Di Xiao
Zhilu Lai
Wei Wang
AI4CE
88
2
0
18 Nov 2024
Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection
Pengfei Lyu
Pak-Hei Yeung
Xiufei Cheng
Xiaosheng Yu
Chengdong Wu
Jagath C. Rajapakse
42
0
0
06 Nov 2024
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang
Jia Wei
Pengle Zhang
Jun-Jie Zhu
Jun Zhu
Jianfei Chen
VLM
MQ
82
19
0
03 Oct 2024
Accuracy Improvement of Cell Image Segmentation Using Feedback Former
Hinako Mitsuoka
Kazuhiro Hotta
ViT
MedIm
41
0
0
23 Aug 2024
LightWeather: Harnessing Absolute Positional Encoding to Efficient and Scalable Global Weather Forecasting
Yisong Fu
Fei Wang
Zezhi Shao
Chengqing Yu
Yujie Li
Zhao Chen
Zhulin An
Yongjun Xu
AI4TS
37
0
0
19 Aug 2024
MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation
Beoungwoo Kang
Seunghun Moon
Yubin Cho
Hyunwoo Yu
Suk-Ju Kang
ViT
MedIm
29
8
0
14 Aug 2024
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pierre-David Létourneau
Manish Kumar Singh
Hsin-Pai Cheng
Shizhong Han
Yunxiao Shi
Dalton Jones
M. H. Langston
Hong Cai
Fatih Porikli
37
0
0
16 Jul 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
45
58
0
10 Jul 2024
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
Jiayi Yuan
Hongyi Liu
Shaochen Zhong
Yu-Neng Chuang
...
Hongye Jin
V. Chaudhary
Zhaozhuo Xu
Zirui Liu
Xia Hu
43
17
0
01 Jul 2024
EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition
Yi Ding
Chengxuan Tong
Shuailei Zhang
Muyun Jiang
Yong Li
Kevin Lim Jun Liang
Cuntai Guan
25
4
0
26 Jun 2024
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Yuchen Yang
Yingdong Shi
Cheems Wang
Xiantong Zhen
Yuxuan Shi
Jun Xu
37
1
0
24 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification
Baris Büyüktas
Kenneth Weitzel
Sebastian Völkers
Felix Zailskas
Begüm Demir
54
6
0
24 May 2024
HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech
Zhongren Dong
Zixing Zhang
Weixiang Xu
Jing Han
Jianjun Ou
Björn W. Schuller
40
1
0
07 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
37
3
0
02 May 2024
LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation
Ngoc-Du Tran
Thi-Thao Tran
Quang-Huy Nguyen
Manh-Hung Vu
Van-Truong Pham
MedIm
ViT
39
1
0
04 Apr 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
43
17
0
29 Mar 2024
Activating Wider Areas in Image Super-Resolution
Cheng Cheng
Hang Wang
Hongbin Sun
37
10
0
13 Mar 2024
MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder
Lei Li
Tianfang Zhang
Xinglin Zhang
Jiaqi Liu
Bingqi Ma
Yan-chun Luo
Tao Chen
MedIm
40
0
0
07 Mar 2024
Large Convolutional Model Tuning via Filter Subspace
Wei Chen
Zichen Miao
Qiang Qiu
51
3
0
01 Mar 2024
Zero-shot generalization across architectures for visual classification
Evan Gerritz
Luciano Dyballa
Steven W. Zucker
29
1
0
21 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
57
4
0
17 Feb 2024
CascadedGaze: Efficiency in Global Context Extraction for Image Restoration
Amirhosein Ghasemabadi
Muhammad Kamran Janjua
Mohammad Salameh
Chunhua Zhou
Fengyu Sun
Di Niu
35
11
0
26 Jan 2024
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
28
5
0
09 Jan 2024
Does Vector Quantization Fail in Spatio-Temporal Forecasting? Exploring a Differentiable Sparse Soft-Vector Quantization Approach
Chao Chen
Tian Zhou
Yanjun Zhao
Hui Liu
Liang Sun
Rong Jin
40
0
0
06 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
44
0
0
01 Dec 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
27
64
0
07 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
44
36
0
30 Oct 2023
Gramian Attention Heads are Strong yet Efficient Vision Learners
Jongbin Ryu
Dongyoon Han
J. Lim
32
1
0
25 Oct 2023
Enhancing Representations through Heterogeneous Self-Supervised Learning
Zhongyu Li
Bo-Wen Yin
Yongxiang Liu
Li Liu
Ming-Ming Cheng
SSL
28
2
0
08 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
J. Han
Guiguang Ding
ViT
31
6
0
27 Sep 2023
Decoupled Local Aggregation for Point Cloud Learning
Binjie Chen
Yunzhou Xia
Yu Zang
Cheng-Yu Wang
Jonathan Li
3DPC
29
9
0
31 Aug 2023
LatentDR: Improving Model Generalization Through Sample-Aware Latent Degradation and Restoration
Ran Liu
Sahil Khose
Jingyun Xiao
Lakshmi Sathidevi
Keerthan Ramnath
Z. Kira
Eva L. Dyer
34
3
0
28 Aug 2023
Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation
Liam Chalcroft
Ruben Lourenço Pereira
Mikael Brudfors
Andrew S. Kayser
M. D’Esposito
Cathy J. Price
Ioannis Pappas
John Ashburner
ViT
3DV
MedIm
29
8
0
14 Aug 2023
Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives
Chuntao Ding
Zhichao Lu
Shangguang Wang
Ran Cheng
Vishnu Naresh Boddeti
MoMe
11
16
0
03 Aug 2023
MiDaS v3.1 -- A Model Zoo for Robust Monocular Relative Depth Estimation
R. Birkl
Diana Wofk
Matthias Muller
MDE
27
133
0
26 Jul 2023
MobileViG: Graph-Based Sparse Attention for Mobile Vision Applications
Mustafa Munir
William Avery
R. Marculescu
ViT
GNN
34
33
0
01 Jul 2023
A Mask Free Neural Network for Monaural Speech Enhancement
Liangqi Liu
Haixing Guan
Jinlong Ma
Wei Dai
Guang-Yi Wang
Shaowei Ding
21
11
0
07 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
50
28
0
01 Jun 2023