Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2405.07992
Cited By
MambaOut: Do We Really Need Mamba for Vision?
13 May 2024
Weihao Yu
Xinchao Wang
Mamba
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MambaOut: Do We Really Need Mamba for Vision?"
49 / 49 papers shown
Title
Graph Mamba for Efficient Whole Slide Image Understanding
Jiaxuan Lu
Junyan Shi
Yuhui Lin
Fang Yan
Yue Gao
Shaoting Zhang
Xiaosong Wang
Mamba
142
0
0
23 May 2025
Selective Structured State Space for Multispectral-fused Small Target Detection
Qianqian Zhang
WeiJun Wang
Yunxing Liu
Li Zhou
Hao Zhao
Junshe An
Zihan Wang
Mamba
147
0
0
20 May 2025
MambaFlow: A Mamba-Centric Architecture for End-to-End Optical Flow Estimation
Juntian Du
Yuan Sun
Zhihu Zhou
Pinyi Chen
Runzhe Zhang
Keji Mao
Mamba
66
1
0
10 Mar 2025
MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval
Haoran Tang
Meng Cao
Jinfa Huang
Ruyang Liu
Peng Jin
Ge Li
Xiaodan Liang
Mamba
132
4
0
24 Feb 2025
Surface Vision Mamba: Leveraging Bidirectional State Space Model for Efficient Spherical Manifold Representation
Rongzhao He
Weihao Zheng
Leilei Zhao
Ying Wang
Dalin Zhu
Dan Wu
Bin Hu
Mamba
118
0
0
21 Feb 2025
Is Long Range Sequential Modeling Necessary For Colorectal Tumor Segmentation?
Abhishek Srivastava
Koushik Biswas
Gorkem Durak
Gulsah Ozden
Mustafa Adli
Ulas Bagci
Mamba
3DV
74
0
0
10 Feb 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
206
666
0
31 Dec 2024
Spatial-Mamba: Effective Visual State Space Models via Structure-aware State Fusion
Chaodong Xiao
Minghan Li
Zhengqiang Zhang
Deyu Meng
Lei Zhang
Mamba
111
5
0
19 Oct 2024
IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model
Y. Huang
Tomo Miyazaki
Xiaofeng Liu
S. Omachi
Mamba
63
4
0
16 May 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
65
92
0
26 Mar 2024
LocalMamba: Visual State Space Model with Windowed Selective Scan
Tao Huang
Xiaohuan Pei
Shan You
Fei Wang
Chao Qian
Chang Xu
Mamba
67
146
0
14 Mar 2024
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
45
83
0
28 Nov 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
163
585
0
22 May 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
85
125
0
29 Mar 2023
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
67
131
0
22 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
98
677
0
10 Nov 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
64
60
0
04 Oct 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
39
254
0
28 Jul 2022
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
51
193
0
25 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
73
552
0
17 May 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
95
649
0
04 Apr 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
62
271
0
22 Mar 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
177
372
0
24 Jan 2022
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
123
685
0
02 Dec 2021
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
130
896
0
22 Nov 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
83
1,634
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
77
322
0
24 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
170
2,790
0
15 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
91
1,188
0
09 Jun 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
116
998
0
31 Mar 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
297
6,657
0
23 Dec 2020
ConvBERT: Improving BERT with Span-based Dynamic Convolution
Zihang Jiang
Weihao Yu
Daquan Zhou
Yunpeng Chen
Jiashi Feng
Shuicheng Yan
60
158
0
06 Aug 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
480
2,051
0
28 Jul 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
Franccois Fleuret
107
1,734
0
29 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
168
1,678
0
08 Jun 2020
Lite Transformer with Long-Short Range Attention
Zhanghao Wu
Zhijian Liu
Ji Lin
Chengyue Wu
Song Han
51
321
0
24 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
93
3,996
0
10 Apr 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
256
42,038
0
03 Dec 2019
MMDetection: Open MMLab Detection Toolbox and Benchmark
Kai-xiang Chen
Jiaqi Wang
Jiangmiao Pang
Yuhang Cao
Yu Xiong
...
Jingdong Wang
Jianping Shi
Wanli Ouyang
Chen Change Loy
Dahua Lin
VOS
114
2,845
0
17 Jun 2019
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
578
4,735
0
13 May 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
138
3,714
0
09 Jan 2019
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Ningning Ma
Xiangyu Zhang
Haitao Zheng
Jian Sun
130
4,957
0
30 Jul 2018
Group Normalization
Yuxin Wu
Kaiming He
145
3,626
0
22 Mar 2018
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
195
2,377
0
23 Dec 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
246
10,412
0
21 Jul 2016
Deep Networks with Stochastic Depth
Gao Huang
Yu Sun
Zhuang Liu
Daniel Sedra
Kilian Q. Weinberger
151
2,344
0
30 Mar 2016
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
478
27,231
0
02 Dec 2015
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.1K
39,383
0
01 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
388
27,205
0
01 Sep 2014
1