Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.08083
Cited By
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
10 July 2024
Ali Hatamizadeh
Jan Kautz
Mamba
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MambaVision: A Hybrid Mamba-Transformer Vision Backbone"
39 / 89 papers shown
Title
A Survey of Mamba
Shuwei Shi
Shibing Chu
Rui An
Wenqi Fan
Yuee Xie
Hui Liu
Yuanping Chen
Qing Li
AI4CE
57
29
0
02 Aug 2024
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
75
52
0
22 Mar 2024
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba
Xiaohuan Pei
Tao Huang
Chang Xu
Mamba
38
94
0
15 Mar 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
66
736
0
17 Jan 2024
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu
Tri Dao
Mamba
55
2,552
0
01 Dec 2023
FasterViT: Fast Vision Transformers with Hierarchical Attention
Ali Hatamizadeh
Greg Heinrich
Hongxu Yin
Andrew Tao
J. Álvarez
Jan Kautz
Pavlo Molchanov
ViT
87
68
0
09 Jun 2023
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
Jiashi Li
Xin Xia
W. Li
Huixia Li
Xing Wang
Xuefeng Xiao
Rui Wang
Min Zheng
Xin Pan
ViT
31
149
0
12 Jul 2022
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
33
122
0
20 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
57
360
0
02 Jun 2022
DeiT III: Revenge of the ViT
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
101
402
0
14 Apr 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
92
646
0
04 Apr 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
62
5,073
0
10 Jan 2022
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
130
893
0
22 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
173
1,783
0
18 Nov 2021
Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers
Albert Gu
Isys Johnson
Karan Goel
Khaled Kamal Saab
Tri Dao
Atri Rudra
Christopher Ré
70
574
0
26 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
230
489
0
01 Oct 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
108
507
0
17 Jun 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
39
1,006
0
28 Apr 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
144
216
0
26 Apr 2021
Co-Scale Conv-Attentional Image Transformers
Weijian Xu
Yifan Xu
Tyler A. Chang
Zhuowen Tu
ViT
42
374
0
13 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Hervé Jégou
Matthijs Douze
ViT
41
779
0
02 Apr 2021
EfficientNetV2: Smaller Models and Faster Training
Mingxing Tan
Quoc V. Le
EgoV
75
2,662
0
01 Apr 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
55
1,450
0
27 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
286
21,051
0
25 Mar 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
359
1,544
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
433
3,660
0
24 Feb 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
268
6,657
0
23 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
307
40,217
0
22 Oct 2020
Designing Network Design Spaces
Ilija Radosavovic
Raj Prateek Kosaraju
Ross B. Girshick
Kaiming He
Piotr Dollár
GNN
82
1,672
0
30 Mar 2020
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
154
991
0
01 Apr 2019
Unified Perceptual Parsing for Scene Understanding
Tete Xiao
Yingcheng Liu
Bolei Zhou
Yuning Jiang
Jian Sun
OCL
VOS
95
1,859
0
26 Jul 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
422
129,831
0
12 Jun 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
293
27,018
0
20 Mar 2017
Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning
Stefan Elfwing
E. Uchibe
Kenji Doya
53
1,690
0
10 Feb 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
415
10,281
0
16 Nov 2016
Gaussian Error Linear Units (GELUs)
Dan Hendrycks
Kevin Gimpel
146
4,934
0
27 Jun 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
1.3K
192,638
0
10 Dec 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
279
43,154
0
11 Feb 2015
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
232
43,290
0
01 May 2014
Previous
1
2