arXiv: 2210.06583
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
12 October 2022
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
VLM
Papers citing
"S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces"
19 / 19 papers shown
Title
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Pöppel
Sepp Hochreiter
Johannes Brandstetter
VLM
112
46
0
24 Feb 2025
GG-SSMs: Graph-Generating State Space Models
Nikola Zubić
Davide Scaramuzza
Mamba
127
1
0
17 Dec 2024
How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections
Albert Gu
Isys Johnson
Aman Timalsina
Atri Rudra
Christopher Ré
Mamba
119
93
0
24 Jun 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
84
247
0
07 Apr 2022
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
166
1,783
0
18 Nov 2021
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
David W. Romero
Robert-Jan Bruintjes
Jakub M. Tomczak
Erik J. Bekkers
Mark Hoogendoorn
Jan van Gemert
91
83
0
15 Oct 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
70
1,458
0
24 Jun 2021
Towards Long-Form Video Understanding
Chao-Yuan Wu
Philipp Krähenbühl
VLM
ViT
94
168
0
21 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
144
2,785
0
15 Jun 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
259
581
0
22 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
110
998
0
31 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
102
2,119
0
29 Mar 2021
CKConv: Continuous Kernel Convolution For Sequential Data
David W. Romero
Anna Kuzina
Erik J. Bekkers
Jakub M. Tomczak
Mark Hoogendoorn
43
125
0
04 Feb 2021
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
Marc Finzi
Samuel Stanton
Pavel Izmailov
A. Wilson
96
322
0
25 Feb 2020
Fixing the train-test resolution discrepancy
Hugo Touvron
Andrea Vedaldi
Matthijs Douze
Hervé Jégou
95
423
0
14 Jun 2019
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
557
4,735
0
13 May 2019
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
186
7,961
0
22 May 2017
Deep Networks with Stochastic Depth
Gao Huang
Yu Sun
Zhuang Liu
Daniel Sedra
Kilian Q. Weinberger
134
2,344
0
30 Mar 2016
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
269
43,511
0
17 Sep 2014