2204.01697
Cited By
MaxViT: Multi-Axis Vision Transformer
4 April 2022
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
Papers citing
"MaxViT: Multi-Axis Vision Transformer"
50 / 101 papers shown
Title
EventDiff: A Unified and Efficient Diffusion Model Framework for Event-based Video Frame Interpolation
Hanle Zheng
Xujie Han
Zegang Peng
Shangbin Zhang
Guangxun Du
Zhuo Zou
Xiang Wang
Jibin Wu
Hao Guo
Lei Deng
DiffM
VGen
50
0
0
13 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
38
0
0
07 May 2025
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang
Jong Youl Choi
Takuya Kurihaya
Isaac Lyngaas
Hong-Jun Yoon
...
Dali Wang
Peter Thornton
Prasanna Balaprakash
M. Ashfaq
Dan Lu
28
0
0
07 May 2025
Vision Transformers in Precision Agriculture: A Comprehensive Survey
Saber Mehdipour
Seyed Abolghasem Mirroshandel
Seyed Amirhossein Tabatabaei
36
0
0
30 Apr 2025
A Spatially-Aware Multiple Instance Learning Framework for Digital Pathology
H. Keshvarikhojasteh
Mihail Tifrea
Sibylle Hess
J. Pluim
M. Veta
54
0
0
24 Apr 2025
Advanced Deep Learning and Large Language Models: Comprehensive Insights for Cancer Detection
Yassine Habchi
Hamza Kheddar
Yassine Himeur
Adel Belouchrani
Erchin Serpedin
Fouad Khelifi
Muhammad E.H. Chowdhury
LM&MA
46
0
0
30 Mar 2025
CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging
Zhiwei Ling
Yachen Chang
Hailiang Zhao
Xinkui Zhao
Kingsum Chow
Shuiguang Deng
OODD
58
0
0
01 Mar 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
115
1
0
27 Feb 2025
Max360IQ: Blind Omnidirectional Image Quality Assessment with Multi-axis Attention
Jiebin Yan
Ziwen Tan
Yuming Fang
Jiale Rao
Yifan Zuo
53
1
0
26 Feb 2025
Enhancing Vehicle Make and Model Recognition with 3D Attention Modules
Narges Semiromizadeh
Omid Nejati Manzari
S. B. Shokouhi
S. Mirzakuchaki
ViT
94
0
0
24 Feb 2025
QMaxViT-Unet+: A Query-Based MaxViT-Unet with Edge Enhancement for Scribble-Supervised Segmentation of Medical Images
Thien B. Nguyen-Tat
Hoang-An Vo
Phuoc-Sang Dang
70
0
0
17 Feb 2025
DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection
MD Sadik Hossain Shanto
Mahir Labib Dihan
Souvik Ghosh
Riad Ahmed Anonto
Hafijul Hoque Chowdhury
...
Rakib Ahsan
Md Tanvir Hassan
MD Roqunuzzaman Sojib
Sheikh Azizul Hakim
M. Saifur Rahman
CVBM
71
0
0
28 Jan 2025
Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture
Yikun Hou
Suvrit Sra
A. Yurtsever
29
0
0
28 Jan 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
72
0
0
26 Jan 2025
Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation
Zhengwen Shen
Yulian Li
Han Zhang
Yuchen Weng
Jun Wang
35
0
0
19 Jan 2025
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
115
1
0
16 Dec 2024
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
Haitian Zhang
Xiangyuan Wang
Chang Xu
Xinya Wang
Fang Xu
Huai Yu
Lei Yu
Wen Yang
ObjD
92
0
0
05 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
42
1
0
12 Nov 2024
S$^4$ST: A Strong, Self-transferable, faSt, and Simple Scale Transformation for Transferable Targeted Attack
Yongxiang Liu
Bowen Peng
Li Liu
Xuran Li
113
0
0
13 Oct 2024
Prithvi WxC: Foundation Model for Weather and Climate
J. Schmude
Sujit Roy
Will Trojak
Johannes Jakubik
Daniel Salles Civitarese
...
Campbell Watson
M. Maskey
Tsengdar J Lee
Juan Bernabé-Moreno
Rahul Ramachandran
VLM
AI4Cl
34
10
0
20 Sep 2024
SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance
Shun Zou
Mingya Zhang
Bingjian Fan
Zhengyi Zhou
Xiuguo Zou
Mamba
29
3
0
17 Sep 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
39
1
0
10 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
41
4
0
10 Jul 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
45
56
0
10 Jul 2024
Particle Multi-Axis Transformer for Jet Tagging
Muhammad Usman
M. Shahid
Maheen Ejaz
Ummay Hani
Nayab Fatima
Abdul Rehman Khan
Asifullah Khan
Nasir Majid Mirza
35
3
0
09 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
48
5
0
22 May 2024
Dynamic Line Rating using Hyper-local Weather Predictions: A Machine Learning Approach
Henri Manninen
Markus Lippus
Georg Rute
27
0
0
20 May 2024
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Jialong Guo
Xinghao Chen
Yehui Tang
Yunhe Wang
ViT
49
9
0
19 May 2024
Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Guoping Xu
Xiaxia Wang
Xinglong Wu
Xuesong Leng
Yongchao Xu
3DPC
36
8
0
02 May 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
43
17
0
29 Mar 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
41
2
0
26 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
28
15
0
18 Mar 2024
Multi-Human Mesh Recovery with Transformers
Zeyu Wang
Zhenzhen Weng
Serena Yeung-Levy
3DH
32
1
0
26 Feb 2024
CAManim: Animating end-to-end network activation maps
Emily Kaczmarek
Olivier X. Miguel
Alexa C. Bowie
R. Ducharme
Alysha L. J. Dingwall-Harvey
S. Hawken
Christine M. Armour
Mark C. Walker
Kevin Dick
HAI
26
1
0
19 Dec 2023
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
26
1
0
18 Dec 2023
A Novel Image Classification Framework Based on Variational Quantum Algorithms
Yixiong Chen
26
3
0
13 Dec 2023
Kandinsky 3.0 Technical Report
V.Ya. Arkhipkin
Andrei Filatov
Viacheslav Vasilev
Anastasia Maltseva
Said Azizov
Igor Pavlov
Julia Agafonova
Andrey Kuznetsov
Denis Dimitrov
DiffM
28
11
0
06 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
44
0
0
01 Dec 2023
PViT-6D: Overclocking Vision Transformers for 6D Pose Estimation with Confidence-Level Prediction and Pose Tokens
Sebastian Stapf
Tobias Bauernfeind
Marco Riboldi
ViT
25
1
0
29 Nov 2023
LEOD: Label-Efficient Object Detection for Event Cameras
Ziyi Wu
Mathias Gehrig
Qing Lyu
Xudong Liu
Igor Gilitschenski
27
13
0
29 Nov 2023
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
V.Ya. Arkhipkin
Zein Shaheen
Viacheslav Vasilev
E. Dakhova
Andrey Kuznetsov
Denis Dimitrov
DiffM
VGen
23
5
0
22 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
44
36
0
30 Oct 2023
Gramian Attention Heads are Strong yet Efficient Vision Learners
Jongbin Ryu
Dongyoon Han
J. Lim
30
1
0
25 Oct 2023
Medical Image Segmentation via Sparse Coding Decoder
Long Zeng
Kaigui Wu
MedIm
26
3
0
17 Oct 2023
SSG2: A new modelling paradigm for semantic segmentation
F. Diakogiannis
S. Furby
P. Caccetta
Xiaoliang Wu
Rodrigo Ibata
O. Hlinka
John Taylor
VLM
37
0
0
12 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
Enhancing Adversarial Attacks: The Similar Target Method
Shuo Zhang
Ziruo Wang
Zikai Zhou
Huanran Chen
AAML
54
1
0
21 Aug 2023
Dual Aggregation Transformer for Image Super-Resolution
Zheng Chen
Yulun Zhang
Jinjin Gu
L. Kong
Xiaokang Yang
Fisher Yu
ViT
16
167
0
07 Aug 2023
M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition
Ji-Hee Moon
Junseok K. Lee
Yu-Ling Lee
Seongsik Park
35
4
0
04 Aug 2023