2206.10589
Cited By
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
21 June 2022
Muhammad Maaz
Abdelrahman M. Shaker
Hisham Cholakkal
Salman Khan
Syed Waqas Zamir
Rao Muhammad Anwer
Fahad Shahbaz Khan
ViT
Papers citing
"EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications"
50 / 81 papers shown
Title
xEdgeFace: Efficient Cross-Spectral Face Recognition for Edge Devices
Anjith George
Sébastien Marcel
CVBM
65
0
0
28 Apr 2025
LSNet: See Large, Focus Small
Ao Wang
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
42
0
0
29 Mar 2025
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Abdelrahman M. Shaker
Muhammad Maaz
Chenhui Gou
Hamid Rezatofighi
Salman Khan
Fahad Shahbaz Khan
171
0
0
27 Mar 2025
An improved EfficientNetV2 for garbage classification
Wenxuan Qiu
Chengxin Xie
Jingui Huang
53
0
0
27 Mar 2025
SHAP-Integrated Convolutional Diagnostic Networks for Feature-Selective Medical Analysis
Yan Hu
Ahmad Chaddad
51
0
0
10 Mar 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Ming-Ming Cheng
ObjD
156
51
0
21 Feb 2025
MicroViT: A Vision Transformer with Low Complexity Self Attention for Edge Device
Novendra Setyawan
Chi-Chia Sun
Mao-Hsiu Hsu
W. Kuo
Jun-Wei Hsieh
ViT
49
2
0
09 Feb 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
72
0
0
26 Jan 2025
RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations
Mingshu Zhao
Yi Luo
Yong Ouyang
38
0
0
27 Dec 2024
HyperCLIP: Adapting Vision-Language models with Hypernetworks
Victor Akinwande
Mohammad Sadegh Norouzzadeh
Devin Willmott
Anna Bair
Madan Ravi Ganesh
J. Zico Kolter
CLIP
VLM
93
0
0
21 Dec 2024
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Haoyang He
Jun Zhang
Yuxuan Cai
Hongxu Chen
Xiaobin Hu
Zhenye Gan
Yishuo Wang
Chengjie Wang
Yunsheng Wu
Lei Xie
Mamba
88
3
0
24 Nov 2024
GTA-Net: An IoT-Integrated 3D Human Pose Estimation System for Real-Time Adolescent Sports Posture Correction
Shizhe Yuan
Li Zhou
3DH
43
5
0
11 Nov 2024
Cross-video Identity Correlating for Person Re-identification Pre-training
Jialong Zuo
Ying Nie
Hanyu Zhou
Huaxin Zhang
Haoyu Wang
Tianyu Guo
Nong Sang
Changxin Gao
37
2
0
27 Sep 2024
SCAN-Edge: Finding MobileNet-speed Hybrid Networks for Diverse Edge Devices via Hardware-Aware Evolutionary Search
Hung-Yueh Chiang
Diana Marculescu
34
0
0
27 Aug 2024
TReX- Reusing Vision Transformer's Attention for Efficient Xbar-based Computing
Abhishek Moitra
Abhiroop Bhattacharjee
Youngeun Kim
Priyadarshini Panda
ViT
32
2
0
22 Aug 2024
Towards Real-time Video Compressive Sensing on Mobile Devices
Miao Cao
Lishun Wang
Huan Wang
Guoqing Wang
Xin Yuan
3DGS
28
0
0
14 Aug 2024
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications
Tianfang Zhang
Lei Li
Yang Zhou
Wentao Liu
Chen Qian
Xiangyang Ji
ViT
30
12
0
07 Aug 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
59
0
0
18 Jul 2024
AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
Yunling Zheng
Zeyi Xu
Fanghui Xue
Biao Yang
Jiancheng Lyu
Shuai Zhang
Y. Qi
Jack Xin
56
0
0
16 Jul 2024
CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
Hosam S. El-Assiouti
Hadeer El-Saadawy
M. Al-Berry
M. Tolba
ViT
52
0
0
09 Jul 2024
Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model
Haobo Yuan
Xiangtai Li
Lu Qi
Tao Zhang
Ming Yang
Shuicheng Yan
Chen Change Loy
VLM
34
10
0
27 Jun 2024
LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection
Lilian Hollard
Lucas Mohimont
N. Gaveau
L. Steffenel
ObjD
42
3
0
20 Jun 2024
CrossFuse: A Novel Cross Attention Mechanism based Infrared and Visible Image Fusion Approach
Hui Li
Xiao-Jun Wu
28
100
0
15 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
Automatic Channel Pruning for Multi-Head Attention
Eunho Lee
Youngbae Hwang
ViT
40
1
0
31 May 2024
HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution
S. Chu
Zhi-chao Dou
Jeng-Shyang Pan
Shaowei Weng
Junbao Li
ViT
44
4
0
08 May 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
47
1
0
18 Apr 2024
Boosting Visual Recognition in Real-world Degradations via Unsupervised Feature Enhancement Module with Deep Channel Prior
Zhanwen Liu
Yuhang Li
Yang Wang
Bolin Gao
Yisheng An
Xiangmo Zhao
30
4
0
02 Apr 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
43
17
0
29 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
51
7
0
28 Mar 2024
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design
Seokju Yun
Youngmin Ro
ViT
44
29
0
29 Jan 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
64
5
0
22 Jan 2024
Achelous++: Power-Oriented Water-Surface Panoptic Perception Framework on Edge Devices based on Vision-Radar Fusion and Pruning of Heterogeneous Modalities
Runwei Guan
Haocheng Zhao
Shanliang Yao
Ka Lok Man
Xiaohui Zhu
...
Yong Yue
Jeremy S. Smith
Eng Gee Lim
Weiping Ding
Yutao Yue
25
4
0
14 Dec 2023
EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM
Chong Zhou
Xiangtai Li
Chen Change Loy
Bo Dai
VLM
30
44
0
11 Dec 2023
Vision-based Learning for Drones: A Survey
Jiaping Xiao
Rangya Zhang
Yuhang Zhang
Mir Feroskhan
32
3
0
08 Dec 2023
E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer
Jacob Zhiyuan Fang
Skyler Zheng
Vasu Sharma
Robinson Piramuthu
VLM
38
0
0
28 Nov 2023
SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers
Xiangyong Lu
Masanori Suganuma
Takayuki Okatani
41
10
0
07 Nov 2023
CCMR: High Resolution Optical Flow Estimation via Coarse-to-Fine Context-Guided Motion Reasoning
Azin Jahedi
Maximilian Luz
Marc Rivinius
Andrés Bruhn
22
2
0
05 Nov 2023
EfficientOCR: An Extensible, Open-Source Package for Efficiently Digitizing World Knowledge
Tom Bryan
Jacob Carlson
Abhishek Arora
Melissa Dell
31
8
0
16 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
21
1
0
09 Oct 2023
Entropic Score metric: Decoupling Topology and Size in Training-free NAS
Niccolò Cavagnero
Luc Robbiano
Francesca Pistilli
Barbara Caputo
Giuseppe Averta
23
3
0
06 Oct 2023
EFaR 2023: Efficient Face Recognition Competition
J. Kolf
Fadi Boutros
Jurek Elliesen
Markus Theuerkauf
Naser Damer
...
D. Nunes
Ahmad Hassanpour
Pankaj Khatiwada
A. Toor
Bian Yang
CVBM
MQ
35
13
0
08 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
37
2
0
07 Aug 2023
LGViT: Dynamic Early Exiting for Accelerating Vision Transformer
Guanyu Xu
Jiawei Hao
Li Shen
Han Hu
Yong Luo
Hui Lin
J. Shen
28
15
0
01 Aug 2023
Adaptive Frequency Filters As Efficient Global Token Mixers
Zhipeng Huang
Zhizheng Zhang
Cuiling Lan
Zhengjun Zha
Yan Lu
B. Guo
30
37
0
26 Jul 2023
Light-Weight Vision Transformer with Parallel Local and Global Self-Attention
Nikolas Ebert
Laurenz Reichardt
D. Stricker
Oliver Wasenmüller
ViT
16
2
0
18 Jul 2023
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
30
66
0
17 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
45
62
0
16 Jul 2023
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Runwei Guan
Shanliang Yao
Xiaohui Zhu
Ka Lok Man
Eng Gee Lim
Jeremy S. Smith
Yong Yue
Yutao Yue
VOS
32
17
0
14 Jul 2023
EdgeFace: Efficient Face Recognition Model for Edge Devices
Anjith George
Christophe Ecabert
Hatef Otroshi-Shahreza
Ketan Kotwal
Sébastien Marcel
CVBM
32
23
0
04 Jul 2023