2110.02178
Cited By
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
5 October 2021
Sachin Mehta
Mohammad Rastegari
ViT
Papers citing
"MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"
50 / 419 papers shown
Title
Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis
Bowen Zhang
Ying Chen
Long Bai
Yan Zhao
Yuxiang Sun
Yixuan Yuan
Jianhua Zhang
Hongliang Ren
34
4
0
15 Jun 2024
Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer
Guodong Sun
Junjie Liu
Mingxuan Liu
Moyun Liu
Yang Zhang
MDE
ViT
37
1
0
13 Jun 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
59
3
0
12 Jun 2024
A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
Leonardo F. S. Scabini
Andre Sacilotti
Kallil M. C. Zielinski
L. C. Ribas
B. De Baets
Odemir M. Bruno
ViT
33
3
0
10 Jun 2024
Scaling Graph Convolutions for Mobile Vision
William Avery
Mustafa Munir
R. Marculescu
GNN
34
4
0
09 Jun 2024
Mamba YOLO: SSMs-Based YOLO For Object Detection
Zeyu Wang
Chen Li
Huiying Xu
Xinzhong Zhu
Mamba
49
2
0
09 Jun 2024
Navigating Efficiency in MobileViT through Gaussian Process on Global Architecture Factors
Ke Meng
Kai Chen
32
0
0
07 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
Image Captioning via Dynamic Path Customization
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Xiaopeng Hong
Yongjian Wu
Rongrong Ji
34
0
0
01 Jun 2024
LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation
Wentao Jiang
Jing Zhang
Di Wang
Qiming Zhang
Zengmao Wang
Bo Du
34
5
0
16 May 2024
GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs
Mustafa Munir
William Avery
Md Mostafijur Rahman
R. Marculescu
GNN
53
12
0
10 May 2024
DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation
Jie Xu
Karthikeyan P. Saravanan
Rogier van Dalen
Haaris Mehmood
David Tuckey
Mete Ozay
56
5
0
10 May 2024
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
Huihong Shi
Haikuo Shao
Wendong Mao
Zhongfeng Wang
ViT
MQ
36
3
0
06 May 2024
Multispectral Fine-Grained Classification of Blackgrass in Wheat and Barley Crops
Madeleine Darbyshire
Shaun Coutts
Eleanor Hammond
Fazilet Gokbudak
Cengiz Öztireli
Petra Bosilj
Junfeng Gao
Elizabeth I. Sklar
Simon Parsons
24
1
0
03 May 2024
SFFNet: A Wavelet-Based Spatial and Frequency Domain Fusion Network for Remote Sensing Segmentation
Yunsong Yang
Genji Yuan
Jinjiang Li
VOS
24
13
0
03 May 2024
Weakly Supervised Training for Hologram Verification in Identity Documents
Glen Pouliquen
Guillaume Chiron
Joseph Chazalon
Thierry Géraud
Ahmad-Montaser Awal
25
1
0
26 Apr 2024
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Sachin Mehta
Maxwell Horton
Fartash Faghri
Mohammad Hossein Sekhavat
Mahyar Najibi
Mehrdad Farajtabar
Oncel Tuzel
Mohammad Rastegari
VLM
CLIP
38
6
0
24 Apr 2024
Data-independent Module-aware Pruning for Hierarchical Vision Transformers
Yang He
Joey Tianyi Zhou
ViT
47
3
0
21 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
41
1
0
18 Apr 2024
GhostNetV3: Exploring the Training Strategies for Compact Models
Zhenhua Liu
Zhiwei Hao
Kai Han
Yehui Tang
Yunhe Wang
26
16
0
17 Apr 2024
MobileNetV4 - Universal Models for the Mobile Ecosystem
Danfeng Qin
Chas Leichner
M. Delakis
Marco Fornoni
Shixin Luo
...
Berkin Akin
Vaibhav Aggarwal
Tenghui Zhu
Daniele Moro
Andrew G. Howard
MQ
28
86
0
16 Apr 2024
NTIRE 2024 Challenge on Image Super-Resolution (×4): Methods and Results
Zheng Chen
Zongwei Wu
Eduard Zamfir
Kai Zhang
Yulun Zhang
...
Yan Luo
Yanyan Wei
Asif Hussain Khan
C. Micheloni
N. Martinel
SupR
33
32
0
15 Apr 2024
Robust feature knowledge distillation for enhanced performance of lightweight crack segmentation models
Zhaohui Chen
Elyas Asadi Shamsabadi
Sheng Jiang
Luming Shen
Daniel Dias-da-Costa
29
2
0
09 Apr 2024
A Lightweight Measure of Classification Difficulty from Application Dataset Characteristics
Bryan Bo Cao
Abhinav Sharma
Lawrence O'Gorman
Michael J. Coss
Shubham Jain
36
1
0
09 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
55
48
0
08 Apr 2024
HSViT: Horizontally Scalable Vision Transformer
Chenhao Xu
Chang-Tsun Li
Chee Peng Lim
Douglas Creighton
ViT
34
2
0
08 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan L. Yuille
Liang-Chieh Chen
3DV
VLM
36
24
0
02 Apr 2024
On Train-Test Class Overlap and Detection for Image Retrieval
Chull Hwan Song
Jooyoung Yoon
Taebaek Hwang
Shunghyun Choi
Yeong Hyeon Gu
Yannis Avrithis
34
2
0
01 Apr 2024
Vision-language models for decoding provider attention during neonatal resuscitation
Felipe Parodi
Jordan K Matelsky
Alejandra Regla-Vargas
Elizabeth E. Foglia
Charis Lim
Danielle Weinberg
Konrad Kording
Heidi Herrick
Michael L Platt
24
0
0
01 Apr 2024
Separate, Dynamic and Differentiable (SMART) Pruner for Block/Output Channel Pruning on Computer Vision Tasks
Guanhua Ding
Zexi Ye
Zhen Zhong
Gang Li
David Shao
36
0
0
29 Mar 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
43
17
0
29 Mar 2024
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection
Ali Behrouz
Michele Santacatterina
Ramin Zabih
44
31
0
29 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
45
7
0
28 Mar 2024
Test-Time Domain Generalization for Face Anti-Spoofing
Qianyu Zhou
Ke-Yue Zhang
Taiping Yao
Xuequan Lu
Shouhong Ding
Lizhuang Ma
TTA
CVBM
OOD
46
22
0
28 Mar 2024
QuakeSet: A Dataset and Low-Resource Models to Monitor Earthquakes through Sentinel-1
Daniele Rege Cambrin
Paolo Garza
27
6
0
26 Mar 2024
ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection
Mubashir Noman
M. Fiaz
Hisham Cholakkal
Salman Khan
Fahad Shahbaz Khan
19
27
0
26 Mar 2024
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud
Burhaneddin Yaman
Chun-Hao Liu
Diana Marculescu
38
2
0
24 Mar 2024
ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding
Novendra Setyawan
Ghufron Wahyu Kurniawan
Chi-Chia Sun
Jun-Wei Hsieh
Hui-Kai Su
W. Kuo
ViT
MoE
39
0
0
22 Mar 2024
HSEmotion Team at the 6th ABAW Competition: Facial Expressions, Valence-Arousal and Emotion Intensity Prediction
Andrey V. Savchenko
43
19
0
18 Mar 2024
When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel Perspective
Qiqi Zhou
Yichen Zhu
ViT
16
1
0
15 Mar 2024
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements
Yu Liu
Wenlin Zhang
Shaochu Wang
Fangyu Zuo
Peiguang Jing
Yong Ji
27
0
0
15 Mar 2024
Group-Mix SAM: Lightweight Solution for Industrial Assembly Line Applications
Wu Liang
X.-G. Ma
34
0
0
15 Mar 2024
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba
Xiaohuan Pei
Tao Huang
Chang Xu
Mamba
27
88
0
15 Mar 2024
METER: a mobile vision transformer architecture for monocular depth estimation
Lorenzo Papa
Paolo Russo
Irene Amerini
MDE
27
18
0
13 Mar 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
24
3
0
07 Mar 2024
A data-centric approach to class-specific bias in image data augmentation
Athanasios Angelakis
Andrey Rass
37
0
0
07 Mar 2024
Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels
Zhuo Li
Wei He
Jiepan Li
Fangxiao Lu
Hongyan Zhang
23
12
0
05 Mar 2024
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval
Yongchao Du
Min Wang
Wen-gang Zhou
Shuping Hui
Houqiang Li
32
10
0
03 Mar 2024
PEM: Prototype-based Efficient MaskFormer for Image Segmentation
Niccolò Cavagnero
Gabriele Rosi
Claudia Cuttano
Francesca Pistilli
Marco Ciccone
Giuseppe Averta
Fabio Cermelli
54
21
0
29 Feb 2024
Video-Based Autism Detection with Deep Learning
Manuel Serna-Aguilera
Xuan-Bac Nguyen
Asmita Singh
Lydia Rockers
Se-Woong Park
Leslie Neely
Han-Seok Seo
Khoa Luu
24
5
0
26 Feb 2024