ResearchTrend.AI

MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features

30 September 2022
S. Wadekar
Abhishek Chaurasia
    ViT

Papers citing "MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features"

32 / 32 papers shown
RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations
Mingshu Zhao
Yi Luo
Yong Ouyang
38
0
0
27 Dec 2024
CompactFlowNet: Efficient Real-time Optical Flow Estimation on Mobile Devices
Andrei Znobishchev
Valerii Filev
Oleg Kudashev
Nikita Orlov
Humphrey Shi
74
0
0
17 Dec 2024
LUIEO: A Lightweight Model for Integrating Underwater Image Enhancement and Object Detection
Bin Li
Li Li
Zhenwei Zhang
Yuping Duan
74
0
0
01 Dec 2024
Improving Accuracy and Generalization for Efficient Visual Tracking
Ram J. Zaveri
Shivang Patel
Yu Gu
Gianfranco Doretto
VLM
86
0
0
28 Nov 2024
Lightweight Gaze Estimation Model Via Fusion Global Information
Zhang Cheng
Yanxia Wang
98
0
0
27 Nov 2024
Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training
Man Yao
Xuerui Qiu
Tianxiang Hu
J. Hu
Yuhong Chou
Keyu Tian
Jianxing Liao
Luziwei Leng
Bo Xu
Guoqi Li
76
7
0
25 Nov 2024
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Haoyang He
Junyuan Zhang
Yuxuan Cai
Hongxu Chen
Xiaobin Hu
Zhenye Gan
Yuping Wang
Chengjie Wang
Yunsheng Wu
Lei Xie
Mamba
88
3
0
24 Nov 2024
COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes
Muhammad Ali
Mamoona Javaid
Mubashir Noman
M. Fiaz
Salman Khan
36
0
0
31 Oct 2024
Improving Vision Transformers by Overlapping Heads in Multi-Head Self-Attention
Tianxiao Zhang
Bo Luo
G. Wang
ViT
21
1
0
18 Oct 2024
RepNeXt: A Fast Multi-Scale CNN using Structural Reparameterization
Mingshu Zhao
Yi Luo
Yong Ouyang
40
2
0
23 Jun 2024
LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection
Lilian Hollard
Lucas Mohimont
N. Gaveau
L. Steffenel
ObjD
42
3
0
20 Jun 2024
Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation
Mykhail M. Uss
Ruslan Yermolenko
Olena Kolodiazhna
Oleksii Shashko
Ivan Safonov
Volodymyr Savin
Yoonjae Yeo
Seowon Ji
Jaeyun Jeong
MQ
27
0
0
22 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
50
48
0
13 May 2024
EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization
Zhendong Xiao
Changhao Chen
Shan Yang
Wu Wei
35
1
0
21 Feb 2024
A Lightweight Feature Fusion Architecture For Resource-Constrained Crowd Counting
Yashwardhan Chaudhuri
Ankit Kumar
Orchid Chetia Phukan
Arun Balaji Buduru
29
1
0
11 Jan 2024
MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval
Youbo Lei
Feifei He
Chen Chen
Yingbin Mo
Sijia Li
Defeng Xie
H. Lu
VLM
57
0
0
30 Oct 2023
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
Yuelei Wang
Ting Zhang
Liangjin Zhao
Lin Hu
Zhechao Wang
...
Kaiqiang Chen
Xuan Zeng
Zhirui Wang
Hongqi Wang
Xian Sun
24
4
0
16 Sep 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
34
177
0
18 Jul 2023
Light-Weight Vision Transformer with Parallel Local and Global Self-Attention
Nikolas Ebert
Laurenz Reichardt
D. Stricker
Oliver Wasenmüller
ViT
16
2
0
18 Jul 2023
FR-Net:A Light-weight FFT Residual Net For Gaze Estimation
Tao Xu
Borimandafu Wu
Ruilong Fan
Yun Zhou
Di Huang
29
2
0
04 May 2023
EVA-02: A Visual Representation for Neon Genesis
Yuxin Fang
Quan-Sen Sun
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
ViT
CLIP
40
259
0
20 Mar 2023
ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices
Chen Tang
Li Lyna Zhang
Huiqiang Jiang
Jiahang Xu
Ting Cao
Quanlu Zhang
Yuqing Yang
Zhi Wang
Mao Yang
28
11
0
17 Mar 2023
Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
Jierun Chen
Shiu-hong Kao
Hao He
Weipeng Zhuo
Song Wen
Chul-Ho Lee
Shueng-Han Gary Chan
OOD
32
779
0
07 Mar 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Sucheng Ren
Fangyun Wei
Zheng-Wei Zhang
Han Hu
40
34
0
03 Jan 2023
Rethinking Mobile Block for Efficient Attention-based Models
Jiangning Zhang
Xiangtai Li
Jian Li
Liang Liu
Zhucun Xue
Boshen Zhang
Zhe Jiang
Tianxin Huang
Yabiao Wang
Chengjie Wang
MQ
44
90
0
03 Jan 2023
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
218
1,213
0
05 Oct 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
183
476
0
12 Aug 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
351
500
0
13 Jul 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
289
3,623
0
24 Feb 2021
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
303
36,371
0
25 Aug 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
253
1,828
0
18 Aug 2016
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,198
0
01 Sep 2014