Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2207.11660
Cited By
MAR: Masked Autoencoders for Efficient Action Recognition
24 July 2022
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Xiang Wang
Yuehuang Wang
Yiliang Lv
Changxin Gao
Nong Sang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MAR: Masked Autoencoders for Efficient Action Recognition"
28 / 28 papers shown
Title
SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining
Ruiqi Xian
Xiyang Wu
Tianrui Guan
Xijun Wang
Boqing Gong
Dinesh Manocha
ViT
39
0
0
26 Sep 2024
Towards Student Actions in Classroom Scenes: New Dataset and Baseline
Zhuolin Tan
Chenqiang Gao
Anyong Qin
Ruixin Chen
Tiecheng Song
Feng Yang
Deyu Meng
29
0
0
02 Sep 2024
Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes
Ke Zhou
Zhongwei Qiu
Dongmei Fu
VLM
35
1
0
12 Aug 2024
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
Ziyuan Huang
Kaixiang Ji
Biao Gong
Zhiwu Qing
Qinglong Zhang
Kecheng Zheng
Jian Wang
Jingdong Chen
Ming Yang
LRM
34
1
0
22 Jul 2024
MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition
Hongyu Qu
Rui Yan
Xiangbo Shu
Haoliang Gao
Peng Huang
Guo-Sen Xie
56
4
0
03 May 2024
Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning
Yupei Zhang
Li Pan
Qiushi Yang
Tan Li
Zhen Chen
28
1
0
09 Apr 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
36
14
0
31 Dec 2023
From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
Yin Chen
Jia Li
Shiguang Shan
Meng Wang
Richang Hong
46
32
0
09 Dec 2023
Human Pose-based Estimation, Tracking and Action Recognition with Deep Learning: A Survey
Lijuan Zhou
Xiang Meng
Zhihuan Liu
Mengqi Wu
Zhimin Gao
Pichao Wang
34
3
0
19 Oct 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
39
30
0
21 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
28
9
0
10 Aug 2023
ArtGPT-4: Towards Artistic-understanding Large Vision-Language Models with Enhanced Adapter
Zheng Yuan
HU Xue
Kun Wang
Yongming Liu
Kun Wang
VLM
MLLM
21
5
0
12 May 2023
Efficient Video Action Detection with Token Dropout and Context Refinement
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
36
14
0
17 Apr 2023
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Limin Wang
Bingkun Huang
Zhiyu Zhao
Zhan Tong
Yinan He
Yi Wang
Yali Wang
Yu Qiao
VGen
59
325
0
29 Mar 2023
Remote Sensing Scene Classification with Masked Image Modeling (MIM)
Liya Wang
A. Tien
35
3
0
28 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
C. L. P. Chen
Mu Li
ViT
49
144
0
06 Feb 2023
Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
Liya Wang
A. Tien
42
7
0
28 Jan 2023
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Sun-Kyoo Hwang
Jaehong Yoon
Youngwan Lee
S. Hwang
31
6
0
19 Nov 2022
Masked Motion Encoding for Self-Supervised Video Representation Learning
Xinyu Sun
Peihao Chen
Liang-Chieh Chen
Chan Li
Thomas H. Li
Mingkui Tan
Chuang Gan
27
29
0
12 Oct 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian-jun Sun
Weiming Hu
ViT
67
41
0
28 May 2022
Two-stage Rule-induction Visual Reasoning on RPMs with an Application to Video Prediction
Wentao He
Jianfeng Ren
Ruibin Bai
Xudong Jiang
LRM
27
5
0
24 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,434
0
11 Nov 2021
TAda! Temporally-Adaptive Convolutions for Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Mingqian Tang
Ziwei Liu
M. Ang
40
49
0
12 Oct 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,777
0
24 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,982
0
09 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
198
422
0
01 Feb 2021
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
215
309
0
19 Oct 2020
AdaFrame: Adaptive Frame Selection for Fast Video Recognition
Zuxuan Wu
Caiming Xiong
Chih-Yao Ma
R. Socher
L. Davis
122
194
0
29 Nov 2018
1