Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2308.12962
Cited By
Motion-Guided Masking for Spatiotemporal Representation Learning
24 August 2023
D. Fan
Jue Wang
Shuai Liao
Yi Zhu
Vimal Bhat
H. Santos-Villalobos
M. Rohith
Xinyu Li
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Motion-Guided Masking for Spatiotemporal Representation Learning"
20 / 20 papers shown
Title
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush Rai
Kyle Min
Tarun Krishna
Feiyan Hu
A. Smeaton
Noel E. O'Connor
VGen
31
0
0
13 May 2025
SEVERE++: Evaluating Benchmark Sensitivity in Generalization of Video Representation Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Piyush Bagad
Hazel Doughty
Bernard Ghanem
Cees G. M. Snoek
ViT
SSL
52
0
0
08 Apr 2025
Safety Modulation: Enhancing Safety in Reinforcement Learning through Cost-Modulated Rewards
Hanping Zhang
Yuhong Guo
OffRL
38
0
0
03 Apr 2025
Scaling Language-Free Visual Representation Learning
David Fan
Shengbang Tong
Jiachen Zhu
Koustuv Sinha
Zhuang Liu
...
Michael G. Rabbat
Nicolas Ballas
Yann LeCun
Amir Bar
Saining Xie
CLIP
VLM
64
2
0
01 Apr 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Bernard Ghanem
59
0
0
01 Apr 2025
Structured-Noise Masked Modeling for Video, Audio and Beyond
Aritra Bhowmik
Fida Mohammad Thoker
Carlos Hinojosa
Bernard Ghanem
Cees G. M. Snoek
VGen
59
0
0
20 Mar 2025
Extending Video Masked Autoencoders to 128 frames
N. B. Gundavarapu
Luke Friedman
Raghav Goyal
Chaitra Hegde
Eirikur Agustsson
...
Mikhail Sirotenko
Ming Yang
Tobias Weyand
Boqing Gong
Leonid Sigal
82
1
0
20 Nov 2024
Video Token Merging for Long-form Video Understanding
Seon-Ho Lee
Jue Wang
Zhikang Zhang
D. Fan
Xinyu Li
40
5
0
31 Oct 2024
Text-Guided Video Masked Autoencoder
D. Fan
Jue Wang
Shuai Liao
Zhikang Zhang
Vimal Bhat
Xinyu Li
VGen
28
3
0
01 Aug 2024
SIGMA:Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi
Michael Dorkenwald
Fida Mohammad Thoker
E. Gavves
Cees G. M. Snoek
Yuki M. Asano
55
3
0
22 Jul 2024
Self-Supervised Video Representation Learning in a Heuristic Decoupled Perspective
Zeen Song
Jingyao Wang
Jianqi Zhang
Changwen Zheng
Wenwen Qiang
SSL
62
0
0
19 Jul 2024
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Alexandre Eymaël
Renaud Vandeghen
A. Cioppa
Silvio Giancola
Bernard Ghanem
Marc Van Droogenbroeck
ViT
43
6
0
26 Mar 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
36
14
0
31 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
39
5
0
21 Dec 2023
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Sun-Kyoo Hwang
Jaehong Yoon
Youngwan Lee
S. Hwang
31
6
0
19 Nov 2022
It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training
Yuxin Song
Min Yang
Wenhao Wu
Dongliang He
Fu Li
Jingdong Wang
ViT
95
8
0
11 Oct 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
Partial Video Domain Adaptation with Partial Adversarial Temporal Attentive Network
Yuecong Xu
Jianfei Yang
Haozhi Cao
Qi Li
K. Mao
Zhenghua Chen
AI4TS
38
32
0
11 Jul 2021
VidTr: Video Transformer Without Convolutions
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
136
193
0
23 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,982
0
09 Feb 2021
1