2111.06377
Cited By
v1
v2
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Rethinking Multi-view Representation Learning via Distilled Disentangling
Guanzhou Ke
Bo Wang
Xiaoli Wang
Shengfeng He
115
4
0
16 Mar 2024
stMCDI: Masked Conditional Diffusion Model with Graph Neural Network for Spatial Transcriptomics Data Imputation
Xiaoyu Li
Wenwen Min
Shunfang Wang
Changmiao Wang
Taosheng Xu
MedIm
60
6
0
16 Mar 2024
Affective Behaviour Analysis via Integrating Multi-Modal Knowledge
Wei Zhang
Feng Qiu
Chen Liu
Lincheng Li
Heming Du
Tiancheng Guo
Xin Yu
85
21
0
16 Mar 2024
CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
Yoonyoung Cho
Junhyek Han
Yoontae Cho
Beomjoon Kim
115
8
0
16 Mar 2024
P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors
Zhou Jiang
Zhenxin Zhu
Pengfei Li
Huan-ang Gao
Tianyuan Yuan
Yongliang Shi
Hang Zhao
Hao Zhao
94
28
0
15 Mar 2024
CoReEcho: Continuous Representation Learning for 2D+time Echocardiography Analysis
F. Maani
Numan Saeed
Aleksandr Matsun
Mohammad Yaqub
SyDa
94
4
0
15 Mar 2024
Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar
Pekka Marttinen
MedIm
VLM
92
11
0
15 Mar 2024
Autonomous Monitoring of Pharmaceutical R&D Laboratories with 6 Axis Arm Equipped Quadruped Robot and Generative AI: A Preliminary Study
Shunichi Hato
Nozomi Ogawa
64
1
0
15 Mar 2024
PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment
Ziyu Shan
Yujie Zhang
Qi Yang
Haichen Yang
Yiling Xu
Shan Liu
3DPC
75
2
0
15 Mar 2024
Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
ViT
82
10
0
15 Mar 2024
Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers
Jinxia Xie
Bineng Zhong
Zhiyi Mo
Shengping Zhang
Liangtao Shi
Shuxiang Song
Rongrong Ji
95
42
0
15 Mar 2024
Robust Light-Weight Facial Affective Behavior Recognition with CLIP
Li Lin
Sarah Papabathini
Xin Eric Wang
Shu Hu
CVBM
92
16
0
14 Mar 2024
Self-Supervised Learning for Time Series: Contrastive or Generative?
Ziyu Liu
Azadeh Alavi
Minyi Li
Xiang Zhang
AI4TS
85
7
0
14 Mar 2024
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning
Lingyi Hong
Shilin Yan
Renrui Zhang
Wanyun Li
Xinyu Zhou
...
Kaixun Jiang
Yiting Chen
Jinglun Li
Zhaoyu Chen
Wenqiang Zhang
VLM
82
51
0
14 Mar 2024
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding
Guo Chen
Yifei Huang
Jilan Xu
Baoqi Pei
Zhe Chen
Zhiqi Li
Jiahao Wang
Kunchang Li
Tong Lu
Limin Wang
Mamba
140
78
0
14 Mar 2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Brandon McKinzie
Zhe Gan
J. Fauconnier
Sam Dodge
Bowen Zhang
...
Zirui Wang
Ruoming Pang
Peter Grasch
Alexander Toshev
Yinfei Yang
MLLM
129
209
0
14 Mar 2024
uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio Mixtures
Afrina Tabassum
Dung N. Tran
Trung D. Q. Dang
Ismini Lourentzou
K. Koishida
82
0
0
14 Mar 2024
Generalizing Denoising to Non-Equilibrium Structures Improves Equivariant Force Fields
Yi-Lun Liao
Tess E. Smidt
Abhishek Das
DiffM
AI4CE
63
12
0
14 Mar 2024
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Jongsuk Kim
Hyeongkeun Lee
Kyeongha Rho
Junmo Kim
Joon Son Chung
61
6
0
14 Mar 2024
Unsupervised Modality-Transferable Video Highlight Detection with Representation Activation Sequence Learning
Tingtian Li
Zixun Sun
Xinyu Xiao
77
3
0
14 Mar 2024
ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization
Aleksandr Matsun
Numan Saeed
F. Maani
Mohammad Yaqub
OOD
432
1
0
14 Mar 2024
LocalMamba: Visual State Space Model with Windowed Selective Scan
Tao Huang
Xiaohuan Pei
Shan You
Fei Wang
Chao Qian
Chang Xu
Mamba
123
158
0
14 Mar 2024
SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival
Liangrui Pan
Yijun Peng
Yan Li
Xiang Wang
Wenjuan Liu
Liwen Xu
Qingchun Liang
Shaoliang Peng
71
4
0
14 Mar 2024
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
Yizhe Xiong
Hui Chen
Tianxiang Hao
Zijia Lin
Jungong Han
Yuesong Zhang
Guoxin Wang
Yongjun Bao
Guiguang Ding
97
18
0
14 Mar 2024
When Semantic Segmentation Meets Frequency Aliasing
Linwei Chen
Lin Gu
Ying Fu
110
6
0
14 Mar 2024
Adaptive Hybrid Masking Strategy for Privacy-Preserving Face Recognition Against Model Inversion Attack
Yinggui Wang
Yuanqing Huang
Jianshu Li
Le Yang
Kai Song
Lei Wang
AAML
PICV
95
0
0
14 Mar 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
158
7
0
14 Mar 2024
MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
Jialv Zou
Bencheng Liao
Qian Zhang
Wenyu Liu
Xinggang Wang
78
3
0
13 Mar 2024
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
Wanyun Li
Pinxue Guo
Xinyu Zhou
Lingyi Hong
Yangji He
Xiangyu Zheng
Wei Zhang
Wenqiang Zhang
VOS
104
4
0
13 Mar 2024
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts
Yue Ma
Yin-Yin He
Hongfa Wang
Andong Wang
Chenyang Qi
...
Xiu Li
Zhifeng Li
H. Shum
Wei Liu
Qifeng Chen
VGen
DiffM
163
43
0
13 Mar 2024
LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition
Zhonglin Sun
Chen Feng
Ioannis Patras
Georgios Tzimiropoulos
CVBM
SSL
68
6
0
13 Mar 2024
A Decade's Battle on Dataset Bias: Are We There Yet?
Zhuang Liu
Kaiming He
96
37
0
13 Mar 2024
CAMSIC: Content-aware Masked Image Modeling Transformer for Stereo Image Compression
Xinjie Zhang
Shenyuan Gao
Zhening Liu
Jiawei Shao
Xingtong Ge
Dailan He
Tongda Xu
Yan Wang
Jun Zhang
118
1
0
13 Mar 2024
VANP: Learning Where to See for Navigation with Self-Supervised Vision-Action Pre-Training
Mohammad Nazeri
Junzhe Wang
Amirreza Payandeh
Xuesu Xiao
SSL
ViT
101
8
0
12 Mar 2024
CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers
Shahaf Arica
Or Rubin
Sapir Gershov
S. Laufer
72
7
0
12 Mar 2024
Masked AutoDecoder is Effective Multi-Task Vision Generalist
Han Qiu
Jiaxing Huang
Peng Gao
Lewei Lu
Xiaoqin Zhang
Shijian Lu
89
4
0
12 Mar 2024
AACP: Aesthetics assessment of children's paintings based on self-supervised learning
Shiqi Jiang
Ning Li
Chen Shi
Liping Guo
Changbo Wang
Chenhui Li
61
0
0
12 Mar 2024
NightHaze: Nighttime Image Dehazing via Self-Prior Learning
Beibei Lin
Yeying Jin
Wending Yan
Wei Ye
Yuan Yuan
Robby T. Tan
84
10
0
12 Mar 2024
Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling
W. G. C. Bandara
Vishal M. Patel
VPVLM
VLM
78
1
0
11 Mar 2024
On the Generalization Ability of Unsupervised Pretraining
Yuyang Deng
Junyuan Hong
Jiayu Zhou
M. Mahdavi
SSL
94
5
0
11 Mar 2024
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
Che Liu
Zhongwei Wan
Ouyang Cheng
Anand Shah
Wenjia Bai
Rossella Arcucci
90
33
0
11 Mar 2024
PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models
Qingdong He
Jinlong Peng
Zhengkai Jiang
Xiaobin Hu
Jiangning Zhang
Qiang Nie
Yabiao Wang
Chengjie Wang
3DPC
VLM
95
5
0
11 Mar 2024
See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI
Yulong Liu
Yongqiang Ma
Guibo Zhu
Haodong Jing
Nanning Zheng
57
4
0
11 Mar 2024
Joint-Embedding Masked Autoencoder for Self-supervised Learning of Dynamic Functional Connectivity from the Human Brain
Jungwon Choi
Hyungi Lee
Byung-Hoon Kim
Juho Lee
137
1
0
11 Mar 2024
Leveraging Foundation Models for Content-Based Image Retrieval in Radiology
Stefan Denner
David Zimmerer
Dimitrios Bounias
Markus Bujotzek
Shuhan Xiao
...
Lisa Kausch
Philipp Schader
Tobias Penzkofer
Paul F. Jäger
Klaus H. Maier-Hein
MedIm
VLM
59
8
0
11 Mar 2024
Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge
Yuting Zhang
Haobo Lu
Xin Liu
Ying-Cong Chen
Kaishun Wu
82
7
0
11 Mar 2024
Can Generative Models Improve Self-Supervised Representation Learning?
Sana Ayromlou
Arash Afkanpour
Vahid Reza Khazaie
Fereshteh Forghani
94
3
0
09 Mar 2024
$\textbf{S}^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting
Zijie Pan
Yushan Jiang
Sahil Garg
Anderson Schneider
Yuriy Nevmyvaka
Dongjin Song
AI4TS
148
8
0
09 Mar 2024
Augmentations vs Algorithms: What Works in Self-Supervised Learning
Warren Morningstar
Alex Bijamov
Chris Duvarney
Luke Friedman
Neha Kalibhat
...
Philip Mansfield
Renan A. Rojas-Gomez
Karan Singhal
Bradley Green
Sushant Prakash
SSL
77
12
0
08 Mar 2024
JointMotion: Joint Self-supervision for Joint Motion Prediction
Royden Wagner
Ömer Sahin Tas
Marvin Klemp
Carlos Fernandez Lopez
TTA
100
3
0
08 Mar 2024