2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
ActiveDC: Distribution Calibration for Active Finetuning
Wenshuai Xu
Zhenhui Hu
Yu Lu
Jinzhou Meng
Qingjie Liu
Yunhong Wang
111
1
0
13 Nov 2023
MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model
Shuwei Shao
Zhongcai Pei
Weihai Chen
Dingchi Sun
Peter C. Y. Chen
Zhengguo Li
MDE
DiffM
77
8
0
13 Nov 2023
SpectralGPT: Spectral Remote Sensing Foundation Model
Danfeng Hong
Bing Zhang
Xuyang Li
Yuxuan Li
Chenyu Li
...
Xiuping Jia
Antonio J. Plaza
Paolo Gamba
J. Benediktsson
J. Chanussot
123
433
0
13 Nov 2023
Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval
Junyang Chen
Hanjiang Lai
VLM
140
15
0
13 Nov 2023
Concept-wise Fine-tuning Matters in Preventing Negative Transfer
Yunqiao Yang
Long-Kai Huang
Ying Wei
75
2
0
12 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
141
175
0
10 Nov 2023
Learning Human Action Recognition Representations Without Real Humans
Howard Zhong
Samarth Mishra
Donghyun Kim
SouYoung Jin
Yikang Shen
Hildegard Kuehne
Leonid Karlinsky
Venkatesh Saligrama
Aude Oliva
Rogerio Feris
99
4
0
10 Nov 2023
Layer-wise Auto-Weighting for Non-Stationary Test-Time Adaptation
Junyoung Park
Jin-Hwa Kim
Hyeongjun Kwon
Ilhoon Yoon
Kwanghoon Sohn
128
9
0
10 Nov 2023
Scale-MIA: A Scalable Model Inversion Attack against Secure Federated Learning via Latent Space Reconstruction
Shanghao Shi
Ning Wang
Yang Xiao
Chaoyu Zhang
Yi Shi
Y. T. Hou
W. Lou
75
8
0
10 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
94
11
0
09 Nov 2023
Real-Time Neural Rasterization for Large Scenes
Jeffrey Yunfan Liu
Yun Chen
Ze Yang
Jingkang Wang
S. Manivasagam
R. Urtasun
AI4TS
AI4CE
106
35
0
09 Nov 2023
Self-similarity Prior Distillation for Unsupervised Remote Physiological Measurement
Xinyu Zhang
Weiyu Sun
Hao Lu
Ying-Cong Chen
Yun Ge
Xiaolin Huang
Jie Yuan
Yingcong Chen
55
2
0
09 Nov 2023
Self-Supervised Learning for Visual Relationship Detection through Masked Bounding Box Reconstruction
Zacharias Anastasakis
Dimitrios Mallis
Markos Diomataris
George Alexandridis
Stefanos D. Kollias
Vassilis Pitsikalis
57
2
0
08 Nov 2023
FetMRQC: an open-source machine learning framework for multi-centric fetal brain MRI quality control
Thomas Sanchez
Oscar Esteban
Yvan Gomez
A. Pron
Mériam Koob
...
Nadine Girard
Andras Jakab
E. Eixarch
G. Auzias
Meritxell Bach Cuadra
53
4
0
08 Nov 2023
Learning Discriminative Features for Crowd Counting
Yuehai Chen
Qingzhong Wang
Jing Yang
Badong Chen
Haoyi Xiong
Shaoyi Du
89
8
0
08 Nov 2023
PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds
Hao Yang
Haiyang Wang
Di Dai
Liwei Wang
3DPC
76
5
0
08 Nov 2023
PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders
Hezhen Hu
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Lu Yuan
Dong Chen
Houqiang Li
103
6
0
08 Nov 2023
SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification
Junyan Lin
Feng Gao
Xiaochen Shi
Junyu Dong
Q. Du
94
52
0
08 Nov 2023
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features
Chenfeng Xu
Huan Ling
Sanja Fidler
Or Litany
109
15
0
07 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
97
67
0
07 Nov 2023
GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks
Zhonghang Li
Lianghao Xia
Yong-mei Xu
Chao Huang
AI4TS
AI4CE
128
28
0
07 Nov 2023
Instruct Me More! Random Prompting for Visual In-Context Learning
Jiahao Zhang
Bowen Wang
Liangzhi Li
Yuta Nakashima
Hajime Nagahara
VLM
77
19
0
07 Nov 2023
Random Field Augmentations for Self-Supervised Representation Learning
Philip Mansfield
Arash Afkanpour
Warren Morningstar
Karan Singhal
OOD
105
2
0
07 Nov 2023
Exploitation-Guided Exploration for Semantic Embodied Navigation
Justin Wasserman
Girish Chowdhary
Abhinav Gupta
Unnat Jain
95
4
0
06 Nov 2023
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
Kun Lei
Zhengmao He
Chenhao Lu
Kaizhe Hu
Yang Gao
Huazhe Xu
OffRL
OnRL
134
13
0
06 Nov 2023
FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data
Lisa Weijler
Florian Kowarsch
Michael Reiter
Pedro Hermosilla
Margarita Maurer-Granofszky
Michael N. Dworzak
MedIm
50
3
0
06 Nov 2023
A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation
Qi-jun Zhao
Ce Zheng
Mengyuan Liu
Chong Chen
74
14
0
06 Nov 2023
Asymmetric Masked Distillation for Pre-Training Small Foundation Models
Zhiyu Zhao
Bingkun Huang
Sen Xing
Gangshan Wu
Yu Qiao
Limin Wang
89
5
0
06 Nov 2023
Pelvic floor MRI segmentation based on semi-supervised deep learning
Jianwei Zuo
Fei Feng
Zhuhui Wang
J. A. Ashton-Miller
J. Delancey
Jiajia Luo
66
0
0
06 Nov 2023
The Pursuit of Human Labeling: A New Perspective on Unsupervised Learning
Artyom Gadetsky
Maria Brbić
72
7
0
06 Nov 2023
CycleCL: Self-supervised Learning for Periodic Videos
Matteo Destro
Michael Gygli
SSL
102
2
0
05 Nov 2023
Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models
Jingru Yi
Burak Uzkent
Oana Ignat
Zili Li
Amanmeet Garg
Xiang Yu
Linda Liu
VLM
85
1
0
05 Nov 2023
What Makes Pre-Trained Visual Representations Successful for Robust Manipulation?
Kaylee Burns
Zach Witzel
Jubayer Ibn Hamid
Tianhe Yu
Chelsea Finn
Karol Hausman
OOD
SSL
95
25
0
03 Nov 2023
ProS: Facial Omni-Representation Learning via Prototype-based Self-Distillation
Xing Di
Yiyu Zheng
Xiaoming Liu
Yu Cheng
91
3
0
03 Nov 2023
Holistic Representation Learning for Multitask Trajectory Anomaly Detection
Alexandros Stergiou
B. D. Weerdt
Nikos Deligiannis
106
13
0
03 Nov 2023
FLAP: Fast Language-Audio Pre-training
Ching-Feng Yeh
Po-Yao Huang
Vasu Sharma
Shang-Wen Li
Gargi Ghosh
CLIP
VLM
74
9
0
02 Nov 2023
UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation
Yuwen Xiong
Wei-Chiu Ma
Jingkang Wang
R. Urtasun
82
47
0
02 Nov 2023
VIGraph: Generative Self-supervised Learning for Class-Imbalanced Node Classification
Yulan Hu
Ouyang Sheng
Zhirui Yang
Yong Liu
88
0
0
02 Nov 2023
Dynamic Multimodal Information Bottleneck for Multimodality Classification
Yingying Fang
Shuang Wu
Sheng Zhang
Chao Huang
Tieyong Zeng
Xiaodan Xing
Simon Walsh
Guang Yang
74
9
0
02 Nov 2023
Sam-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning
Zhenyu Zhang
Benlu Wang
Weijie Liang
Yizhi Li
Xuechen Guo
Guanhong Wang
Shiyan Li
Gaoang Wang
MedIm
LM&MA
34
9
0
02 Nov 2023
Concatenated Masked Autoencoders as Spatial-Temporal Learner
Zhouqiang Jiang
Bowen Wang
Tong Xiang
Zhaofeng Niu
Hong Tang
Guangshun Li
Liangzhi Li
55
2
0
02 Nov 2023
The Power of the Senses: Generalizable Manipulation from Vision and Touch through Masked Multimodal Learning
Carmelo Sferrazza
Younggyo Seo
Hao Liu
Youngwoon Lee
Pieter Abbeel
138
21
0
02 Nov 2023
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
105
11
0
01 Nov 2023
CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders
A. Fuller
K. Millard
James R. Green
103
72
0
01 Nov 2023
Text Rendering Strategies for Pixel Language Models
Jonas F. Lotz
Elizabeth Salesky
Phillip Rust
Desmond Elliott
VLM
87
12
0
01 Nov 2023
REBAR: Retrieval-Based Reconstruction for Time-series Contrastive Learning
Maxwell A. Xu
Alexander Moreno
Hui Wei
Benjamin M. Marlin
James M. Rehg
AI4TS
SSL
105
13
0
01 Nov 2023
fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding
Xuelin Qian
Yun Wang
Jingyang Huo
Jianfeng Feng
Yanwei Fu
MedIm
54
8
0
01 Nov 2023
OpenForest: A data catalogue for machine learning in forest monitoring
Arthur Ouaknine
T. Kattenborn
Etienne Laliberté
David Rolnick
173
6
0
01 Nov 2023
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders
Srijan Das
Tanmay Jain
Dominick Reilly
P. Balaji
Soumyajit Karmakar
Shyam Marjit
Xiang Li
Abhijit Das
Michael S. Ryoo
116
16
0
31 Oct 2023
HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
Junkun Yuan
Xinyu Zhang
Hao Zhou
Jian Wang
Zhongwei Qiu
...
Junyu Han
Errui Ding
Lanfen Lin
Leilei Gan
Jingdong Wang
77
19
0
31 Oct 2023