2111.06377
v1
v2
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
PRIOR: Prototype Representation Joint Learning from Medical Images and Reports
Pujin Cheng
Li Lin
Junyan Lyu
Yijin Huang
Wenhan Luo
Xiaoying Tang
MedIm
142
51
0
24 Jul 2023
Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model
Peng Wu
Jing Liu
Xiangteng He
Yuxin Peng
Peng Wang
Yanning Zhang
124
34
0
24 Jul 2023
HIQL: Offline Goal-Conditioned RL with Latent States as Actions
Seohong Park
Dibya Ghosh
Benjamin Eysenbach
Sergey Levine
OffRL
128
61
0
22 Jul 2023
Improving Viewpoint Robustness for Visual Recognition via Adversarial Training
Shouwei Ruan
Yinpeng Dong
Han Su
Jianteng Peng
Ning Chen
Xingxing Wei
60
7
0
21 Jul 2023
CORE: Cooperative Reconstruction for Multi-Agent Perception
Binglu Wang
Lei Zhang
Zhaozhong Wang
Yongqiang Zhao
Tianfei Zhou
115
37
0
21 Jul 2023
Attention Consistency Refined Masked Frequency Forgery Representation for Generalizing Face Forgery Detection
Decheng Liu
Tao Chen
Chunlei Peng
Nannan Wang
R. Hu
Xinbo Gao
CVBM
65
3
0
21 Jul 2023
Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition
Isack Lee
Eung-Joo Lee
S. Yoo
97
27
0
21 Jul 2023
Tuning Pre-trained Model via Moment Probing
Mingze Gao
Qilong Wang
Zhenyi Lin
Pengfei Zhu
Qinghua Hu
Jingbo Zhou
76
8
0
21 Jul 2023
AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
Ming Li
Jie Wu
Xionghui Wang
Chen Chen
Jie Qin
Xu Xiao
Rui Wang
Min Zheng
Xin Pan
ObjD
VLM
88
18
0
20 Jul 2023
PASTA: Pretrained Action-State Transformer Agents
Raphael Boige
Yannis Flet-Berliac
Arthur Flajolet
Guillaume Richard
Thomas Pierrot
LM&Ro
OffRL
122
5
0
20 Jul 2023
Sequential Multi-Dimensional Self-Supervised Learning for Clinical Time Series
Aniruddh Raghu
P. Chandak
Ridwan Alam
John Guttag
Collin M. Stultz
AI4TS
73
13
0
20 Jul 2023
Revisiting Fine-Tuning Strategies for Self-supervised Medical Imaging Analysis
Muhammad Osama Khan
Yi Fang
78
4
0
20 Jul 2023
Meta-Transformer: A Unified Framework for Multimodal Learning
Yiyuan Zhang
Kaixiong Gong
Kaipeng Zhang
Hongsheng Li
Yu Qiao
Wanli Ouyang
Xiangyu Yue
105
150
0
20 Jul 2023
Conditional expectation network for SHAP
Ronald Richman
M. Wüthrich
FAtt
BDL
57
3
0
20 Jul 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
62
11
0
20 Jul 2023
A Holistic Assessment of the Reliability of Machine Learning Systems
Anthony Corso
David Karamadian
Romeo Valentin
Mary Cooper
Mykel J. Kochenderfer
77
7
0
20 Jul 2023
Mining Conditional Part Semantics with Occluded Extrapolation for Human-Object Interaction Detection
Guangzhi Wang
Yangyang Guo
Mohan S. Kankanhalli
74
0
0
19 Jul 2023
LightPath: Lightweight and Scalable Path Representation Learning
Sean Bin Yang
Jilin Hu
Chenjuan Guo
B. Yang
Christian S. Jensen
83
29
0
19 Jul 2023
DVPT: Dynamic Visual Prompt Tuning of Large Pre-trained Models for Medical Image Analysis
Along He
Kai Wang
Zhihong Wang
Tao Li
Huazhu Fu
MedIm
100
5
0
19 Jul 2023
Self-Supervised Learning for WiFi CSI-Based Human Activity Recognition: A Systematic Study
Ke Xu
Jiangtao Wang
Erik Cambria
Dingchang Zheng
71
6
0
19 Jul 2023
CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation
Lizhao Liu
Zhuangwei Zhuang
Shan Huang
Xu Xiao
Tian-Zhu Xiang
Cen Chen
Jingdong Wang
Mingkui Tan
3DPC
95
19
0
19 Jul 2023
Towards A Unified Agent with Foundation Models
Norman Di Palo
Arunkumar Byravan
Leonard Hasenclever
Markus Wulfmeier
N. Heess
Martin Riedmiller
LM&Ro
LLMAG
OffRL
83
60
0
18 Jul 2023
MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments
Spyros Gidaris
Andrei Bursuc
Oriane Siméoni
Antonín Vobecký
N. Komodakis
Matthieu Cord
Patrick Pérez
SSL
ViT
63
3
0
18 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
148
40
0
18 Jul 2023
CSSL-RHA: Contrastive Self-Supervised Learning for Robust Handwriting Authentication
Wenwen Qiang
Luntian Mou
Changwen Zheng
Wen Gao
AAML
87
4
0
18 Jul 2023
Diffusion Models Beat GANs on Image Classification
Soumik Mukhopadhyay
M. Gwilliam
Vatsal Agarwal
Namitha Padmanabhan
A. Swaminathan
Srinidhi Hegde
Dinesh Manocha
Abhinav Shrivastava
DiffM
163
48
1
17 Jul 2023
Learning to Count without Annotations
Lukas Knobel
Tengda Han
Yuki M. Asano
SSL
88
2
0
17 Jul 2023
Deficiency-Aware Masked Transformer for Video Inpainting
Yongsheng Yu
Hengrui Fan
Libo Zhang
VGen
68
9
0
17 Jul 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
78
3
0
17 Jul 2023
SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training
Hongfei Yan
Yang Liu
Yushen Wei
Zerui Li
Guanbin Li
Liang Lin
89
43
0
17 Jul 2023
Revisiting Scene Text Recognition: A Data Perspective
Qing-Yuan Jiang
Jiapeng Wang
Dezhi Peng
Chongyu Liu
Lianwen Jin
107
41
0
17 Jul 2023
Complexity Matters: Rethinking the Latent Space for Generative Modeling
Tianyang Hu
Fei Chen
Hong Wang
Jiawei Li
Wei Cao
Jiacheng Sun
Hao Sun
DiffM
120
10
0
17 Jul 2023
Random Boxes Are Open-world Object Detectors
Yanghao Wang
Zhongqi Yue
Xiansheng Hua
Hanwang Zhang
ObjD
145
18
0
17 Jul 2023
An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration
Hiroki Naganuma
Ryuichiro Hataya
Kotaro Yoshida
Ioannis Mitliagkas
OODD
177
3
0
17 Jul 2023
Towards Viewpoint-Invariant Visual Recognition via Adversarial Training
Shouwei Ruan
Yinpeng Dong
Han Su
Jianteng Peng
Ning Chen
Xingxing Wei
OOD
77
10
0
16 Jul 2023
Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training
Yao Wei
Yanchao Sun
Ruijie Zheng
Sai H. Vemprala
Rogerio Bonatti
Shuhang Chen
Ratnesh Madaan
Zhongjie Ba
Ashish Kapoor
Shuang Ma
OffRL
73
17
0
16 Jul 2023
Semantic Contrastive Bootstrapping for Single-positive Multi-label Recognition
Cheng Chen
Yifan Zhao
Jia Li
87
5
0
15 Jul 2023
DreamTeacher: Pretraining Image Backbones with Deep Generative Models
Daiqing Li
Huan Ling
Amlan Kar
David Acuna
Seung Wook Kim
Karsten Kreis
Antonio Torralba
Sanja Fidler
VLM
DiffM
82
29
0
14 Jul 2023
Learn from Incomplete Tactile Data: Tactile Representation Learning with Masked Autoencoders
G. Cao
Jiaqi Jiang
Danushka Bollegala
Shan Luo
86
14
0
14 Jul 2023
Improving BERT with Hybrid Pooling Network and Drop Mask
Qian Chen
Wen Wang
Qinglin Zhang
Chong Deng
Ma Yukun
Siqi Zheng
48
1
0
14 Jul 2023
Masked Autoencoders for Unsupervised Anomaly Detection in Medical Images
Mariana-Iuliana Georgescu
MedIm
116
8
0
14 Jul 2023
Long Short-term Memory with Two-Compartment Spiking Neuron
Shimin Zhang
Qu Yang
Chenxiang Ma
Jibin Wu
Haizhou Li
Kay Chen Tan
70
7
0
14 Jul 2023
DenseMP: Unsupervised Dense Pre-training for Few-shot Medical Image Segmentation
Zhaoxin Fan
Puquan Pan
Zeren Zhang
C. Chen
Tianyang Wang
Si Zheng
Min Xu
VLM
87
0
0
13 Jul 2023
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Mostafa Dehghani
Basil Mustafa
Josip Djolonga
Jonathan Heek
Matthias Minderer
...
Avital Oliver
Piotr Padlewski
A. Gritsenko
Mario Lučić
N. Houlsby
ViT
193
119
0
12 Jul 2023
Multimodal Molecular Pretraining via Modality Blending
Qiying Yu
Yudi Zhang
Yuyan Ni
Shi Feng
Yanyan Lan
Hao Zhou
Jingjing Liu
72
13
0
12 Jul 2023
OG: Equip vision occupancy with instance segmentation and visual grounding
Zichao Dong
Hang Ji
Weikun Zhang
Xufeng Huang
Junbo Chen
ISeg
VLM
46
0
0
12 Jul 2023
Self-supervised adversarial masking for 3D point cloud representation learning
Michal Szachniewicz
Wojciech Kozlowski
Michal Stypulkowski
Maciej Zięba
3DPC
51
2
0
11 Jul 2023
Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering
Pengfei Li
Gang Liu
Jinlong He
Zixu Zhao
Shenjun Zhong
53
37
0
11 Jul 2023
Test-Time Training on Video Streams
Renhao Wang
Yu Sun
Yossi Gandelsman
Xinlei Chen
Alexei A. Efros
Xiaolong Wang
TTA
ViT
3DGS
164
21
0
11 Jul 2023
Substance or Style: What Does Your Image Embedding Know?
Cyrus Rashtchian
Charles Herrmann
Chun-Sung Ferng
Ayan Chakrabarti
Dilip Krishnan
Deqing Sun
Da-Cheng Juan
Andrew Tomkins
57
6
0
10 Jul 2023