2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
Ido Amos
Jonathan Berant
Ankit Gupta
120
29
0
04 Oct 2023
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang
Ivan Tang
Ray Gu
Dong Wang
Eric Zhang
Bin Zhao
Xuelong Li
3DPC
151
22
0
04 Oct 2023
NOLA: Compressing LoRA using Linear Combination of Random Basis
Soroush Abbasi Koohpayegani
K. Navaneet
Parsa Nooralinejad
Soheil Kolouri
Hamed Pirsiavash
145
16
0
04 Oct 2023
SlowFormer: Universal Adversarial Patch for Attack on Compute and Energy Efficiency of Inference Efficient Vision Transformers
K. Navaneet
Soroush Abbasi Koohpayegani
Essam Sleiman
Hamed Pirsiavash
AAML
ViT
67
3
0
04 Oct 2023
FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling
Yan Luo
Muhammad Osama Khan
Yu Tian
Minfei Shi
Zehao Dou
T. Elze
Yi Fang
Mengyu Wang
93
10
0
03 Oct 2023
Understanding Masked Autoencoders From a Local Contrastive Perspective
Xiaoyu Yue
Lei Bai
Meng Wei
Jiangmiao Pang
Xihui Liu
Luping Zhou
Wanli Ouyang
SSL
114
4
0
03 Oct 2023
MFOS: Model-Free & One-Shot Object Pose Estimation
Jongmin Lee
Yohann Cabon
Romain Brégier
Sungjoo Yoo
Jérôme Revaud
ViT
74
6
0
03 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM
MLLM
211
229
0
03 Oct 2023
Selective Feature Adapter for Dense Vision Transformers
XueQing Deng
Qi Fan
Xiaojie Jin
Linjie Yang
Peng Wang
75
0
0
03 Oct 2023
AI-Generated Images as Data Source: The Dawn of Synthetic Era
Zuhao Yang
Fangneng Zhan
Kunhao Liu
Muyu Xu
Shijian Lu
EGVM
111
20
0
03 Oct 2023
Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation
Zhangsihao Yang
Mengwei Ren
Kaize Ding
Guido Gerig
Yalin Wang
SSL
69
5
0
02 Oct 2023
Operator Learning Meets Numerical Analysis: Improving Neural Networks through Iterative Methods
E. Zappala
Daniel Levine
Shiyang Zhang
S. Rizvi
Sacha Lévy
David van Dijk
69
1
0
02 Oct 2023
H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation
Yanjie Ze
Yuyao Liu
Ruizhe Shi
Jiaxin Qin
Zhecheng Yuan
Jiashun Wang
Huazhe Xu
120
1
0
02 Oct 2023
ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
Xinhao Li
Yuhan Zhu
Limin Wang
VLM
117
9
0
02 Oct 2023
Generating 3D Brain Tumor Regions in MRI using Vector-Quantization Generative Adversarial Networks
Meng Zhou
Matthias W. Wagner
U. Tabori
C. Hawkins
B. Ertl-Wagner
Farzad Khalvati
MedIm
117
5
0
02 Oct 2023
Self-supervised Learning for Anomaly Detection in Computational Workflows
Hongwei Jin
Krishnan Raghavan
George Papadimitriou
Cong Wang
A. Mandal
Ewa Deelman
Prasanna Balaprakash
78
1
0
02 Oct 2023
Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis
Jue Jiang
Aneesh Rangnekar
Chloe Choi
Harini Veeraraghavan
MedIm
83
0
0
02 Oct 2023
Segment Any Building
Lei Li
86
10
0
02 Oct 2023
Modularity in Deep Learning: A Survey
Haozhe Sun
Isabelle Guyon
MoMe
113
3
0
02 Oct 2023
Can Pre-trained Networks Detect Familiar Out-of-Distribution Data?
Atsuyuki Miyai
Qing Yu
Go Irie
Kiyoharu Aizawa
OODD
211
6
0
02 Oct 2023
Large Scale Masked Autoencoding for Reducing Label Requirements on SAR Data
Matt Allen
Francisco Dorr
Joseph A. Gallego-Mejia
Laura Martínez-Ferrer
Anna Jungbluth
F. Kalaitzis
Raúl Ramos-Pollán
88
9
0
02 Oct 2023
Win-Win: Training High-Resolution Vision Transformers from Two Windows
Vincent Leroy
Jérôme Revaud
Thomas Lucas
Philippe Weinzaepfel
ViT
111
2
0
01 Oct 2023
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
127
460
0
30 Sep 2023
Structural Adversarial Objectives for Self-Supervised Representation Learning
Xiao Zhang
Michael Maire
118
1
0
30 Sep 2023
Domain-Controlled Prompt Learning
Qinglong Cao
Zhengqin Xu
Yuantian Chen
Chao Ma
Xiaokang Yang
VLM
100
18
0
30 Sep 2023
Pixel-Inconsistency Modeling for Image Manipulation Localization
Chenqi Kong
Anwei Luo
Shiqi Wang
Haoliang Li
Anderson de Rezende Rocha
Alex C. Kot
AAML
88
17
0
30 Sep 2023
Region-centric Image-Language Pretraining for Open-Vocabulary Detection
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
82
4
0
29 Sep 2023
Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
Myeongseob Ko
Ming Jin
Chenguang Wang
Ruoxi Jia
104
29
0
29 Sep 2023
Towards Free Data Selection with General-Purpose Models
Alessandro Mutti
Mingyu Ding
Patrizia Semeraro
Wei Zhan
86
10
0
29 Sep 2023
Scaling Experiments in Self-Supervised Cross-Table Representation Learning
Maximilian Schambach
Dominique Paul
Wei Le
LMTD
65
2
0
29 Sep 2023
Improving Trajectory Prediction in Dynamic Multi-Agent Environment by Dropping Waypoints
Pranav Singh Chib
Pravendra Singh
85
1
0
29 Sep 2023
Information Flow in Self-Supervised Learning
Zhiyuan Tan
Jingqin Yang
Weiran Huang
Yang Yuan
Yifan Zhang
SSL
83
14
0
29 Sep 2023
MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data
Tianyu Liu
Yuge Wang
Rex Ying
Hongyu Zhao
121
15
0
29 Sep 2023
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
Hao Chen
Jindong Wang
Ankit Shah
Ran Tao
Jianguo Huang
Berfin Şimşek
Masashi Sugiyama
Bhiksha Raj
116
32
0
29 Sep 2023
CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding
Mingming Zhang
Qingjie Liu
Yunhong Wang
104
6
0
28 Sep 2023
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
104
17
0
28 Sep 2023
Visual In-Context Learning for Few-Shot Eczema Segmentation
Monitirtha Dey
S. K. Bhandari
Venugopal Vasudevan
29
2
0
28 Sep 2023
End-to-End (Instance)-Image Goal Navigation through Correspondence as an Emergent Phenomenon
G. Bono
L. Antsfeld
Boris Chidlovskii
Zhi Zheng
Christian Wolf
3DV
72
10
0
28 Sep 2023
Vision Transformers Need Registers
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
ViT
205
357
0
28 Sep 2023
FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding
Ana Ezquerro
Carlos Gómez-Rodríguez
Kevin Dela Rosa
Derek Hao Hu
45
6
0
28 Sep 2023
Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing
Lu Dai
Liqian Ma
Shenhan Qian
Hao Liu
Ziwei Liu
Hui Xiong
3DH
94
4
0
28 Sep 2023
ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens
Yangyang Guo
Haoyu Zhang
Yongkang Wong
Liqiang Nie
Mohan Kankanhalli
VLM
71
4
0
28 Sep 2023
Feature Normalization Prevents Collapse of Non-contrastive Learning Dynamics
Han Bao
SSL
MLT
106
1
0
28 Sep 2023
Masked Autoencoders are Scalable Learners of Cellular Morphology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton Earnshaw
84
15
0
27 Sep 2023
Graph-level Representation Learning with Joint-Embedding Predictive Architectures
Geri Skenderi
Hang Li
Jiliang Tang
Marco Cristani
AI4TS
GNN
159
5
0
27 Sep 2023
Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback
Teresa Yeo
Oğuzhan Fatih Kar
Zahra Sodagar
Amir Zamir
TTA
OOD
74
4
0
27 Sep 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
58
6
0
27 Sep 2023
Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation
Xin Yuan
Michael Maire
DiffM
77
2
0
27 Sep 2023
SGRec3D: Self-Supervised 3D Scene Graph Learning via Object-Level Scene Reconstruction
Sebastian Koch
Pedro Hermosilla
Narunas Vaskevicius
Mirco Colosi
Timo Ropinski
3DPC
SSL
90
16
0
27 Sep 2023
Improving Facade Parsing with Vision Transformers and Line Integration
Bowen Wang
Jiaxing Zhang
Ran Zhang
Yunqin Li
Liangzhi Li
Yuta Nakashima
ViT
31
18
0
27 Sep 2023