2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Initialization Matters for Adversarial Transfer Learning
Andong Hua
Jindong Gu
Zhiyu Xue
Nicholas Carlini
Eric Wong
Yao Qin
AAML
101
8
0
10 Dec 2023
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need
Tianjin Huang
Tianlong Chen
Zhangyang Wang
Shiwei Liu
80
1
0
09 Dec 2023
From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
Yin Chen
Jia Li
Shiguang Shan
Meng Wang
Richang Hong
91
35
0
09 Dec 2023
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
Tongkun Guan
Wei Shen
Xuehang Yang
Xuehui Wang
Xiaokang Yang
109
7
0
08 Dec 2023
SlimSAM: 0.1% Data Makes Segment Anything Slim
Zigeng Chen
Gongfan Fang
Xinyin Ma
Xinchao Wang
103
15
0
08 Dec 2023
Cross-BERT for Point Cloud Pretraining
Xin Li
Peng Li
Zeyong Wei
Zhe Zhu
Mingqiang Wei
Junhui Hou
Liangliang Nan
J. Qin
H. Xie
F. Wang
SSL
3DPC
82
0
0
08 Dec 2023
Adapting Vision Transformer for Efficient Change Detection
Yang Zhao
Yuxiang Zhang
Yanni Dong
Bo Du
VLM
75
2
0
08 Dec 2023
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
Xiaoyun Xu
Shujian Yu
Jingzheng Wu
S. Picek
AAML
122
0
0
08 Dec 2023
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen
Mengmeng Xu
Jiawei Ren
Yuren Cong
Sen He
Yanping Xie
Animesh Sinha
Ping Luo
Tao Xiang
Juan-Manuel Perez-Rua
VGen
101
41
0
07 Dec 2023
Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping
Alex Costanzino
Pierluigi Zama Ramirez
Giuseppe Lisanti
Luigi Di Stefano
85
19
0
07 Dec 2023
Bootstrapping Autonomous Driving Radars with Self-Supervised Learning
Yiduo Hao
Sohrab Madani
Junfeng Guan
Mohammed Alloulah
Saurabh Gupta
Haitham Hassanieh
SSL
77
3
0
07 Dec 2023
DemoCaricature: Democratising Caricature Generation with a Rough Sketch
Dar-Yen Chen
A. Bhunia
Subhadeep Koley
Aneeshan Sain
Pinaki Nath Chowdhury
Yi-Zhe Song
94
8
0
07 Dec 2023
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
Zhixiang Wei
Lin Chen
Yi Jin
Xiaoxiao Ma
Tianle Liu
Pengyang Lin
Ben Wang
H. Chen
Jinjin Zheng
105
48
0
07 Dec 2023
Fine-tuning vision foundation model for crack segmentation in civil infrastructures
Kang Ge
Chen Wang
Yutao Guo
Yansong Tang
Zhenzhong Hu
Hongbing Chen
VLM
74
23
0
07 Dec 2023
Guided Reconstruction with Conditioned Diffusion Models for Unsupervised Anomaly Detection in Brain MRIs
F. Behrendt
Debayan Bhattacharya
R. Mieling
Lennart Maack
Julia Kruger
R. Opfer
Alexander Schlaefer
DiffM
MedIm
87
10
0
07 Dec 2023
Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient Semantic Segmentation
Jiawei Fan
Chao Li
Xiaolong Liu
Meina Song
Anbang Yao
64
6
0
07 Dec 2023
An Improved Masking Strategy for Self-supervised Masked Reconstruction in Human Activity Recognition
Jinqiang Wang
Tao Zhu
Huansheng Ning
65
2
0
07 Dec 2023
An unsupervised approach towards promptable defect segmentation in laser-based additive manufacturing by Segment Anything
Israt Zarin Era
Imtiaz Ahmed
Zhichao Liu
Srinjoy Das
80
2
0
07 Dec 2023
LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures
Vimal Thilak
Chen Huang
Omid Saremi
Laurent Dinh
Hanlin Goh
Preetum Nakkiran
Josh Susskind
Etai Littwin
112
10
0
07 Dec 2023
Transformer-Powered Surrogates Close the ICF Simulation-Experiment Gap with Extremely Limited Data
M. Olson
Shusen Liu
Jayaraman J. Thiagarajan
B. Kustowski
Weng-Keen Wong
Rushil Anirudh
AI4CE
95
1
0
06 Dec 2023
Low-shot Object Learning with Mutual Exclusivity Bias
Anh Thai
Ahmad Humayun
Stefan Stojanov
Zixuan Huang
Bikram Boote
James M. Rehg
102
3
0
06 Dec 2023
Improving the Generalization of Segmentation Foundation Model under Distribution Shift via Weakly Supervised Adaptation
Haojie Zhang
Yongyi Su
Xun Xu
Kui Jia
OOD
VLM
86
25
0
06 Dec 2023
Benchmarking Continual Learning from Cognitive Perspectives
Xiaoqian Liu
Junge Zhang
Mingyi Zhang
Peipei Yang
88
1
0
06 Dec 2023
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
Yanlong Li
Chamara Madarasingha
Kanchana Thilakarathna
56
2
0
06 Dec 2023
Deep Multimodal Fusion for Surgical Feedback Classification
Rafal Kocielnik
Elyssa Y. Wong
Timothy N. Chu
Lydia Lin
De-An Huang
Jiayun Wang
A. Anandkumar
Andrew J. Hung
62
2
0
06 Dec 2023
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Felix Wimbauer
Bichen Wu
Edgar Schoenfeld
Xiaoliang Dai
Ji Hou
...
Jonas Kohler
Christian Rupprecht
Zorah Lähner
Peter Vajda
Jialiang Wang
DiffM
105
78
0
06 Dec 2023
Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
Shijie Zhou
Haoran Chang
Sicheng Jiang
Zhiwen Fan
Zehao Zhu
Dejia Xu
Pradyumna Chari
Suya You
Zhangyang Wang
A. Kadambi
3DGS
142
183
0
06 Dec 2023
Multitask Learning Can Improve Worst-Group Outcomes
Atharva Kulkarni
Lucio Dery
Amrith Rajagopal Setlur
Aditi Raghunathan
Ameet Talwalkar
Graham Neubig
96
2
0
05 Dec 2023
MoSA: Mixture of Sparse Adapters for Visual Efficient Tuning
Qizhe Zhang
Bocheng Zou
Ruichuan An
Jiaming Liu
Shanghang Zhang
MoE
105
3
0
05 Dec 2023
Are Vision Transformers More Data Hungry Than Newborn Visual Systems?
Lalit Pandey
Samantha M. W. Wood
Justin N. Wood
74
12
0
05 Dec 2023
SEVA: Leveraging sketches to evaluate alignment between human and machine visual abstraction
Kushin Mukherjee
Holly Huey
Xuanchen Lu
Yael Vinker
Rio Aguina-Kang
Ariel Shamir
Judith E. Fan
159
13
0
05 Dec 2023
GeNIe: Generative Hard Negative Images Through Diffusion
Soroush Abbasi Koohpayegani
Anuj Singh
K. Navaneet
Hadi Jamali Rad
Hamed Pirsiavash
VLM
DiffM
131
4
0
05 Dec 2023
Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey
Shengchao Chen
Guodong Long
Jing Jiang
Dikai Liu
Chengqi Zhang
SyDa
AI4CE
129
25
0
05 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
132
4
0
05 Dec 2023
Towards General Purpose Vision Foundation Models for Medical Image Analysis: An Experimental Study of DINOv2 on Radiology Benchmarks
Mohammed Baharoon
Waseem Qureshi
J. Ouyang
Yanwu Xu
Abdulrhman Aljouie
Wei Peng
MedIm
AI4CE
97
7
0
04 Dec 2023
Rejuvenating image-GPT as Strong Visual Representation Learners
Sucheng Ren
Zeyu Wang
Hongru Zhu
Junfei Xiao
Alan Yuille
Cihang Xie
VLM
123
8
0
04 Dec 2023
Object Recognition as Next Token Prediction
Kaiyu Yue
Borchun Chen
Jonas Geiping
Hengduo Li
Tom Goldstein
Ser-Nam Lim
97
9
0
04 Dec 2023
Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models
Xingyuan Zhang
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
80
6
0
04 Dec 2023
A Generative Self-Supervised Framework using Functional Connectivity in fMRI Data
Jungwon Choi
Seongho Keum
Eunggu Yun
Byung-Hoon Kim
Juho Lee
77
6
0
04 Dec 2023
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao
Zhan Tong
Kevin Qinghong Lin
Joya Chen
Mike Zheng Shou
57
0
0
04 Dec 2023
UniGS: Unified Representation for Image Generation and Segmentation
Lu Qi
Lehan Yang
Weidong Guo
Yu-Syuan Xu
Bo Du
Varun Jampani
Ming-Hsuan Yang
98
19
0
04 Dec 2023
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
Jiarui Xu
Yossi Gandelsman
Amir Bar
Jianwei Yang
Jianfeng Gao
Trevor Darrell
Xiaolong Wang
VLM
58
3
0
04 Dec 2023
Hulk: A Universal Knowledge Translator for Human-Centric Tasks
Yizhou Wang
YiXuan Wu
Shixiang Tang
Weizhen He
Xun Guo
...
Lei Bai
Rui Zhao
Jian Wu
Tong He
Wanli Ouyang
VLM
206
14
0
04 Dec 2023
Multi-task Image Restoration Guided By Robust DINO Features
Xin Lin
Chao Ren
Kelvin C. K. Chan
Lu Qi
Jinshan Pan
Ming-Hsuan Yang
109
5
0
04 Dec 2023
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
Feng Wang
Jieru Mei
Alan Yuille
VLM
146
66
0
04 Dec 2023
SANeRF-HQ: Segment Anything for NeRF in High Quality
Yichen Liu
Benran Hu
Chi-Keung Tang
Yu-Wing Tai
97
13
0
03 Dec 2023
G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training
Che Liu
Ouyang Cheng
Sibo Cheng
Anand Shah
Wenjia Bai
Rossella Arcucci
VLM
MedIm
105
10
0
03 Dec 2023
Brain Decodes Deep Nets
Huzheng Yang
James C. Gee
Jianbo Shi
67
8
0
03 Dec 2023
ESTformer: Transformer Utilizing Spatiotemporal Dependencies for Electroencephalogram Super-resolution
Dongdong Li
Zhongliang Zeng
Zhe Wang
Hai Yang
118
1
0
03 Dec 2023
Disentangling the Effects of Data Augmentation and Format Transform in Self-Supervised Learning of Image Representations
Neha Kalibhat
Warren Morningstar
Alex Bijamov
Luyang Liu
Karan Singhal
Philip Mansfield
55
2
0
02 Dec 2023