2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Self-supervised Learning for Electroencephalogram: A Systematic Survey
Weining Weng
Yang Gu
Shuai Guo
Yuan Ma
Zhaohua Yang
Yuchen Liu
Yiqiang Chen
87
12
0
09 Jan 2024
Masked AutoEncoder for Graph Clustering without Pre-defined Cluster Number k
Yuanchi Ma
Hui He
Zhongxiang Lei
ZhenDong Niu
78
1
0
09 Jan 2024
MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation
Long Xu
Shanghong Li
Yongquan Chen
Jun Luo
Shiwu Lai
66
0
0
09 Jan 2024
Connect Later: Improving Fine-tuning for Robustness with Targeted Augmentations
Helen Qu
Sang Michael Xie
89
5
0
08 Jan 2024
Dr²Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao
Shuming Liu
K. Mangalam
Guocheng Qian
Fatimah Zohra
Abdulmohsen Alghannam
Jitendra Malik
Guohao Li
84
3
0
08 Jan 2024
RudolfV: A Foundation Model by Pathologists for Pathologists
Jonas Dippel
Barbara Feulner
Tobias Winterhoff
Timo Milbich
Stephan Tietz
...
David Horst
Lukas Ruff
Klaus-Robert Muller
Frederick Klauschen
Maximilian Alber
122
32
0
08 Jan 2024
Fully Attentional Networks with Self-emerging Token Labeling
Bingyin Zhao
Zhiding Yu
Shiyi Lan
Yutao Cheng
A. Anandkumar
Yingjie Lao
Jose M. Alvarez
1.0K
6
0
08 Jan 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
Wenxi Chen
Yuzhe Liang
Ziyang Ma
Zhisheng Zheng
Xie Chen
ViT
107
22
0
07 Jan 2024
Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions
Yichi Zhang
Zhenrong Shen
Rushi Jiao
VLM
MedIm
110
144
0
07 Jan 2024
Deep Learning-based Image and Video Inpainting: A Survey
Weize Quan
Jiaxi Chen
Yanli Liu
Dong-Ming Yan
Peter Wonka
3DV
81
40
0
07 Jan 2024
PIXAR: Auto-Regressive Language Modeling in Pixel Space
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
MLLM
74
9
0
06 Jan 2024
Explicit Visual Prompts for Visual Object Tracking
Liangtao Shi
Bineng Zhong
Qihua Liang
Ning Li
Shengping Zhang
Xianxian Li
77
31
0
06 Jan 2024
AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis
Kebin Wu
Wenbin Li
Xiaofei Xiao
37
4
0
05 Jan 2024
Denoising Vision Transformers
Jiawei Yang
Katie Z Luo
Jie Li
Kilian Q. Weinberger
Yonglong Tian
Yue Wang
DiffM
54
15
0
05 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
118
51
0
05 Jan 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Alan Yuille
Cihang Xie
ViT
MDE
83
4
0
05 Jan 2024
CrisisViT: A Robust Vision Transformer for Crisis Image Classification
Zijun Long
R. McCreadie
Muhammad Imran
151
10
0
05 Jan 2024
GTA: Guided Transfer of Spatial Attention from Object-Centric Representations
SeokHyun Seo
Jinwoo Hong
Jungwoo Chae
Kyungyul Kim
Sangheum Hwang
77
0
0
05 Jan 2024
Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment
Yongxu Liu
Yinghui Quan
Guoyao Xiao
Aobo Li
Jinjian Wu
85
11
0
05 Jan 2024
Towards Weakly Supervised Text-to-Audio Grounding
Xuenan Xu
Ziyang Ma
Mengyue Wu
Kai Yu
AI4TS
83
9
0
05 Jan 2024
Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing
Hugo Chan-To-Hing
B. Veeravalli
105
9
0
05 Jan 2024
ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation
Xinyang Pu
He Jia
Linghao Zheng
Feng Wang
Feng Xu
56
0
0
04 Jan 2024
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
Yiran Song
Qianyu Zhou
Hefei Ling
Deng-Ping Fan
Xuequan Lu
Lizhuang Ma
VLM
143
15
0
04 Jan 2024
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
Ziping Ma
Furong Xu
Jian Liu
Ming Yang
Qingpei Guo
VLM
79
3
0
04 Jan 2024
Data-Centric Foundation Models in Computational Healthcare: A Survey
Yunkun Zhang
Jin Gao
Zheling Tan
Lingfeng Zhou
Kexin Ding
Mu Zhou
Shaoting Zhang
Dequan Wang
AI4CE
113
25
0
04 Jan 2024
Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket
Zhaokun Zhou
Kaiwei Che
Wei Fang
Keyu Tian
Yuesheng Zhu
Shuicheng Yan
Yonghong Tian
Liuliang Yuan
ViT
122
33
0
04 Jan 2024
SwitchTab: Switched Autoencoders Are Effective Tabular Learners
Jing Wu
Suiyao Chen
Qi Zhao
Renat Sergazinov
Chen Li
...
Tianpei Xie
Hanqing Guo
Cheng Ji
Daniel Cociorva
Hakan Brunzel
SSL
109
47
0
04 Jan 2024
GPS-SSL: Guided Positive Sampling to Inject Prior Into Self-Supervised Learning
Aarash Feizi
Randall Balestriero
Adriana Romero Soriano
Reihaneh Rabbany
97
2
0
03 Jan 2024
Few-shot Adaptation of Multi-modal Foundation Models: A Survey
Fan Liu
Tianshu Zhang
Wenwen Dai
Wenwen Cai
Xiaocong Zhou
Delong Chen
VLM
OffRL
82
30
0
03 Jan 2024
ODTrack: Online Dense Temporal Token Learning for Visual Tracking
Yaozong Zheng
Bineng Zhong
Qihua Liang
Zhiyi Mo
Shengping Zhang
Xianxian Li
VOT
76
61
0
03 Jan 2024
Enhancing Representation in Medical Vision-Language Foundation Models via Multi-Scale Information Extraction Techniques
Weijian Huang
Cheng Li
Hong-Yu Zhou
Jiarun Liu
Hao Yang
Yong Liang
Guangming Shi
Hairong Zheng
Shanshan Wang
74
2
0
03 Jan 2024
LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training
Rujiao Long
Hangdi Xing
Zhibo Yang
Qi Zheng
Zhi Yu
Cong Yao
Fei Huang
66
5
0
03 Jan 2024
SwapTransformer: highway overtaking tactical planner model via imitation learning on OSHA dataset
Alireza Shamsoshoara
Safin B Salih
Pedram Aghazadeh
ViT
OffRL
77
3
0
02 Jan 2024
CityPulse: Fine-Grained Assessment of Urban Change with Street View Time Series
Tianyuan Huang
Zejia Wu
Jiajun Wu
Jackelyn Hwang
Ram Rajagopal
AI4TS
47
4
0
02 Jan 2024
Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence
Ruizhuo Xu
Linzhi Huang
Mei Wang
Jiani Hu
Weihong Deng
ViT
MedIm
99
3
0
01 Jan 2024
PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning
Xuntao Liu
Yuzhou Yang
Qichao Ying
Zhenxing Qian
Xinpeng Zhang
Sheng Li
VLM
68
4
0
01 Jan 2024
A Generalist FaceX via Learning Unified Facial Representation
Yue Han
Jiangning Zhang
Junwei Zhu
Xiangtai Li
Yanhao Ge
Wei Li
Chengjie Wang
Yong Liu
Xiaoming Liu
Ying Tai
DiffM
104
13
0
31 Dec 2023
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
118
15
0
31 Dec 2023
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
114
3
0
31 Dec 2023
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
184
11
0
31 Dec 2023
SVFAP: Self-supervised Video Facial Affect Perceiver
Guoying Zhao
Zheng Lian
Kexin Wang
Yu He
Ming Xu
Haiyang Sun
Bin Liu
Jianhua Tao
108
14
0
31 Dec 2023
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
Haiyang Liu
Zihao Zhu
Giorgio Becherini
Yichen Peng
Mingyang Su
You Zhou
Xuefei Zhe
Naoya Iwamoto
Bo Zheng
Michael J. Black
SLR
176
36
0
31 Dec 2023
SSL-OTA: Unveiling Backdoor Threats in Self-Supervised Learning for Object Detection
Qiannan Wang
Changchun Yin
Lu Zhou
Liming Fang
65
1
0
30 Dec 2023
Morphing Tokens Draw Strong Masked Image Models
Taekyung Kim
Byeongho Heo
Dongyoon Han
194
3
0
30 Dec 2023
Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization
Ioanna Ntinou
Enrique Sanchez
Georgios Tzimiropoulos
100
5
0
29 Dec 2023
Visual Point Cloud Forecasting enables Scalable Autonomous Driving
Zetong Yang
Li Chen
Yanan Sun
Hongyang Li
3DPC
129
51
0
29 Dec 2023
HEAP: Unsupervised Object Discovery and Localization with Contrastive Grouping
Xin Zhang
Jinheng Xie
Yuan Yuan
Michael Bi Mi
Robby T. Tan
VOS
OCL
VLM
143
4
0
29 Dec 2023
Any-point Trajectory Modeling for Policy Learning
Chuan Wen
Xingyu Lin
John So
Kai-xiang Chen
Qi Dou
Yang Gao
Pieter Abbeel
PINN
VGen
142
99
0
28 Dec 2023
Learning Vision from Models Rivals Learning Vision from Data
Yonglong Tian
Lijie Fan
Kaifeng Chen
Dina Katabi
Dilip Krishnan
Phillip Isola
111
51
0
28 Dec 2023
Unsupervised Universal Image Segmentation
Dantong Niu
Xudong Wang
Xinyang Han
Long Lian
Roei Herzig
Trevor Darrell
VLM
94
20
0
28 Dec 2023