2111.06377
Cited By
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL
EgoV
111
4
0
10 Jul 2023
VampNet: Music Generation via Masked Acoustic Token Modeling
Hugo Flores Garcia
Prem Seetharaman
Rithesh Kumar
Bryan Pardo
MGen
93
68
0
10 Jul 2023
MiVOLO: Multi-input Transformer for Age and Gender Estimation
Maksim Kuprashevich
Irina Tolstykh
98
38
0
10 Jul 2023
Distill-SODA: Distilling Self-Supervised Vision Transformer for Source-Free Open-Set Domain Adaptation in Computational Pathology
Guillaume Vray
Devavrat Tomar
Jean-Philippe Thiran
Behzad Bozorgtabar
MedIm
76
1
0
10 Jul 2023
Mx2M: Masked Cross-Modality Modeling in Domain Adaptation for 3D Semantic Segmentation
Boxiang Zhang
Zunran Wang
Yonggen Ling
Yuanyuan Guan
Shenghao Zhang
Wenhui Li
80
6
0
09 Jul 2023
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
Zhiyu Zhu
Xianqiang Lyu
Dapeng Wu
ViT
87
33
0
09 Jul 2023
SVIT: Scaling up Visual Instruction Tuning
Bo Zhao
Boya Wu
Muyang He
Tiejun Huang
MLLM
108
128
0
09 Jul 2023
Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation
Aditya Sanghi
P. Jayaraman
Arianna Rampini
Joseph Lambourne
Hooman Shayani
Evan Atherton
Saeid Asgari Taghanaki
3DV
97
15
0
08 Jul 2023
SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks
Xingyu Lin
John So
Sashwat Mahalingam
Fangchen Liu
Pieter Abbeel
SSL
92
26
0
07 Jul 2023
Language-free Compositional Action Generation via Decoupling Refinement
Xiao Liu
Guangyi Chen
Yansong Tang
Guangrun Wang
Xiao-Ping Zhang
Ser-Nam Lim
CoGe
73
1
0
07 Jul 2023
Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation
Dahyun Kang
Piotr Koniusz
Minsu Cho
Naila Murray
VLM
ViT
85
26
0
07 Jul 2023
Goal-Conditioned Predictive Coding for Offline Reinforcement Learning
Zilai Zeng
Ce Zhang
Shijie Wang
Chen Sun
OffRL
82
6
0
07 Jul 2023
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery
Yun-Qiu Lv
Jing Zhang
Nick Barnes
Yuchao Dai
89
11
0
07 Jul 2023
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
218
17
0
07 Jul 2023
VideoGLUE: Video General Understanding Evaluation of Foundation Models
Liangzhe Yuan
N. B. Gundavarapu
Long Zhao
Hao Zhou
Huayu Chen
...
Florian Schroff
Hartwig Adam
Ming-Hsuan Yang
Ting Liu
Boqing Gong
ELM
85
10
0
06 Jul 2023
Knowledge Graph Self-Supervised Rationalization for Recommendation
Yuhao Yang
Chao Huang
Lianghao Xia
Chunzhen Huang
104
100
0
06 Jul 2023
AxonCallosumEM Dataset: Axon Semantic Segmentation of Whole Corpus Callosum cross section from EM Images
Ao Cheng
Guoqiang Zhao
Lirong Wang
Ruobing Zhang
54
3
0
05 Jul 2023
MAE-DFER: Efficient Masked Autoencoder for Self-supervised Dynamic Facial Expression Recognition
Guoying Zhao
Zheng Lian
B. Liu
Jianhua Tao
108
41
0
05 Jul 2023
Prompting Diffusion Representations for Cross-Domain Semantic Segmentation
R. Gong
Martin Danelljan
Hanqi Sun
Julio Mangas
Luc Van Gool
DiffM
VLM
99
24
0
05 Jul 2023
Make A Long Image Short: Adaptive Token Length for Vision Transformers
Yuqin Zhu
Yichen Zhu
ViT
125
17
0
05 Jul 2023
Distilling Missing Modality Knowledge from Ultrasound for Endometriosis Diagnosis with Magnetic Resonance Images
Yuan Zhang
Hu Wang
David Butler
Minh-Son To
Jodie Avery
M. L. Hull
Gustavo Carneiro
40
6
0
05 Jul 2023
ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling
Tung Nguyen
Jason Jewik
Hritik Bansal
Prakhar Sharma
Aditya Grover
AI4Cl
AI4CE
80
33
0
04 Jul 2023
Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
Xiang Li
Varun Belagali
Jinghuan Shang
Michael S. Ryoo
107
33
0
04 Jul 2023
In-Domain Self-Supervised Learning Improves Remote Sensing Image Scene Classification
I. Dimitrovski
Ivan Kitanovski
Nikola Simidjievski
D. Kocev
SSL
61
4
0
04 Jul 2023
IAdet: Simplest human-in-the-loop object detection
Franco Marchesoni-Acland
Gabriele Facciolo
VLM
107
1
0
04 Jul 2023
SelfFed: Self-Supervised Federated Learning for Data Heterogeneity and Label Scarcity in Medical Images
Sunder Ali Khowaja
Kapal Dev
Syed Muhammad Anwar
M. Linguraru
FedML
42
3
0
04 Jul 2023
SAM-DA: UAV Tracks Anything at Night with SAM-Powered Domain Adaptation
Changhong Fu
L. Yao
Haobo Zuo
Guang-Zheng Zheng
Jia Pan
134
17
0
03 Jul 2023
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation
Yonglin Li
Jing Zhang
Xiao Teng
Long Lan
VOS
VLM
94
18
0
03 Jul 2023
Review of Large Vision Models and Visual Prompt Engineering
Jiaqi Wang
Zheng Liu
Lin Zhao
Zihao Wu
Chong Ma
...
Bao Ge
Yixuan Yuan
Dinggang Shen
Tianming Liu
Shu Zhang
VLM
LRM
157
163
0
03 Jul 2023
Hierarchical Open-vocabulary Universal Image Segmentation
Xudong Wang
Shufang Li
Konstantinos Kallidromitis
Yu Kato
Kazuki Kozuka
Trevor Darrell
VLM
OCL
126
41
0
03 Jul 2023
SSC3OD: Sparsely Supervised Collaborative 3D Object Detection from LiDAR Point Clouds
Yushan Han
Hui Zhang
Honglei Zhang
Yidong Li
3DPC
114
2
0
03 Jul 2023
Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain Generalization
Yumeng Li
Dan Zhang
Margret Keuper
Anna Khoreva
114
11
0
02 Jul 2023
Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler
Shaohui Lin
Wenxuan Huang
Jiao Xie
Baochang Zhang
Yunhang Shen
Zhou Yu
Jungong Han
David Doermann
60
2
0
01 Jul 2023
Stitched ViTs are Flexible Vision Backbones
Zizheng Pan
Jing Liu
Haoyu He
Jianfei Cai
Bohan Zhuang
53
3
0
30 Jun 2023
Hierarchical Neural Coding for Controllable CAD Model Generation
Xiang Xu
P. Jayaraman
Joseph G. Lambourne
Karl D. D. Willis
Yasutaka Furukawa
102
43
0
30 Jun 2023
HYDRA: Hybrid Robot Actions for Imitation Learning
Suneel Belkhale
Yuchen Cui
Dorsa Sadigh
112
41
0
29 Jun 2023
End-to-end Autonomous Driving: Challenges and Frontiers
Li Chen
Peng Wu
Kashyap Chitta
Bernhard Jaeger
Andreas Geiger
Hongyang Li
3DV
201
319
0
29 Jun 2023
DreamDiffusion: Generating High-Quality Images from Brain EEG Signals
Yun-Hao Bai
Xintao Wang
Yanpei Cao
Yixiao Ge
Chun Yuan
Ying Shan
DiffM
89
57
0
29 Jun 2023
MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset
Guotai Wang
Jianghao Wu
Xiangde Luo
Xinglong Liu
Kang Li
Shaoting Zhang
77
28
0
29 Jun 2023
MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling
Zhenyu Zhang
Wenhao Chai
Zhongyu Jiang
Tianbo Ye
Xiuming Zhang
Lei Li
Gaoang Wang
3DH
66
5
0
29 Jun 2023
Unified Language Representation for Question Answering over Text, Tables, and Images
Yu Bowen
Cheng Fu
Haiyang Yu
Fei Huang
Yongbin Li
LMTD
85
23
0
29 Jun 2023
Prompt Ensemble Self-training for Open-Vocabulary Domain Adaptation
Jiaxing Huang
Jingyi Zhang
Han Qiu
Sheng Jin
Shijian Lu
VPVLM
VLM
104
0
0
29 Jun 2023
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models
Avinash Madasu
Vasudev Lal
CoGe
104
3
0
28 Jun 2023
RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model
Keyan Chen
Chenyang Liu
Hao Chen
Haotian Zhang
Wenyuan Li
Zhengxia Zou
Z. Shi
VLM
125
221
0
28 Jun 2023
Multi-network Contrastive Learning Based on Global and Local Representations
Weiquan Li
Xianzhong Long
Yun Li
SSL
60
0
0
28 Jun 2023
Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners
Bowen Shi
Xiaopeng Zhang
Yaoming Wang
Jin Li
Wenrui Dai
Junni Zou
H. Xiong
Qi Tian
101
4
0
28 Jun 2023
A generic self-supervised learning (SSL) framework for representation learning from spectra-spatial feature of unlabeled remote sensing imagery
Xin Zhang
Liangxiu Han
SSL
99
3
0
27 Jun 2023
Symphonize 3D Semantic Scene Completion with Contextual Instance Queries
Hao Jiang
Tianheng Cheng
Naiyu Gao
Haoyang Zhang
Tianwei Lin
Wenyu Liu
Xinggang Wang
95
61
0
27 Jun 2023
Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs
Navindu Leelarathna
Andrei Margeloiu
M. Jamnik
Nikola Simidjievski
DRL
78
1
0
27 Jun 2023
CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a $10,000 Budget; An Extra $4,000 Unlocks 81.8% Accuracy
Xianhang Li
Zeyu Wang
Cihang Xie
CLIP
VLM
129
20
0
27 Jun 2023