2111.06377
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,778 papers shown
Title
MMCLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training
Biao Wu
Yutong Xie
Zeyu Zhang
Minh Hieu Phan
Qi Chen
Ling-Hao Chen
Qi Wu
LM&MA
99
0
0
28 Jul 2024
Sparse Refinement for Efficient High-Resolution Semantic Segmentation
Zhijian Liu
Zhuoyang Zhang
Samir Khaki
Shang Yang
Haotian Tang
Chenfeng Xu
Kurt Keutzer
Song Han
SSeg
91
1
0
26 Jul 2024
HRP: Human Affordances for Robotic Pre-Training
Mohan Kumar Srirama
Sudeep Dasari
Shikhar Bahl
Abhinav Gupta
101
19
0
26 Jul 2024
Deep Companion Learning: Enhancing Generalization Through Historical Consistency
Ruizhao Zhu
Venkatesh Saligrama
FedML
87
0
0
26 Jul 2024
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers
Longkun Zou
Wanru Zhu
Ke Chen
Lihua Guo
K. Guo
Kui Jia
Yaowei Wang
3DPC
ViT
84
0
0
26 Jul 2024
Trajectory-aligned Space-time Tokens for Few-shot Action Recognition
Pulkit Kumar
Namitha Padmanabhan
Luke Luo
Sai Saketh Rambhatla
Abhinav Shrivastava
95
4
0
25 Jul 2024
HVM-1: Large-scale video models pretrained with nearly 5000 hours of human-like video data
Emin Orhan
VLM
SyDa
74
1
0
25 Jul 2024
How Lightweight Can A Vision Transformer Be
Jen Hong Tan
ViT
MoE
94
0
0
25 Jul 2024
Revisiting Machine Unlearning with Dimensional Alignment
Seonguk Seo
Dongwan Kim
Bohyung Han
MU
63
1
0
25 Jul 2024
Transformers on Markov Data: Constant Depth Suffices
Nived Rajaraman
Marco Bondaschi
Kannan Ramchandran
Michael C. Gastpar
Ashok Vardhan Makkuva
84
7
0
25 Jul 2024
Unsqueeze [CLS] Bottleneck to Learn Rich Representations
Qing Su
Shihao Ji
90
0
0
24 Jul 2024
PEEKABOO: Hiding parts of an image for unsupervised object localization
Hasib Zunair
A. Ben Hamza
SSL
132
0
0
24 Jul 2024
Multi-label Cluster Discrimination for Visual Representation Learning
Xiang An
Kaicheng Yang
Xiangzi Dai
Ziyong Feng
Jiankang Deng
VLM
98
7
0
24 Jul 2024
Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective
Jingren Liu
Zhong Ji
YunLong Yu
Jiale Cao
Yanwei Pang
Jungong Han
Xuelong Li
CLL
142
5
0
24 Jul 2024
SINDER: Repairing the Singular Defects of DINOv2
Haoqian Wang
Tong Zhang
Mathieu Salzmann
59
4
0
23 Jul 2024
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
Junyi Li
Junfeng Wu
Weizhi Zhao
Song Bai
Xiang Bai
81
3
0
23 Jul 2024
QPT V2: Masked Image Modeling Advances Visual Scoring
Qizhi Xie
Kun Yuan
Yunpeng Qu
Mingda Wu
Ming Sun
Chao Zhou
Jihong Zhu
83
3
0
23 Jul 2024
Masks and Manuscripts: Advancing Medical Pre-training with End-to-End Masking and Narrative Structuring
Shreyank N. Gowda
David A. Clifton
MedIm
73
1
0
23 Jul 2024
A Multi-view Mask Contrastive Learning Graph Convolutional Neural Network for Age Estimation
Yiping Zhang
Yuntao Shou
Tao Meng
Wei Ai
Keqin Li
CVBM
110
10
0
23 Jul 2024
CLII: Visual-Text Inpainting via Cross-Modal Predictive Interaction
Liang Zhao
Qing Guo
Xiaoguang Li
Song Wang
DiffM
77
0
0
23 Jul 2024
Diffusion Models as Optimizers for Efficient Planning in Offline RL
Renming Huang
Yunqiang Pei
Guoqing Wang
Yangming Zhang
Yang Yang
Peng Wang
H. Shen
OffRL
103
1
0
23 Jul 2024
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
Pengfei Chen
Lingxi Xie
Xinyue Huo
Xuehui Yu
Xiaopeng Zhang
Yingfei Sun
Zhenjun Han
Qi Tian
VLM
200
1
0
23 Jul 2024
Reconstructing Training Data From Real World Models Trained with Transfer Learning
Yakir Oz
Gilad Yehudai
Gal Vardi
Itai Antebi
Michal Irani
Niv Haim
67
3
0
22 Jul 2024
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei
Abhinav Gupta
Pedro Morgado
SSL
75
8
0
22 Jul 2024
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning
Zhecheng Yuan
Tianming Wei
Shuiqi Cheng
Gu Zhang
Yuanpei Chen
Huazhe Xu
98
23
0
22 Jul 2024
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag
Xianghao Kong
Jingtao Li
Michael Spranger
Lingjuan Lyu
DiffM
88
11
0
22 Jul 2024
Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video
Guiqiu Liao
M. Jogan
Sai Koushik
Eric Eaton
Daniel A. Hashimoto
VOS
106
2
0
22 Jul 2024
SIGMA: Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi
Michael Dorkenwald
Fida Mohammad Thoker
E. Gavves
Cees G. M. Snoek
Yuki M. Asano
96
7
0
22 Jul 2024
Towards Robust Vision Transformer via Masked Adaptive Ensemble
Fudong Lin
Jiadong Lou
Xu Yuan
Nianfeng Tzeng
ViT
AAML
91
2
0
22 Jul 2024
Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
Amir Mohammad Karimi Mamaghan
Samuele Papa
Karl Henrik Johansson
Stefan Bauer
Andrea Dittadi
OCL
174
9
0
22 Jul 2024
Self-supervised transformer-based pre-training method with General Plant Infection dataset
Zhengle Wang
Ruifeng Wang
Minjuan Wang
Tianyun Lai
Man Zhang
103
11
0
20 Jul 2024
MedMAE: A Self-Supervised Backbone for Medical Imaging Tasks
Anubhav Gupta
Islam I. Osman
Mohamed S. Shehata
John W. Braun
59
1
0
20 Jul 2024
CrowdMAC: Masked Crowd Density Completion for Robust Crowd Density Forecasting
Ryoske Fujii
Ryo Hachiuma
Hideo Saito
129
1
0
20 Jul 2024
Downstream-Pretext Domain Knowledge Traceback for Active Learning
Beichen Zhang
Liang-Sheng Li
Zheng-Jun Zha
Jiebo Luo
Qingming Huang
72
0
0
20 Jul 2024
Universal Medical Imaging Model for Domain Generalization with Data Privacy
Ahmed Radwan
Islam I. Osman
Mohamed S. Shehata
49
2
0
20 Jul 2024
Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition
Yurong Zhang
Honghao Chen
Xinyu Zhang
Xiangxiang Chu
Li Song
105
1
0
19 Jul 2024
I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models Through 3D Reconstruction
Zaiqiao Meng
Hao Zhou
Yifang Chen
68
4
0
19 Jul 2024
Multi-modal Relation Distillation for Unified 3D Representation Learning
Huiqun Wang
Yiping Bao
Panwang Pan
Zeming Li
Xiao Liu
Ruijie Yang
Di Huang
87
0
0
19 Jul 2024
Improving Representation of High-frequency Components for Medical Visual Foundation Models
Yuetan Chu
Yilan Zhang
Zhongyi Han
Changchun Yang
Longxi Zhou
Gongning Luo
Chao Huang
Xin Gao
MedIm
140
1
0
19 Jul 2024
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
S. Swetha
Jinyu Yang
T. Neiman
Mamshad Nayeem Rizve
Son Tran
Benjamin Z. Yao
Trishul Chilimbi
Mubarak Shah
112
2
0
18 Jul 2024
PICASSO: A Feed-Forward Framework for Parametric Inference of CAD Sketches via Rendering Self-Supervision
Ahmet Serdar Karadeniz
Dimitrios Mallis
Nesryne Mejri
K. Cherenkova
Anis Kacem
Djamila Aouada
77
4
0
18 Jul 2024
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang
Yuxi Wang
Shuai Li
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
110
3
0
18 Jul 2024
Long Input Sequence Network for Long Time Series Forecasting
Chao Ma
Yikai Hou
Xiang Li
Yinggang Sun
Haining Yu
AI4TS
54
0
0
18 Jul 2024
Universal Facial Encoding of Codec Avatars from VR Headsets
Shaojie Bai
Tenia Wang
Chenghui Li
Akshay Venkatesh
Tomas Simon
...
Gabriel Schwartz
Ryan Wrench
Jason M. Saragih
Yaser Sheikh
S. Wei
3DH
131
6
0
17 Jul 2024
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Carlos Hinojosa
Shuming Liu
Guohao Li
72
2
0
17 Jul 2024
Enhancing Gene Expression Prediction from Histology Images with Spatial Transcriptomics Completion
Gabriel Mejía
Daniela Ruiz
Paula Cárdenas
Leonardo Manrique
Daniela Vega
Pablo Arbelaez
24
0
0
17 Jul 2024
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks
Antoni Kowalczuk
Jan Dubiński
Atiyeh Ashari Ghomi
Yi Sui
George Stein
Jiapeng Wu
Jesse C. Cresswell
Franziska Boenisch
Adam Dziedzic
SSL
AAML
77
3
0
17 Jul 2024
Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation
Hyun Seok Seong
WonJun Moon
Subeen Lee
Jae-Pil Heo
90
1
0
17 Jul 2024
Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views
Jihoon Cho
Suhyun Ahn
Beomju Kim
Hyungjoon Bae
Xiaofeng Liu
...
Kyungeun Lee
Georges Elfakhri
Van Wedeen
Jonghye Woo
Jinah Park
MedIm
DiffM
58
0
0
17 Jul 2024
Efficient Depth-Guided Urban View Synthesis
Sheng Miao
Jiaxin Huang
Dongfeng Bai
Weichao Qiu
Bingbing Liu
Andreas Geiger
Yiyi Liao
102
3
0
17 Jul 2024