Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
Efficient RL via Disentangled Environment and Agent Representations
Kevin Gmelin
Shikhar Bahl
Russell Mendonca
Deepak Pathak
DRL
73
9
0
05 Sep 2023
SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction
Xu Si
Xinming Wu
Hanlin Sheng
Jun Zhu
Zefeng Li
66
14
0
05 Sep 2023
Hierarchical Masked 3D Diffusion Model for Video Outpainting
Fanda Fan
Chaoxu Guo
Litong Gong
Biao Wang
T. Ge
Yuning Jiang
Chunjie Luo
Jianfeng Zhan
DiffM, VGen
85
15
0
05 Sep 2023
Probabilistic Self-supervised Learning via Scoring Rules Minimization
Amirhossein Vahidi
Simon Schoßer
Lisa Wimmer
Yawei Li
B. Bischl
Eyke Hüllermeier
Mina Rezaei
SSL
73
2
0
05 Sep 2023
Empowering Low-Light Image Enhancer through Customized Learnable Priors
Naishan Zheng
Man Zhou
Yanmeng Dong
Xiangyu Rui
Jie Huang
Chongyi Li
Fengmei Zhao
115
30
0
05 Sep 2023
Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery
J. Park
Daniel Sungho Jung
Gyeongsik Moon
Kyoung Mu Lee
85
6
0
05 Sep 2023
Corgi^2: A Hybrid Offline-Online Approach To Storage-Aware Data Shuffling For SGD
Etay Livne
Gal Kaplun
Eran Malach
Shai Shalev-Shwartz
OffRL
100
0
0
04 Sep 2023
Locality-Aware Hyperspectral Classification
Fangqin Zhou
Mert Kilickaya
Joaquin Vanschoren
ViT
31
6
0
04 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
93
27
0
04 Sep 2023
Leveraging Self-Supervised Vision Transformers for Segmentation-based Transfer Function Design
Dominik Engel
Leon Sick
Timo Ropinski
ViT
84
0
0
04 Sep 2023
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers
J. Denize
Mykola Liashuha
Jaonary Rabarisoa
Astrid Orcesi
Romain Hérault
ViT
66
13
0
03 Sep 2023
Large AI Model Empowered Multimodal Semantic Communications
Feibo Jiang
Yubo Peng
Li Dong
Kezhi Wang
Kun Yang
Cunhua Pan
Xiaohu You
101
47
0
03 Sep 2023
EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
Cheng Shi
Sibei Yang
VLM, ObjD
89
39
0
03 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
123
8
0
02 Sep 2023
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD, VLM
125
27
0
02 Sep 2023
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
Ziyu Guo
Renrui Zhang
Xiangyang Zhu
Yiwen Tang
Xianzheng Ma
...
Ke Chen
Peng Gao
Xianzhi Li
Hongsheng Li
Pheng-Ann Heng
MLLM
110
146
0
01 Sep 2023
Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction
Peizhen Bai
Xianyuan Liu
Haiping Lu
ViT, AI4CE
73
2
0
01 Sep 2023
A Locality-based Neural Solver for Optical Motion Capture
Xiaoyu Pan
Bowen Zheng
Xinwei Jiang
Guanglong Xu
Xianli Gu
...
Qilong Kou
He Wang
Tianjia Shao
Kun Zhou
Xiaogang Jin
48
5
0
01 Sep 2023
RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion
Zhiqiang Yan
Xiang Li
Le Hui
Zhenyu Zhang
Jun Yu Li
Jian Yang
VLM, 3DV
113
5
0
01 Sep 2023
RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability
Chuning Zhu
Max Simchowitz
Siri Gadipudi
Abhishek Gupta
107
14
0
31 Aug 2023
TouchStone: Evaluating Vision-Language Models by Language Models
Shuai Bai
Shusheng Yang
Jinze Bai
Peng Wang
Xing Zhang
Junyang Lin
Xinggang Wang
Chang Zhou
Jingren Zhou
MLLM
119
48
0
31 Aug 2023
SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
Jiaben Chen
Huaizu Jiang
3DH
79
7
0
31 Aug 2023
Masked Transformer for Electrocardiogram Classification
Ya Zhou
Xiaolin Diao
Yanni Huo
Yang Liu
Xiaohan Fan
Wei Zhao
MedIm
76
2
0
31 Aug 2023
CL-MAE: Curriculum-Learned Masked Autoencoders
Neelu Madan
Nicolae-Cătălin Ristea
Kamal Nasrollahi
T. Moeslund
Radu Tudor Ionescu
106
12
0
31 Aug 2023
Self-Sampling Meta SAM: Enhancing Few-shot Medical Image Segmentation with Meta-Learning
Yiming Zhang
Tianang Leng
Kun Han
Xiaohui Xie
101
19
0
31 Aug 2023
Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu
Tianzhe Chu
Shengbang Tong
Ziyang Wu
Druv Pai
Sam Buchanan
Yi Ma
ViT
50
22
0
30 Aug 2023
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion
Man Zhou
Jie Huang
Naishan Zheng
Chongyi Li
37
7
0
30 Aug 2023
Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
Yifan Xu
Mengdan Zhang
Xiaoshan Yang
Changsheng Xu
ObjD
84
5
0
30 Aug 2023
Towards a Rigorous Analysis of Mutual Information in Contrastive Learning
Kyungeun Lee
Jaeill Kim
Suhyun Kang
Wonjong Rhee
SSL
82
2
0
30 Aug 2023
Prototype Fission: Closing Set for Robust Open-set Semi-supervised Learning
Xuwei Tan
Yi-Jie Huang
Yaqian Li
131
2
0
29 Aug 2023
A General-Purpose Self-Supervised Model for Computational Pathology
Richard J. Chen
Tong Ding
Ming Y. Lu
Drew F. K. Williamson
Guillaume Jaume
...
Judy J. Wang
Walt Williams
L. Le
Georg Gerber
Faisal Mahmood
MedIm
140
44
0
29 Aug 2023
Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation
Fu-En Yang
Chien-Yi Wang
Yu-Chiang Frank Wang
VLM, FedML
112
69
0
29 Aug 2023
Enhancing Robot Learning through Learned Human-Attention Feature Maps
D. Scheuchenstuhl
Stefan Ulmer
Felix Resch
Luigi Berducci
Radu Grosu
69
0
0
29 Aug 2023
Cross-Modal Retrieval Meets Inference: Improving Zero-Shot Classification with Cross-Modal Retrieval
Seong-Hoon Eom
Namgyu Ho
Jaehoon Oh
Se-Young Yun
CLIP, VLM
75
0
0
29 Aug 2023
PronounFlow: A Hybrid Approach for Calibrating Pronouns in Sentences
Nicos Isaak
66
1
0
29 Aug 2023
When hard negative sampling meets supervised contrastive learning
Zijun Long
George Killick
R. McCreadie
Gerardo Aragon Camarasa
Zaiqiao Meng
SSL
59
3
0
28 Aug 2023
Efficient Discovery and Effective Evaluation of Visual Perceptual Similarity: A Benchmark and Beyond
Oren Barkan
Tal Reiss
Jonathan Weill
Ori Katz
Roy Hirsch
Itzik Malkiel
Noam Koenigstein
83
6
0
28 Aug 2023
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation
Xudong Wang
Ishan Misra
Ziyun Zeng
Rohit Girdhar
Trevor Darrell
93
18
0
28 Aug 2023
Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning
Amirhossein Vahidi
Lisa Wimmer
H. Gündüz
Bernd Bischl
Eyke Hüllermeier
Mina Rezaei
OOD, UQCV
100
4
0
28 Aug 2023
Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities
Leman Akoglu
Jaemin Yoo
63
1
0
28 Aug 2023
A Unified Transformer-based Network for multimodal Emotion Recognition
Kamran Ali
Charles E. Hughes
85
1
0
27 Aug 2023
Nonrigid Object Contact Estimation With Regional Unwrapping Transformer
Wei Xie
Zimeng Zhao
Shiying Li
Binghui Zuo
Yangang Wang
67
4
0
27 Aug 2023
Forensic Histopathological Recognition via a Context-Aware MIL Network Powered by Self-Supervised Contrastive Learning
Chen Shen
Jun Zhang
Xinggong Liang
Zeyi Hao
Ke Li
Fan Wang
Zhenyuan Wang
C. Lian
45
2
0
27 Aug 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
141
21
0
27 Aug 2023
Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning
P. Balaji
Abhijit Das
Srijan Das
A. Dantcheva
CVBM
61
4
0
25 Aug 2023
Fine-tuning can cripple your foundation model; preserving features may be the solution
Jishnu Mukhoti
Y. Gal
Philip Torr
P. Dokania
CLL
135
47
0
25 Aug 2023
AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning
C. Lessig
Ilaria Luise
Bing Gong
M. Langguth
S. Stadtler
Martin G. Schultz
54
29
0
25 Aug 2023
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
Yuanyou Xu
Zongxin Yang
Yi Yang
VOS
112
9
0
25 Aug 2023
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling
Shimin Zhang
Qu Yang
Chenxiang Ma
Jibin Wu
Haizhou Li
Kay Chen Tan
83
20
0
25 Aug 2023
Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation
Jiaming Zhang
Yutao Cui
Gangshan Wu
Limin Wang
VOS
130
10
0
25 Aug 2023