Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT
    TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,613 papers shown
Efficient Object-centric Representation Learning with Pre-trained Geometric Prior
Phúc H. Lê Khắc
Graham Healy
Alan F. Smeaton
OCL
84
0
0
16 Dec 2024
SAMIC: Segment Anything with In-Context Spatial Prompt Engineering
S. Nagendra
Kashif Rashid
Chaopeng Shen
Daniel Kifer
VLM
76
2
0
16 Dec 2024
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
Renqiu Xia
M. Li
Hancheng Ye
Wenjie Wu
Hongbin Zhou
...
Zeang Sheng
Botian Shi
Tao Chen
Junchi Yan
Bo Zhang
91
7
0
16 Dec 2024
AMI-Net: Adaptive Mask Inpainting Network for Industrial Anomaly Detection and Localization
Wei Luo
Haiming Yao
Wenyong Yu
Zhengyong Li
69
12
0
16 Dec 2024
CiTrus: Squeezing Extra Performance out of Low-data Bio-signal Transfer Learning
Eloy Geenjaar
Lie Lu
82
0
0
16 Dec 2024
DINO-Foresight: Looking into the Future with DINO
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
AI4CE
87
2
0
16 Dec 2024
LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model
Xi Wang
Yiming Li
Heng Fang
Yichen Peng
H. Xie
Xi Yang
Chuntao Li
DiffM
74
0
0
16 Dec 2024
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
128
11
0
16 Dec 2024
One-Shot Multilingual Font Generation Via ViT
Zhiheng Wang
Jiarui Liu
VLM
78
0
0
15 Dec 2024
Wearable Accelerometer Foundation Models for Health via Knowledge Distillation
Salar Abbaspourazad
Anshuman Mishra
Joseph D. Futoma
Andrew C. Miller
Ian Shapiro
90
0
0
15 Dec 2024
Unconstrained Salient and Camouflaged Object Detection
Zhangjun Zhou
Yiping Li
Chunlin Zhong
Jianuo Huang
Jialun Pei
He Tang
84
0
0
14 Dec 2024
Medical Manifestation-Aware De-Identification
Yuan Tian
Shuo Wang
Guangtao Zhai
MedIm
73
0
0
14 Dec 2024
MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance
Wenjun Huang
Jianguo Hu
84
0
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
211
2
0
14 Dec 2024
RegMixMatch: Optimizing Mixup Utilization in Semi-Supervised Learning
Haorong Han
Jidong Yuan
Chixuan Wei
Zhongyang Yu
202
1
0
14 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hongyu Chen
Zihan Wang
Xianrui Li
Xingchen Sun
Fangyi Chen
Jiang Liu
Jiadong Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
114
7
0
14 Dec 2024
Towards Unified Benchmark and Models for Multi-Modal Perceptual Metrics
Sara Ghazanfari
Siddharth Garg
Nicolas Flammarion
Prashanth Krishnamurthy
Farshad Khorrami
Francesco Croce
VLM
94
0
0
13 Dec 2024
Feat2GS: Probing Visual Foundation Models with Gaussian Splatting
Yue Chen
Xingyu Chen
Anpei Chen
Gerard Pons-Moll
Yuliang Xiu
3DGS
86
3
0
12 Dec 2024
Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders
Fiona Ryan
Ajay Bati
Sangmin Lee
Daniel Bolya
Judy Hoffman
James M. Rehg
176
2
0
12 Dec 2024
USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation
Wanjiang Weng
Hongsong Wang
Junbo He
Lei He
Guosen Xie
91
2
0
12 Dec 2024
Cross-View Completion Models are Zero-shot Correspondence Estimators
Honggyu An
J. Kim
Seonghoon Park
Jaewoo Jung
Jisang Han
Sunghwan Hong
Seungryong Kim
3DV
82
3
0
12 Dec 2024
3D Mesh Editing using Masked LRMs
Will Gao
Dilin Wang
Yuchen Fan
Aljaz Bozic
Tuur Stuyck
Zhengqin Li
Zhao Dong
Rakesh Ranjan
N. Sarafianos
106
2
0
11 Dec 2024
Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection
Mingda Jia
Liming Zhao
Ge Li
Yun Zheng
VLM
78
0
0
11 Dec 2024
DiffCLIP: Few-shot Language-driven Multimodal Classifier
Jiaqing Zhang
Mingxiang Cao
Xue Yang
Kai Jiang
Yunsong Li
VLM
82
0
0
10 Dec 2024
DFREC: DeepFake Identity Recovery Based on Identity-aware Masked Autoencoder
Peipeng Yu
Hui Gao
Zhitao Huang
Zhihua Xia
Chip-Hong Chang
80
0
0
10 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
137
2
0
09 Dec 2024
Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation
Junha Lee
Sojung An
Sujeong You
Namik Cho
78
0
0
08 Dec 2024
Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising
Gongfan Fang
Xinyin Ma
Xinchao Wang
DiffM
MoE
104
0
0
07 Dec 2024
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
Pengcheng Guo
Xuankai Chang
Hang Lv
Shinji Watanabe
Lei Xie
66
0
0
07 Dec 2024
Slicing Vision Transformer for Flexible Inference
Yitian Zhang
Huseyin Coskun
Xu Ma
Huan Wang
Ke Ma
Xi Chen
Derek Hao Hu
Y. Fu
ViT
81
0
0
06 Dec 2024
Scalable Early Childhood Reading Performance Prediction
Zhongkai Shangguan
Zanming Huang
Eshed Ohn-Bar
Ola Ozernov-Palchik
Derek Kosty
Michael Stoolmiller
Hank Fien
AI4Ed
70
1
0
05 Dec 2024
Towards Real-Time Open-Vocabulary Video Instance Segmentation
Bin Yan
Martin Sundermeyer
D. Tan
Huchuan Lu
F. Tombari
VLM
VOS
97
1
0
05 Dec 2024
Towards Zero-shot 3D Anomaly Localization
Yizhou Wang
Kuan-Chuan Peng
Y. Fu
78
3
0
05 Dec 2024
CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning
Runjian Chen
H. Zhang
Avinash Ravichandran
Wenqi Shao
Alex Wong
Ping Luo
3DPC
83
0
0
04 Dec 2024
Beyond [cls]: Exploring the true potential of Masked Image Modeling representations
Marcin Przewięźlikowski
Randall Balestriero
Wojciech Jasiński
Marek Śmieja
Bartosz Zieliński
69
0
0
04 Dec 2024
Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning
Zehua Wang
C. J. Li
QiXiang Ye
Tong Zhang
MoE
84
1
0
03 Dec 2024
Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions
Kai Sun
Siyan Xue
F. Sun
Haoran Sun
Yu-Juan Luo
...
Xinzhou Wang
Lei Yang
Shuo Jin
Jun Yan
Jiahong Dong
AI4CE
76
2
0
03 Dec 2024
Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods
Jiamian Hu
Yuanyuan Hong
Yihua Chen
He Wang
Moriaki Yasuhara
71
0
0
03 Dec 2024
Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications
Daniela Szwarcman
Sujit Roy
P. Fraccaro
Þorsteinn Elí Gíslason
Benedikt Blumenstiel
...
Rahul Ramachandran
Juan Bernabé-Moreno
Manil Maskey
VLM
83
13
0
03 Dec 2024
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
Ziqi Pang
Tianyuan Zhang
Fujun Luan
Yunze Man
Hao Tan
Kai Zhang
William T. Freeman
Yu-Xiong Wang
VGen
81
14
0
02 Dec 2024
Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning
Varun Belagali
Srikar Yellapragada
Alexandros Graikos
S. Kapse
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
83
1
0
02 Dec 2024
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim
Rui Xiao
Mariana-Iuliana Georgescu
Stephan Alaniz
Zeynep Akata
VLM
85
2
0
02 Dec 2024
OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking
X. Zhang
Zecheng Tang
Zhipei Xu
Runyi Li
Youmin Xu
Bin Chen
Feng Gao
Jian Zhang
WIGM
93
4
0
02 Dec 2024
Token Cropr: Faster ViTs for Quite a Few Tasks
Benjamin Bergner
C. Lippert
Aravindh Mahendran
ViT
VLM
74
0
0
01 Dec 2024
FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation
Yunpeng Bai
Qixing Huang
DiffM
94
0
0
01 Dec 2024
Rethinking Generalizability and Discriminability of Self-Supervised Learning from Evolutionary Game Theory Perspective
Jiangmeng Li
Zehua Zang
Qirui Ji
Chuxiong Sun
Jingyao Wang
Junge Zhang
Changwen Zheng
Gang Hua
Hui Xiong
SSL
71
0
0
30 Nov 2024
Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey
Wei Zhou
Lei Zhao
Runyu Zhang
Yifan Cui
Hongpu Huang
Kun Qie
Chen Wang
AI4TS
73
0
0
30 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
89
3
0
29 Nov 2024
Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise
Yeonguk Yu
Minhwan Ko
Sungho Shin
Kangmin Kim
K. Lee
NoLa
82
1
0
29 Nov 2024
Effective Fine-Tuning of Vision-Language Models for Accurate Galaxy Morphology Analysis
Ruoqi Wang
Haitao Wang
Qiong Luo
77
1
0
29 Nov 2024