arXiv: 2111.06377
Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown
SP$^2$OT: Semantic-Regularized Progressive Partial Optimal Transport for Imbalanced Clustering
Chuyu Zhang
Hui Ren
Xuming He
OT
88
1
0
01 Jul 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
257
1
0
01 Jul 2025
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Qi Qin
...
Bin Fu
Xiaokang Yang
Guangtao Zhai
Ming-Hsuan Yang
Xiaohong Liu
VLM
178
86
0
01 Jul 2025
Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey
Wei Zhou
Lei Zhao
Runyu Zhang
Yifan Cui
Hongpu Huang
Kun Qie
Chen Wang
AI4TS
201
0
0
01 Jul 2025
Deep generative models as the probability transformation functions
Vitalii Bondar
Vira Babenko
Roman Trembovetskyi
Yurii Korobeinyk
Viktoriya Dzyuba
22
0
0
20 Jun 2025
Reliable Few-shot Learning under Dual Noises
Ji Zhang
Jingkuan Song
Lianli Gao
N. Sebe
Heng Tao Shen
NoLa
24
0
0
19 Jun 2025
Bridging Brain with Foundation Models through Self-Supervised Learning
Hamdi Altaheri
Fakhri Karray
Md. Milon Islam
S M Taslim Uddin Raju
Amir-Hossein Karimi
19
0
0
19 Jun 2025
AeroGPT: Leveraging Large-Scale Audio Model for Aero-Engine Bearing Fault Diagnosis
Jiale Liu
Dandan Peng
Huan Wang
Chenyu Liu
Yan-Fu Li
Min Xie
12
0
0
19 Jun 2025
NTIRE 2025 Image Shadow Removal Challenge Report
Florin-Alexandru Vasluianu
Tim Seizinger
Z. Zhou
C. L. Philip Chen
Zongwei Wu
...
Suiyi Zhao
Bo Wang
Yan Luo
M. Y. Wang
Yilin Zhang
49
1
0
18 Jun 2025
Foundation Artificial Intelligence Models for Health Recognition Using Face Photographs (FAHR-Face)
Fridolin Haugg
Grace Lee
John He
Leonard Nürnberg
Dennis Bontempi
...
Christian Guthier
Benjamin H. Kann
Vadim N. Gladyshev
Hugo J. W. L. Aerts
Raymond H. Mak
15
0
0
17 Jun 2025
Discrete JEPA: Learning Discrete Token Representations without Reconstruction
Junyeob Baek
Hosung Lee
Christopher Hoang
Mengye Ren
Sungjin Ahn
22
0
0
17 Jun 2025
Scaling-Up the Pretraining of the Earth Observation Foundation Model PhilEO to the MajorTOM Dataset
Nikolaos Dionelis
Jente Bosmans
Riccardo Musto
Giancarlo Paoletti
Simone Sarti
Giacomo Cascarano
Casper Fibaek
Luke Camilleri
B. L. Saux
Nicolas Longépé
22
0
0
17 Jun 2025
Leveraging Satellite Image Time Series for Accurate Extreme Event Detection
Heng Fang
Hossein Azizpour
13
0
0
13 Jun 2025
MRI-CORE: A Foundation Model for Magnetic Resonance Imaging
Haoyu Dong
Yuwen Chen
H. Gu
Nicholas Konz
Yaqian Chen
Qihang Li
Maciej A. Mazurowski
MedIm, VLM
27
0
0
13 Jun 2025
Uncertainty Awareness Enables Efficient Labeling for Cancer Subtyping in Digital Pathology
Nirhoshan Sivaroopan
Chamuditha Jayanga Galappaththige
Chalani Ekanayake
Hasindri Watawana
Ranga Rodrigo
Chamira U. S. Edussooriya
D. Wadduwage
18
0
0
13 Jun 2025
RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer
Haotian Ni
Yake Wei
Hang Liu
Gong Chen
Chong Peng
Hao Lin
Di Hu
OffRL
70
0
0
13 Jun 2025
Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization
Jingfeng Guo
Jian Liu
Jinnan Chen
Shiwei Mao
Changrong Hu
...
Jing Xu
Qi Liu
Lixin Xu
Zhuo Chen
Chunchao Guo
30
0
0
13 Jun 2025
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
Tony Alex
S. Ahmed
A. Mustafa
Muhammad Awais
Philip J. B. Jackson
17
1
0
13 Jun 2025
Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models
Francisco Caetano
Christiaan Viviers
Peter H. N. de With
Fons van der Sommen
DiffM
114
0
0
12 Jun 2025
Teaching in adverse scenes: a statistically feedback-driven threshold and mask adjustment teacher-student framework for object detection in UAV images under adverse scenes
Hongyu Chen
J. Liu
Yong Wang
Jun Zhu
Dejun Feng
Yakun Xie
23
0
0
12 Jun 2025
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation
Zhiyang Xu
Jiuhai Chen
Zhaojiang Lin
Xichen Pan
Lifu Huang
...
Di Jin
Michihiro Yasunaga
Lili Yu
Xi Lin
Shaoliang Nie
121
1
0
12 Jun 2025
BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP
Thomas Sounack
Joshua Davis
Brigitte N Durieux
Antoine Chaffin
Tom Pollard
Eric P. Lehman
Alistair E. W. Johnson
Matthew B. A. McDermott
Tristan Naumann
Charlotta Lindvall
MedIm
114
0
0
12 Jun 2025
HyBiomass: Global Hyperspectral Imagery Benchmark Dataset for Evaluating Geospatial Foundation Models in Forest Aboveground Biomass Estimation
Aaron Banze
Timothée Stassin
Nassim Ait Ali Braham
Rıdvan Salih Kuzu
Simon Besnard
Michael Schmitt
22
0
0
12 Jun 2025
Demonstrating Multi-Suction Item Picking at Scale via Multi-Modal Learning of Pick Success
Che Wang
Jeroen van Baar
Chaitanya Mitash
Shuai-Peng Li
Dylan Randle
Weiyao Wang
Sumedh Sontakke
Kostas E. Bekris
Kapil Katyal
SSL
115
1
0
12 Jun 2025
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting
Ziyi Wang
Yanran Zhang
Jie Zhou
Jiwen Lu
3DPC, 3DGS
55
0
0
11 Jun 2025
Attention, Please! Revisiting Attentive Probing for Masked Image Modeling
Bill Psomas
Dionysis Christopoulos
Eirini Baltzi
Ioannis Kakogeorgiou
Tilemachos Aravanis
N. Komodakis
Konstantinos Karantzalos
Yannis Avrithis
Giorgos Tolias
61
0
0
11 Jun 2025
EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks
Athinoulla Konstantinou
Georgios Leontidis
Mamatha Thota
A. Durrant
3DPC
77
0
0
11 Jun 2025
California Crop Yield Benchmark: Combining Satellite Image, Climate, Evapotranspiration, and Soil Data Layers for County-Level Yield Forecasting of Over 70 Crops
Hamid Kamangir
Mona Hajiesmaeeli
Mason Earles
AI4TS
48
0
0
11 Jun 2025
Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation
Tanner Schmidt
Richard Newcombe
VLM
23
0
0
10 Jun 2025
FUSE: Measure-Theoretic Compact Fuzzy Set Representation for Taxonomy Expansion
Fred Xu
Song Jiang
Z. Huang
Xiao Luo
Shichang Zhang
Adrian Chen
Yizhou Sun
22
3
0
10 Jun 2025
Inherently Faithful Attention Maps for Vision Transformers
Ananthu Aniraj
C. Dantas
Dino Ienco
Diego Marcos
OOD, OCL
32
0
0
10 Jun 2025
Intention-Conditioned Flow Occupancy Models
Chongyi Zheng
S. Park
Sergey Levine
Benjamin Eysenbach
AI4TS, OffRL, AI4CE
34
0
0
10 Jun 2025
Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs
Greyson Brothers
ViT
25
0
0
10 Jun 2025
SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding
Woohyeon Park
Woojin Kim
Jaeik Kim
Jaeyoung Do
VLM
10
0
0
10 Jun 2025
Time Series Representations for Classification Lie Hidden in Pretrained Vision Transformers
Simon Roschmann
Quentin Bouniot
Vasilii Feofanov
I. Redko
Zeynep Akata
AI4TS
34
0
0
10 Jun 2025
Diffuse and Disperse: Image Generation with Representation Regularization
Runqian Wang
Kaiming He
DiffM
46
0
0
10 Jun 2025
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
Yihe Tang
Wenlong Huang
Yingke Wang
Chengshu Li
Roy Yuan
Ruohan Zhang
Jiajun Wu
Li Fei-Fei
48
0
0
10 Jun 2025
Foundation Models in Medical Imaging -- A Review and Outlook
Vivien van Veldhuizen
Vanessa Botha
C. Lu
Melis Erdal Cesur
Kevin Groot Lipman
...
Cees Snoek
Lodewyk Wessels
Ritse Mann
Eric Marcus
Jonas Teuwen
MedIm, VLM, AI4CE
64
0
0
10 Jun 2025
MapBERT: Bitwise Masked Modeling for Real-Time Semantic Mapping Generation
Yijie Deng
Shuaihang Yuan
Congcong Wen
Hao Huang
Anthony Tzes
Geeta Chandra Raju Bethala
Yi Fang
24
0
0
09 Jun 2025
GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra
Mateusz Michalkiewicz
Anekha Sokhal
Tadeusz Michalkiewicz
Piotr Pawlikowski
Mahsa Baktashmotlagh
Varun Jampani
Guha Balakrishnan
22
0
0
09 Jun 2025
SoK: Data Reconstruction Attacks Against Machine Learning Models: Definition, Metrics, and Benchmark
Rui Wen
Yiyong Liu
Michael Backes
Yang Zhang
AAML
15
0
0
09 Jun 2025
Circumventing Backdoor Space via Weight Symmetry
Jie Peng
Hongwei Yang
Jing Zhao
Hengji Dong
Hui He
Weizhe Zhang
Haoyu He
AAML
15
0
0
09 Jun 2025
Info-Coevolution: An Efficient Framework for Data Model Coevolution
Ziheng Qin
Hailun Xu
Wei Chee Yew
Qi Jia
Yang Luo
Kanchan Sarkar
Danhui Guan
Kai Wang
Yang You
28
0
0
09 Jun 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li
Yutong Chen
Yiqian Wu
Kaifeng Zhao
Marc Pollefeys
Siyu Tang
EgoV, VLM
38
0
0
09 Jun 2025
Towards Generalized Source Tracing for Codec-Based Deepfake Speech
Xuanjun Chen
I-Ming Lin
Lin Zhang
Haibin Wu
Hung-yi Lee
J. Jang
20
0
0
08 Jun 2025
UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning
Weiqi Yan
Lvhai Chen
Huaijia Kou
Shengchuan Zhang
Yan Zhang
Liujuan Cao
20
0
0
08 Jun 2025
From Swath to Full-Disc: Advancing Precipitation Retrieval with Multimodal Knowledge Expansion
Zheng Wang
Kai Ying
Bin Xu
Chunjiao Wang
Cong Bai
13
0
0
08 Jun 2025
GGBall: Graph Generative Model on Poincaré Ball
Tianci Bu
Chuanrui Wang
Hao Ma
Haoren Zheng
Xin Lu
Tailin Wu
22
0
0
08 Jun 2025
Position Prediction Self-Supervised Learning for Multimodal Satellite Imagery Semantic Segmentation
John Waithaka
Moise Busogi
SSL
10
0
0
07 Jun 2025
Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning
Yuheng Lei
Sitong Mao
Shunbo Zhou
Hongyuan Zhang
Xuelong Li
Ping Luo
CLL
39
0
0
06 Jun 2025