Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown
RobotIQ: Empowering Mobile Robots with Human-Level Planning for Real-World Execution
Emmanuel K. Raptis
Athanasios Ch. Kapoutsis
Elias B. Kosmatopoulos
LM&Ro
125
0
0
18 Feb 2025
Performance of Zero-Shot Time Series Foundation Models on Cloud Data
William Toner
Thomas L. Lee
Artjom Joosen
Rajkarn Singh
Martin Asenov
AI4TS
152
2
0
18 Feb 2025
MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-Text Decoding
Weikang Qiu
Zheng Huang
Haoyu Hu
Aosong Feng
Yujun Yan
Rex Ying
97
0
0
18 Feb 2025
Intensity-Spatial Dual Masked Autoencoder for Multi-Scale Feature Learning in Chest CT Segmentation
Yuexing Ding
Jun Wang
H. Lyu
218
0
0
17 Feb 2025
Hyper-SET: Designing Transformers via Hyperspherical Energy Minimization
Yunzhe Hu
Difan Zou
Dong Xu
157
1
0
17 Feb 2025
Simplifying DINO via Coding Rate Regularization
Ziyang Wu
Jingyuan Zhang
Druv Pai
Xinze Wang
Chandan Singh
Jianwei Yang
Jianfeng Gao
Yi-An Ma
548
1
0
17 Feb 2025
Adversarially Robust CLIP Models Can Induce Better (Robust) Perceptual Metrics
Francesco Croce
Christian Schlarmann
Naman D. Singh
Matthias Hein
158
7
0
17 Feb 2025
Frequency-Aware Masked Autoencoders for Human Activity Recognition using Accelerometers
Niels R. Lorenzen
P. Jennum
Emmanuel Mignot
A. Brink-Kjaer
80
0
0
17 Feb 2025
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
Jingcheng Ni
Yuxin Guo
Yichen Liu
Rui Chen
Lewei Lu
Z. Wu
DiffM, VGen
144
5
0
17 Feb 2025
CR-CTC: Consistency regularization on CTC for improved speech recognition
Zengwei Yao
Wei Kang
Xiaoyu Yang
Fangjun Kuang
Liyong Guo
Han Zhu
Zengrui Jin
Zhaoqing Li
Long Lin
Daniel Povey
132
4
0
17 Feb 2025
Differentially Private Prototypes for Imbalanced Transfer Learning
Dariush Wahdany
Matthew Jagielski
Adam Dziedzic
Franziska Boenisch
145
0
0
17 Feb 2025
Vision-Enhanced Time Series Forecasting via Latent Diffusion Models
Weilin Ruan
Siru Zhong
Haomin Wen
Yuxuan Liang
AI4TS
143
1
0
16 Feb 2025
AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
Ruoxuan Feng
Jiangyu Hu
Wenke Xia
Tianci Gao
Ao Shen
Yuhao Sun
Bin Fang
Di Hu
119
9
0
15 Feb 2025
Harnessing Vision Models for Time Series Analysis: A Survey
Jingchao Ni
Ziming Zhao
ChengAo Shen
Hanghang Tong
Dongjin Song
Wei Cheng
Dongsheng Luo
Haifeng Chen
AI4TS
180
6
0
13 Feb 2025
E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization
T. Pham
Zhang Kang
Ji Woo Hong
Xuran Zheng
Chang D. Yoo
136
0
0
13 Feb 2025
ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources
Jason Wu
Kang Yang
Lance M. Kaplan
Mani B. Srivastava
69
0
0
11 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
207
6
0
11 Feb 2025
Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures
Yaoxin Yang
Peng Ye
Weihao Lin
Kangcong Li
Yan Wen
Jia Hao
Tao Chen
94
0
0
10 Feb 2025
From Pixels to Components: Eigenvector Masking for Visual Representation Learning
Alice Bizeul
Thomas M. Sutter
Alain Ryser
Bernhard Schölkopf
Julius von Kügelgen
Julia E. Vogt
197
2
0
10 Feb 2025
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li
Zekai Zhang
Xiang Li
Siyi Chen
Zhihui Zhu
Peng Wang
Qing Qu
DiffM
189
1
0
09 Feb 2025
Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder
Yuhan Zhang
Guoqing Ma
Guangfu Hao
Liangxuan Guo
Yang Chen
S. Yu
OnRL
169
0
0
08 Feb 2025
Knowledge is Power: Harnessing Large Language Models for Enhanced Cognitive Diagnosis
Zhiang Dong
Jingyuan Chen
Leilei Gan
AI4Ed
103
4
0
08 Feb 2025
A Novel Convolutional-Free Method for 3D Medical Imaging Segmentation
Canxuan Gang
MedIm, ViT
89
0
0
08 Feb 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach
Dishanika Denipitiyage
B. Silva
Suranga Seneviratne
A. Seneviratne
Sanjay Chawla
83
0
0
07 Feb 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
Feng Wang
Yaodong Yu
Guoyizhe Wei
Wei Shao
Yuyin Zhou
Alan Yuille
Cihang Xie
ViT
147
7
0
06 Feb 2025
Boosting Knowledge Graph-based Recommendations through Confidence-Aware Augmentation with Large Language Models
Rui Cai
Chao Wang
Qianyi Cai
Dazhong Shen
Hui Xiong
RALM
134
0
0
06 Feb 2025
ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models
Ying Zhang
Maoliang Yin
Wenfu Bi
Haibao Yan
Shaohan Bian
Cui-Hua Zhang
C. Hua
127
2
0
05 Feb 2025
Particle Trajectory Representation Learning with Masked Point Modeling
Sam Young
Yeon-jae Jwa
Kazuhiro Terao
3DPC
104
1
0
04 Feb 2025
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
Tao Zhang
Jinyong Wen
Zhen Chen
Kun Ding
Di Zhang
Chunhong Pan
259
1
0
04 Feb 2025
BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen
Satya Narayan Shukla
Qiang Zhang
Hanchao Yu
Sreya D. Roy
Taipeng Tian
Lingjiong Zhu
Yuchen Liu
SSL, MQ
138
0
0
04 Feb 2025
ConceptVAE: Self-Supervised Fine-Grained Concept Disentanglement from 2D Echocardiographies
C. Ciușdel
Alex Serban
Tiziano Passerini
CoGe
114
1
0
03 Feb 2025
Enhancing Environmental Robustness in Few-shot Learning via Conditional Representation Learning
Qianyu Guo
Jingrong Wu
Tianxing Wu
Hongru Wang
Weifeng Ge
Wenqiang Zhang
58
0
0
03 Feb 2025
Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation
Bin Xie
Hao Tang
Dawen Cai
Yan Yan
Gady Agam
MedIm, VLM
155
2
0
02 Feb 2025
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches
Luca Ciampi
Ali Azmoudeh
Elif Ecem Akbaba
Erdi Sarıtaş
Ziya Ata Yazıcı
H. K. Ekenel
Giuseppe Amato
Fabrizio Falchi
185
0
0
31 Jan 2025
Learning Priors of Human Motion With Vision Transformers
Placido Falqueto
Alberto Sanfeliu
Luigi Palopoli
Daniele Fontanelli
ViT
242
0
0
30 Jan 2025
Snapshot Compressed Imaging Based Single-Measurement Computer Vision for Videos
Fengpu Pan
Jiangtao Wen
Yuxing Han
68
1
0
28 Jan 2025
Color Flow Imaging Microscopy Improves Identification of Stress Sources of Protein Aggregates in Biopharmaceuticals
Michaela Cohrs
Shiwoo Koak
Yejin Lee
Yu Jin Sung
W. D. Neve
Hristo L. Svilenov
Utku Ozbulak
91
0
0
28 Jan 2025
Multi-View Factorizing and Disentangling: A Novel Framework for Incomplete Multi-View Multi-Label Classification
Wulin Xie
Lian Zhao
Jiang Long
Xiaohuan Lu
Bingyan Nie
99
1
0
28 Jan 2025
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee
Scott C. Lowe
ZeMing Gong
Pablo Millán Arias
Nicholas Pellegrino
...
Lila Kari
Dirk Steinke
Graham W. Taylor
Paul Fieguth
Angel X. Chang
134
11
0
28 Jan 2025
Audio-Language Models for Audio-Centric Tasks: A survey
Yi Su
Jisheng Bai
Qisheng Xu
Kele Xu
Yong Dou
AuLLM
164
4
0
28 Jan 2025
BiFold: Bimanual Cloth Folding with Language Guidance
Oriol Barbany
Adrià Colomé
Carme Torras
44
1
0
27 Jan 2025
MambaTron: Efficient Cross-Modal Point Cloud Enhancement using Aggregate Selective State Space Modeling
Sai Tarun Inaganti
Gennady Petrenko
Mamba
140
1
0
25 Jan 2025
DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning
Wenhao Gu
Li Gu
Ziqiang Wang
Ching Yee Suen
Yang Wang
94
0
0
22 Jan 2025
Slot-BERT: Self-supervised Object Discovery in Surgical Video
Guiqiu Liao
M. Jogan
Marcel Hussing
Kenta Nakahashi
Kazuhiro Yasufuku
Amin Madani
Eric Eaton
Daniel A. Hashimoto
476
0
0
21 Jan 2025
Taming Teacher Forcing for Masked Autoregressive Video Generation
Deyu Zhou
Quan Sun
Yuang Peng
Kun Yan
Runpei Dong
...
Zheng Ge
Nan Duan
Xiangyu Zhang
L. Ni
H. Shum
VGen
100
9
0
21 Jan 2025
BlanketGen2-Fit3D: Synthetic Blanket Augmentation Towards Improving Real-World In-Bed Blanket Occluded Human Pose Estimation
Tamás Karácsony
João Carmona
Joao Paulo Cunha
3DH
68
0
0
21 Jan 2025
ENTIRE: Learning-based Volume Rendering Time Prediction
Zikai Yin
Hamid Gadirov
Jiri Kosinka
Steffen Frey
3DH
84
0
0
21 Jan 2025
Unified 3D MRI Representations via Sequence-Invariant Contrastive Learning
Liam Chalcroft
Jenny Crinion
Cathy J. Price
John Ashburner
499
0
0
21 Jan 2025
Contrastive Masked Autoencoders for Character-Level Open-Set Writer Identification
Xiaowei Jiang
Wenhao Ma
Yiqun Duan
T. Do
Chin-Teng Lin
188
0
0
21 Jan 2025
Modality Interactive Mixture-of-Experts for Fake News Detection
Yifan Liu
Y. Liu
Zehan Li
Ruichen Yao
Yang Zhang
Dong Wang
MoE
92
0
0
21 Jan 2025