2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,611 papers shown
Title
Harnessing Vision Models for Time Series Analysis: A Survey
Jingchao Ni
Ziming Zhao
ChengAo Shen
Hanghang Tong
Dongjin Song
Wei Cheng
Dongsheng Luo
Haifeng Chen
AI4TS
79
1
0
13 Feb 2025
E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization
T. Pham
Zhang Kang
Ji Woo Hong
Xuran Zheng
Chang D. Yoo
82
0
0
13 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
122
4
0
11 Feb 2025
ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources
Jason Wu
Kang Yang
Lance M. Kaplan
Mani B. Srivastava
36
0
0
11 Feb 2025
From Pixels to Components: Eigenvector Masking for Visual Representation Learning
Alice Bizeul
Thomas M. Sutter
Alain Ryser
Bernhard Schölkopf
Julius von Kügelgen
Julia E. Vogt
88
1
0
10 Feb 2025
Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures
Yaoxin Yang
Peng Ye
Weihao Lin
Kangcong Li
Yan Wen
Jia Hao
Tao Chen
38
0
0
10 Feb 2025
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li
Zekai Zhang
Xiang Li
Siyi Chen
Zhihui Zhu
Peng Wang
Qing Qu
DiffM
51
0
0
09 Feb 2025
Knowledge is Power: Harnessing Large Language Models for Enhanced Cognitive Diagnosis
Zhiang Dong
Jingyuan Chen
Fei Wu
AI4Ed
74
1
0
08 Feb 2025
Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder
Yuhan Zhang
Guoqing Ma
Guangfu Hao
Liangxuan Guo
Yang Chen
S. Yu
OnRL
74
0
0
08 Feb 2025
A Novel Convolutional-Free Method for 3D Medical Imaging Segmentation
Canxuan Gang
MedIm
ViT
56
0
0
08 Feb 2025
SEER: Self-Explainability Enhancement of Large Language Models' Representations
Guanxu Chen
Dongrui Liu
Tao Luo
Jing Shao
LRM
MILM
67
1
0
07 Feb 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach
Dishanika Denipitiyage
B. Silva
Suranga Seneviratne
A. Seneviratne
Sanjay Chawla
48
0
0
07 Feb 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
Feng Wang
Yaodong Yu
Guoyizhe Wei
Wei Shao
Yuyin Zhou
Alan Yuille
Cihang Xie
ViT
99
4
0
06 Feb 2025
Boosting Knowledge Graph-based Recommendations through Confidence-Aware Augmentation with Large Language Models
Rui Cai
Chao Wang
Qianyi Cai
Dazhong Shen
Hui Xiong
RALM
85
0
0
06 Feb 2025
ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models
Ying Zhang
Maoliang Yin
Wenfu Bi
Haibao Yan
Shaohan Bian
Cui-Hua Zhang
C. Hua
81
2
0
05 Feb 2025
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
Tao Zhang
Jinyong Wen
Zhen Chen
Kun Ding
S. Xiang
Chunhong Pan
72
1
0
04 Feb 2025
BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen
Satya Narayan Shukla
Qiang Zhang
Hanchao Yu
Sreya D. Roy
Taipeng Tian
Lingjiong Zhu
Yuchen Liu
SSL
MQ
82
0
0
04 Feb 2025
Particle Trajectory Representation Learning with Masked Point Modeling
Sam Young
Yeon-jae Jwa
Kazuhiro Terao
3DPC
69
1
0
04 Feb 2025
ConceptVAE: Self-Supervised Fine-Grained Concept Disentanglement from 2D Echocardiographies
C. Ciușdel
Alex Serban
Tiziano Passerini
CoGe
74
1
0
03 Feb 2025
Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation
Bin Xie
Hao Tang
Dawen Cai
Yan Yan
Gady Agam
MedIm
VLM
64
1
0
02 Feb 2025
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches
Luca Ciampi
Ali Azmoudeh
Elif Ecem Akbaba
Erdi Sarıtaş
Ziya Ata Yazıcı
H. K. Ekenel
Giuseppe Amato
Fabrizio Falchi
102
0
0
31 Jan 2025
Learning Priors of Human Motion With Vision Transformers
Placido Falqueto
Alberto Sanfeliu
Luigi Palopoli
Daniele Fontanelli
ViT
153
0
0
30 Jan 2025
Snapshot Compressed Imaging Based Single-Measurement Computer Vision for Videos
Fengpu Pan
Jiangtao Wen
Yuxing Han
31
1
0
28 Jan 2025
Audio-Language Models for Audio-Centric Tasks: A survey
Yi Su
Jisheng Bai
Qisheng Xu
Kele Xu
Yong Dou
AuLLM
99
2
0
28 Jan 2025
Multi-View Factorizing and Disentangling: A Novel Framework for Incomplete Multi-View Multi-Label Classification
Wulin Xie
Lian Zhao
Jiang Long
Xiaohuan Lu
Bingyan Nie
47
0
0
28 Jan 2025
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee
Scott C. Lowe
ZeMing Gong
Pablo Millán Arias
Nicholas Pellegrino
...
Lila Kari
Dirk Steinke
Graham W. Taylor
Paul Fieguth
Angel X. Chang
56
7
0
28 Jan 2025
Color Flow Imaging Microscopy Improves Identification of Stress Sources of Protein Aggregates in Biopharmaceuticals
Michaela Cohrs
Shiwoo Koak
Yejin Lee
Yu Jin Sung
W. D. Neve
Hristo L. Svilenov
Utku Ozbulak
43
0
0
28 Jan 2025
MambaTron: Efficient Cross-Modal Point Cloud Enhancement using Aggregate Selective State Space Modeling
Sai Tarun Inaganti
Gennady Petrenko
Mamba
74
1
0
25 Jan 2025
DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning
Wenhao Gu
Li Gu
Ziqiang Wang
Ching Yee Suen
Yang Wang
53
0
0
22 Jan 2025
Slot-BERT: Self-supervised Object Discovery in Surgical Video
Guiqiu Liao
M. Jogan
Marcel Hussing
Kenta Nakahashi
Kazuhiro Yasufuku
Amin Madani
Eric Eaton
Daniel A. Hashimoto
141
0
0
21 Jan 2025
Taming Teacher Forcing for Masked Autoregressive Video Generation
Deyu Zhou
Quan Sun
Yuang Peng
Kun Yan
Runpei Dong
...
Zheng Ge
Nan Duan
Xiangyu Zhang
L. Ni
H. Shum
VGen
54
6
0
21 Jan 2025
BlanketGen2-Fit3D: Synthetic Blanket Augmentation Towards Improving Real-World In-Bed Blanket Occluded Human Pose Estimation
Tamás Karácsony
João Carmona
Joao Paulo Cunha
3DH
35
0
0
21 Jan 2025
Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos
Yanlai Yang
Mengye Ren
180
0
0
21 Jan 2025
ENTIRE: Learning-based Volume Rendering Time Prediction
Zikai Yin
Hamid Gadirov
Jiri Kosinka
Steffen Frey
3DH
36
0
0
21 Jan 2025
Unified 3D MRI Representations via Sequence-Invariant Contrastive Learning
Liam Chalcroft
Jenny Crinion
Cathy J. Price
John Ashburner
146
0
0
21 Jan 2025
Contrastive Masked Autoencoders for Character-Level Open-Set Writer Identification
Xiaowei Jiang
Wenhao Ma
Yiqun Duan
T. Do
Chin-Teng Lin
36
0
0
21 Jan 2025
Modality Interactive Mixture-of-Experts for Fake News Detection
Yifan Liu
Y. Liu
Zehan Li
Ruichen Yao
Yang Zhang
Dong Wang
MoE
36
0
0
21 Jan 2025
A generalizable 3D framework and model for self-supervised learning in medical imaging
Tony Xu
Sepehr Hosseini
Chris Anderson
Anthony Rinaldi
Rahul G. Krishnan
Anne L. Martel
Maged Goubran
MedIm
56
3
0
20 Jan 2025
How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks?
Wenxuan Li
Alan L. Yuille
Zongwei Zhou
MedIm
46
8
0
20 Jan 2025
Enhancing SAR Object Detection with Self-Supervised Pre-training on Masked Auto-Encoders
Xinyang Pu
Feng Xu
39
0
0
20 Jan 2025
MetaNeRV: Meta Neural Representations for Videos with Spatial-Temporal Guidance
Jialong Guo
Ke Liu
Jiangchao Yao
Zhihua Wang
Jiajun Bu
Haishuai Wang
AI4TS
46
1
0
20 Jan 2025
Enhancing Graph Self-Supervised Learning with Graph Interplay
Xinjian Zhao
Wei Pang
Xiangru Jian
Yaoyao Xu
Chaolong Ying
Tianshu Yu
53
0
0
17 Jan 2025
Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation
Xingxin He
Yifan Hu
Zhaoye Zhou
Mohamed Jarraya
Fang Liu
VLM
MedIm
45
2
0
17 Jan 2025
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
Sumit Chaturvedi
Mengwei Ren
Yannick Hold-Geoffroy
Jingyuan Liu
Julie Dorsey
Zhixin Shu
DiffM
66
0
0
17 Jan 2025
Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM
Xin Hu
Janet Wang
Jihun Hamm
R. Yotsu
Zhengming Ding
100
0
0
17 Jan 2025
EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision
Diego A. Velázquez
Pau Rodríguez López
Sergio Alonso
Josep M. Gonfaus
Jordi Gonzalez
Gerardo Richarte
Javier Marin
Yoshua Bengio
Alexandre Lacoste
60
0
0
14 Jan 2025
Code and Pixels: Multi-Modal Contrastive Pre-training for Enhanced Tabular Data Analysis
Kankana Roy
Lars Krämer
Sebastian Domaschke
Malik Haris
Roland Aydin
Fabian Isensee
Martin Held
48
0
0
13 Jan 2025
EdgeTAM: On-Device Track Anything Model
Chong Zhou
Chenchen Zhu
Yunyang Xiong
Saksham Suri
Fanyi Xiao
...
Raghuraman Krishnamoorthi
Bo Dai
Chen Change Loy
Vikas Chandra
Bilge Soran
VLM
65
0
0
13 Jan 2025
RoboHorizon: An LLM-Assisted Multi-View World Model for Long-Horizon Robotic Manipulation
Zixuan Chen
Jing Huo
Yangtao Chen
Yang Gao
43
2
0
11 Jan 2025
AI-powered virtual tissues from spatial proteomics for clinical diagnostics and biomedical discovery
Johann Wenckstern
Eeshaan Jain
Kiril Vasilev
Matteo Pariset
Andreas Wicki
Gabriele Gut
Charlotte Bunne
36
1
0
10 Jan 2025