ResearchTrend.AI
Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM
arXiv: 2111.06377

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,778 papers shown
Title
Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision
M. Taher
Michael B. Gotway
Jianming Liang
MedIm
101
6
0
24 Apr 2024
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Sachin Mehta
Maxwell Horton
Fartash Faghri
Mohammad Hossein Sekhavat
Mahyar Najibi
Mehrdad Farajtabar
Oncel Tuzel
Mohammad Rastegari
VLM, CLIP
69
7
0
24 Apr 2024
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis
Jiaxin Zhuang
Linshan Wu
Qiong Wang
V. Vardhanabhuti
Lin Luo
Hao Chen
128
4
0
24 Apr 2024
Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches
Yi Li
Yunan Wu
Aggelos K. Katsaggelos
35
1
0
23 Apr 2024
LaneCorrect: Self-supervised Lane Detection
Ming-Jun Nie
Xinyue Cai
Han Xu
Li Zhang
SSL
122
4
0
23 Apr 2024
Towards noise contrastive estimation with soft targets for conditional models
J. Hugger
Virginie Uhlmann
UQCV
90
1
0
22 Apr 2024
OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks
Sophia Sirko-Galouchenko
Alexandre Boulch
Spyros Gidaris
Andrei Bursuc
Antonín Vobecký
Patrick Pérez
Renaud Marlet
3DPC
103
7
0
22 Apr 2024
MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets
Zeyu Li
Ruitong Gan
Chuanchen Luo
Yuxi Wang
Jiaheng Liu
Ziwei Zhu
Qing Li
Xucheng Yin
Zhaoxiang Zhang
Junran Peng
DiffM
89
1
0
22 Apr 2024
Neural Radiance Field in Autonomous Driving: A Survey
Lei He
Leheng Li
Wenchao Sun
Zeyu Han
Yichen Liu
Sifa Zheng
Jianqiang Wang
Keqiang Li
106
15
0
22 Apr 2024
Masked Latent Transformer with the Random Masking Ratio to Advance the Diagnosis of Dental Fluorosis
Yun Wu
Hao Xu
Maohua Gu
Zhongchuan Jiang
Jun Xu
Youliang Tian
MedIm, AI4CE
38
1
0
21 Apr 2024
Composing Pre-Trained Object-Centric Representations for Robotics From "What" and "Where" Foundation Models
Junyao Shi
Jianing Qian
Yecheng Jason Ma
Dinesh Jayaraman
OCL
79
6
0
20 Apr 2024
Vim4Path: Self-Supervised Vision Mamba for Histopathology Images
Ali Nasiri-Sarvi
Vincent Quoc-Huy Trinh
Hassan Rivaz
Mahdi S. Hosseini
69
8
0
20 Apr 2024
Automatic Cranial Defect Reconstruction with Self-Supervised Deep Deformable Masked Autoencoders
Marek Wodzinski
D. Hemmerling
Mateusz Daniol
49
3
0
19 Apr 2024
A Large-scale Medical Visual Task Adaptation Benchmark
Shentong Mo
Xufang Luo
Yansen Wang
Dongsheng Li
MedIm
59
2
0
19 Apr 2024
Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation
Wenxuan Zhang
Youssef Mohamed
Guohao Li
Philip Torr
Adel Bibi
Mohamed Elhoseiny
CLL
102
5
0
19 Apr 2024
Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
Qiang He
Dinesh Manocha
Meng Fang
S. Maghsudi
93
3
0
19 Apr 2024
Show and Grasp: Few-shot Semantic Segmentation for Robot Grasping through Zero-shot Foundation Models
Leonardo Barcellona
Alberto Bacchin
Matteo Terreran
Emanuele Menegatti
Stefano Ghidoni
78
2
0
19 Apr 2024
Improving Chinese Character Representation with Formation Tree
Yang Hong
Yinfei Li
Xiaojun Qiao
Rui Li
Junsong Zhang
VLM
79
1
0
19 Apr 2024
Adaptive Memory Replay for Continual Learning
James Seale Smith
Lazar Valkov
Shaunak Halbe
V. Gutta
Rogerio Feris
Z. Kira
Leonid Karlinsky
86
6
0
18 Apr 2024
On the Content Bias in Fréchet Video Distance
Jason S. Hoffman
Aniruddha Mahapatra
Gaurav Parmar
Jun-Yan Zhu
Jia-Bin Huang
EGVM
88
20
0
18 Apr 2024
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation
Shengcao Cao
Jiuxiang Gu
Jason Kuen
Hao Tan
Ruiyi Zhang
Handong Zhao
A. Nenkova
Liangyan Gui
Tong Sun
Yu Wang
VLM, OCL
141
3
0
18 Apr 2024
Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan
Zongze Wu
Richard Zhang
Eli Shechtman
Daniel Cohen-Or
Taesung Park
Michael Gharbi
90
11
0
18 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
115
1
0
18 Apr 2024
How to Benchmark Vision Foundation Models for Semantic Segmentation?
Tommie Kerssies
Daan de Geus
Gijs Dubbelman
VLM
99
9
0
18 Apr 2024
When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery
Yiqun Xie
Zhihao Wang
Weiye Chen
Zhili Li
Xiaowei Jia
Yanhua Li
Ruichen Wang
Kangyang Chai
Ruohan Li
Sergii Skakun
VLM
82
3
0
17 Apr 2024
Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection
Deepti Hegde
Suhas Lohit
Kuan-Chuan Peng
Michael J. Jones
Vishal M. Patel
3DPC
71
0
0
17 Apr 2024
Pretraining Billion-scale Geospatial Foundational Models on Frontier
A. Tsaris
P. Dias
Abhishek Potnis
Junqi Yin
Feiyi Wang
D. Lunga
AI4CE
47
5
0
17 Apr 2024
On the Scalability of GNNs for Molecular Graphs
Maciej Sypetkowski
Frederik Wenkel
Farimah Poursafaei
Nia Dickson
Karush Suri
Philip Fradkin
Dominique Beaini
GNN, AI4CE
111
18
0
17 Apr 2024
Predicting Long-horizon Futures by Conditioning on Geometry and Time
Tarasha Khurana
Deva Ramanan
AI4TS
93
0
0
17 Apr 2024
JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA
Zeyu Zhang
Xuyin Qi
Mingxi Chen
Guangxi Li
Ryan Pham
...
Zhibin Liao
O. Siggs
Robert Mclaughlin
Jamie Craig
Minh-Son To
78
12
0
17 Apr 2024
MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training
Jiayang Li
Junjun Jiang
Pengwei Liang
Jiayi Ma
Liqiang Nie
110
3
0
17 Apr 2024
LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Yuchi Wang
Shuhuai Ren
Rundong Gao
Linli Yao
Qingyan Guo
Kaikai An
Jianhong Bai
Xu Sun
DiffM, VLM
106
9
0
16 Apr 2024
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton Earnshaw
74
35
0
16 Apr 2024
Anomaly Correction of Business Processes Using Transformer Autoencoder
Ziyou Gong
X. Fang
Ping Wu
41
0
0
16 Apr 2024
Offline Trajectory Optimization for Offline Reinforcement Learning
Ziqi Zhao
Zhaochun Ren
Liu Yang
Fajie Yuan
Zhumin Chen
Jun Ma
Xin Xin
OffRL
85
1
0
16 Apr 2024
CryoMAE: Few-Shot Cryo-EM Particle Picking with Masked Autoencoders
Chentianye Xu
Xueying Zhan
Min Xu
40
4
0
15 Apr 2024
Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels
Amaya Dharmasiri
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
VLM, 3DPC
63
1
0
15 Apr 2024
EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar
Arya Bakhtiar
Danny Tran
Antonio Loquercio
Jathushan Rajasegaran
Yann LeCun
Amir Globerson
Trevor Darrell
EgoV
98
5
0
15 Apr 2024
Diffscaler: Enhancing the Generative Prowess of Diffusion Transformers
Nithin Gopalakrishnan Nair
Jeya Maria Jose Valanarasu
Vishal M. Patel
91
1
0
15 Apr 2024
XoFTR: Cross-modal Feature Matching Transformer
Önder Tuzcuoglu
Aybora Köksal
Bugra Sofu
Sinan Kalkan
A. Aydin Alatan
ViT
81
11
0
15 Apr 2024
In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation
Han Xue
Qianru Sun
Li Song
Wenjun Zhang
Zhiwu Huang
MLLM
72
0
0
15 Apr 2024
State Space Model for New-Generation Network Alternative to Transformers: A Survey
Tianlin Li
Shiao Wang
Yuhe Ding
Yuehang Li
Wentao Wu
...
Bowei Jiang
Chenglong Li
Yaowei Wang
Yonghong Tian
Jin Tang
Mamba
147
53
0
15 Apr 2024
Masked and Shuffled Blind Spot Denoising for Real-World Images
Hamadi Chihaoui
Paolo Favaro
57
7
0
15 Apr 2024
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model
Han Gu
Haoyu Dong
Jichen Yang
Maciej A. Mazurowski
MedIm, VLM
146
21
0
15 Apr 2024
Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
Diana-Nicoleta Grigore
Mariana-Iuliana Georgescu
J. A. Justo
T. Johansen
Andreea-Iuliana Ionescu
Radu Tudor Ionescu
81
0
0
14 Apr 2024
Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics
Haosong Peng
Wei Feng
Hao Li
Yufeng Zhan
Qihua Zhou
Yuanqing Xia
56
3
0
14 Apr 2024
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild
K. Chumachenko
Alexandros Iosifidis
Moncef Gabbouj
70
8
0
13 Apr 2024
MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images
Yingjie Xi
Boyuan Cheng
Jingyao Cai
Jian Jun Zhang
Xiaosong Yang
MedIm
87
1
0
13 Apr 2024
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
Yuwei Tang
Zhenyi Lin
Qilong Wang
Pengfei Zhu
Qinghua Hu
69
16
0
13 Apr 2024
Label-free Anomaly Detection in Aerial Agricultural Images with Masked Image Modeling
Sambal Shikhar
Anupam Sobti
59
1
0
13 Apr 2024