Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM
arXiv: 2111.06377

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
Title
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
Ido Amos
Jonathan Berant
Ankit Gupta
120
29
0
04 Oct 2023
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang
Ivan Tang
Ray Gu
Dong Wang
Eric Zhang
Bin Zhao
Xuelong Li
3DPC
151
22
0
04 Oct 2023
NOLA: Compressing LoRA using Linear Combination of Random Basis
Soroush Abbasi Koohpayegani
K. Navaneet
Parsa Nooralinejad
Soheil Kolouri
Hamed Pirsiavash
145
16
0
04 Oct 2023
SlowFormer: Universal Adversarial Patch for Attack on Compute and Energy Efficiency of Inference Efficient Vision Transformers
K. Navaneet
Soroush Abbasi Koohpayegani
Essam Sleiman
Hamed Pirsiavash
AAML, ViT
67
3
0
04 Oct 2023
FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling
Yan Luo
Muhammad Osama Khan
Yu Tian
Minfei Shi
Zehao Dou
T. Elze
Yi Fang
Mengyu Wang
93
10
0
03 Oct 2023
Understanding Masked Autoencoders From a Local Contrastive Perspective
Xiaoyu Yue
Lei Bai
Meng Wei
Jiangmiao Pang
Xihui Liu
Luping Zhou
Wanli Ouyang
SSL
114
4
0
03 Oct 2023
MFOS: Model-Free & One-Shot Object Pose Estimation
Jongmin Lee
Yohann Cabon
Romain Brégier
Sungjoo Yoo
Jérôme Revaud
ViT
74
6
0
03 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM, MLLM
211
229
0
03 Oct 2023
Selective Feature Adapter for Dense Vision Transformers
XueQing Deng
Qi Fan
Xiaojie Jin
Linjie Yang
Peng Wang
75
0
0
03 Oct 2023
AI-Generated Images as Data Source: The Dawn of Synthetic Era
Zuhao Yang
Fangneng Zhan
Kunhao Liu
Muyu Xu
Shijian Lu
EGVM
111
20
0
03 Oct 2023
Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation
Zhangsihao Yang
Mengwei Ren
Kaize Ding
Guido Gerig
Yalin Wang
SSL
69
5
0
02 Oct 2023
Operator Learning Meets Numerical Analysis: Improving Neural Networks through Iterative Methods
E. Zappala
Daniel Levine
Shiyang Zhang
S. Rizvi
Sacha Lévy
David van Dijk
69
1
0
02 Oct 2023
H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation
Yanjie Ze
Yuyao Liu
Ruizhe Shi
Jiaxin Qin
Zhecheng Yuan
Jiashun Wang
Huazhe Xu
120
1
0
02 Oct 2023
ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
Xinhao Li
Yuhan Zhu
Limin Wang
VLM
117
9
0
02 Oct 2023
Generating 3D Brain Tumor Regions in MRI using Vector-Quantization Generative Adversarial Networks
Meng Zhou
Matthias W. Wagner
U. Tabori
C. Hawkins
B. Ertl-Wagner
Farzad Khalvati
MedIm
117
5
0
02 Oct 2023
Self-supervised Learning for Anomaly Detection in Computational Workflows
Hongwei Jin
Krishnan Raghavan
George Papadimitriou
Cong Wang
A. Mandal
Ewa Deelman
Prasanna Balaprakash
78
1
0
02 Oct 2023
Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis
Jue Jiang
Aneesh Rangnekar
Chloe Choi
Harini Veeraraghavan
MedIm
83
0
0
02 Oct 2023
Segment Any Building
Lei Li
86
10
0
02 Oct 2023
Modularity in Deep Learning: A Survey
Haozhe Sun
Isabelle Guyon
MoMe
113
3
0
02 Oct 2023
Can Pre-trained Networks Detect Familiar Out-of-Distribution Data?
Atsuyuki Miyai
Qing Yu
Go Irie
Kiyoharu Aizawa
OODD
211
6
0
02 Oct 2023
Large Scale Masked Autoencoding for Reducing Label Requirements on SAR Data
Matt Allen
Francisco Dorr
Joseph A. Gallego-Mejia
Laura Martínez-Ferrer
Anna Jungbluth
F. Kalaitzis
Raúl Ramos-Pollán
88
9
0
02 Oct 2023
Win-Win: Training High-Resolution Vision Transformers from Two Windows
Vincent Leroy
Jérôme Revaud
Thomas Lucas
Philippe Weinzaepfel
ViT
111
2
0
01 Oct 2023
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
127
460
0
30 Sep 2023
Structural Adversarial Objectives for Self-Supervised Representation Learning
Xiao Zhang
Michael Maire
118
1
0
30 Sep 2023
Domain-Controlled Prompt Learning
Qinglong Cao
Zhengqin Xu
Yuantian Chen
Chao Ma
Xiaokang Yang
VLM
100
18
0
30 Sep 2023
Pixel-Inconsistency Modeling for Image Manipulation Localization
Chenqi Kong
Anwei Luo
Shiqi Wang
Haoliang Li
Anderson de Rezende Rocha
Alex C. Kot
AAML
88
17
0
30 Sep 2023
Region-centric Image-Language Pretraining for Open-Vocabulary Detection
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD, VLM
82
4
0
29 Sep 2023
Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
Myeongseob Ko
Ming Jin
Chenguang Wang
Ruoxi Jia
104
29
0
29 Sep 2023
Towards Free Data Selection with General-Purpose Models
Alessandro Mutti
Mingyu Ding
Patrizia Semeraro
Wei Zhan
86
10
0
29 Sep 2023
Scaling Experiments in Self-Supervised Cross-Table Representation Learning
Maximilian Schambach
Dominique Paul
Wei Le
LMTD
65
2
0
29 Sep 2023
Improving Trajectory Prediction in Dynamic Multi-Agent Environment by Dropping Waypoints
Pranav Singh Chib
Pravendra Singh
85
1
0
29 Sep 2023
Information Flow in Self-Supervised Learning
Zhiyuan Tan
Jingqin Yang
Weiran Huang
Yang Yuan
Yifan Zhang
SSL
83
14
0
29 Sep 2023
MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data
Tianyu Liu
Yuge Wang
Rex Ying
Hongyu Zhao
121
15
0
29 Sep 2023
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
Hao Chen
Jindong Wang
Ankit Shah
Ran Tao
Jianguo Huang
Berfin Şimşek
Masashi Sugiyama
Bhiksha Raj
116
32
0
29 Sep 2023
CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding
Mingming Zhang
Qingjie Liu
Yunhong Wang
104
6
0
28 Sep 2023
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
104
17
0
28 Sep 2023
Visual In-Context Learning for Few-Shot Eczema Segmentation
Monitirtha Dey
S. K. Bhandari
Venugopal Vasudevan
29
2
0
28 Sep 2023
End-to-End (Instance)-Image Goal Navigation through Correspondence as an Emergent Phenomenon
G. Bono
L. Antsfeld
Boris Chidlovskii
Zhi Zheng
Christian Wolf
3DV
72
10
0
28 Sep 2023
Vision Transformers Need Registers
Zilong Chen
Maxime Oquab
Julien Mairal
Huaping Liu
ViT
205
357
0
28 Sep 2023
FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding
Ana Ezquerro
Carlos Gómez-Rodríguez
Kevin Dela Rosa
Derek Hao Hu
45
6
0
28 Sep 2023
Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing
Lu Dai
Liqian Ma
Shenhan Qian
Hao Liu
Ziwei Liu
Hui Xiong
3DH
94
4
0
28 Sep 2023
ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens
Yangyang Guo
Haoyu Zhang
Yongkang Wong
Liqiang Nie
Mohan Kankanhalli
VLM
71
4
0
28 Sep 2023
Feature Normalization Prevents Collapse of Non-contrastive Learning Dynamics
Han Bao
SSL, MLT
106
1
0
28 Sep 2023
Masked Autoencoders are Scalable Learners of Cellular Morphology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton Earnshaw
84
15
0
27 Sep 2023
Graph-level Representation Learning with Joint-Embedding Predictive Architectures
Geri Skenderi
Hang Li
Jiliang Tang
Marco Cristani
AI4TS, GNN
159
5
0
27 Sep 2023
Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback
Teresa Yeo
Oğuzhan Fatih Kar
Zahra Sodagar
Amir Zamir
TTA, OOD
74
4
0
27 Sep 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
58
6
0
27 Sep 2023
Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation
Xin Yuan
Michael Maire
DiffM
77
2
0
27 Sep 2023
SGRec3D: Self-Supervised 3D Scene Graph Learning via Object-Level Scene Reconstruction
Sebastian Koch
Pedro Hermosilla
Narunas Vaskevicius
Mirco Colosi
Timo Ropinski
3DPC, SSL
90
16
0
27 Sep 2023
Improving Facade Parsing with Vision Transformers and Line Integration
Bowen Wang
Jiaxing Zhang
Ran Zhang
Yunqin Li
Liangzhi Li
Yuta Nakashima
ViT
31
18
0
27 Sep 2023