Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Jianxiong Li
Jinliang Zheng
Yinan Zheng
Liyuan Mao
Xiaoming Hu
...
Jihao Liu
Yu Liu
Jingjing Liu
Ya Zhang
Xianyuan Zhan
LM&Ro, OffRL
93
10
0
28 Feb 2024
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
Jiequan Cui
Beier Zhu
Xin Wen
Xiaojuan Qi
Bei Yu
Haiqi Zhang
77
8
0
28 Feb 2024
Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
Yanzuo Lu
Manlin Zhang
Andy J. Ma
Xiaohua Xie
Jian-Huang Lai
DiffM
68
24
0
28 Feb 2024
Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization
Han Guo
Ramtin Hosseini
Ruiyi Zhang
Sai Ashish Somayajula
Ranak Roy Chowdhury
Rajesh K. Gupta
Pengtao Xie
98
0
0
28 Feb 2024
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Zekun Qi
Runpei Dong
Shaochen Zhang
Haoran Geng
Chunrui Han
Zheng Ge
Li Yi
Kaisheng Ma
204
63
0
27 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
129
81
0
27 Feb 2024
MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation
Hanan Gani
Muzammal Naseer
Fahad Khan
Salman Khan
67
0
0
27 Feb 2024
Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling
David S. W. Williams
Matthew Gadd
Paul Newman
Daniele De Martini
UQCV
37
1
0
27 Feb 2024
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder
Jiaqi Wang
Zhenxi Song
Zhengyu Ma
Xipeng Qiu
Min Zhang
Zhiguo Zhang
158
8
0
27 Feb 2024
LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning
Shentong Mo
Yansen Wang
Xufang Luo
Dongsheng Li
VLM
99
2
0
27 Feb 2024
LocalGCL: Local-aware Contrastive Learning for Graphs
Haojun Jiang
Jiawei Sun
Jie Li
Chentao Wu
SSL
46
1
0
27 Feb 2024
Does Negative Sampling Matter? A Review with Insights into its Theory and Applications
Zhen Yang
Ming Ding
Tinglin Huang
Yukuo Cen
Junshuai Song
Bin Xu
Yuxiao Dong
Jie Tang
122
12
0
27 Feb 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM, VGen, EGVM
197
300
0
27 Feb 2024
Lane2Seq: Towards Unified Lane Detection via Sequence Generation
Kunyang Zhou
106
5
0
27 Feb 2024
HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields
Haozhe Qi
Chen Zhao
Mathieu Salzmann
Alexander Mathis
3DH
132
11
0
26 Feb 2024
ConSept: Continual Semantic Segmentation via Adapter-based Vision Transformer
Bowen Dong
Guanglei Yang
W. Zuo
Lei Zhang
93
1
0
26 Feb 2024
Generative Pretrained Hierarchical Transformer for Time Series Forecasting
Zhiding Liu
Jiqian Yang
Mingyue Cheng
Yucong Luo
Zhi Li
AI4TS
79
19
0
26 Feb 2024
BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM
Li Zhang
Youwei Liang
Ruiyi Zhang
Amirhosein Javadi
Pengtao Xie
VLM
74
9
0
26 Feb 2024
StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention
SeungWon Seo
Suho Lee
Sangheum Hwang
83
0
0
25 Feb 2024
Key Design Choices in Source-Free Unsupervised Domain Adaptation: An In-depth Empirical Analysis
Andrea Maracani
Raffaello Camoriano
Elisa Maiettini
Davide Talon
Lorenzo Rosasco
Lorenzo Natale
66
1
0
25 Feb 2024
Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation
X. Lei
Min Wang
Wen-gang Zhou
Li Li
Houqiang Li
116
6
0
25 Feb 2024
Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning
Wuyang Chen
Jialin Song
Pu Ren
Shashank Subramanian
Dmitriy Morozov
Michael W. Mahoney
AI4CE
121
12
0
24 Feb 2024
A Study of Shape Modeling Against Noise
Cheng Long
Adrian Barbu
MedIm
41
0
0
23 Feb 2024
Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving
Yichen Xie
Hongge Chen
Gregory P. Meyer
Yong Jae Lee
Eric M. Wolff
Masayoshi Tomizuka
Wei Zhan
Yuning Chai
Xin Huang
3DPC
78
1
0
23 Feb 2024
Attention-aware Semantic Communications for Collaborative Inference
Jiwoong Im
Nayoung Kwon
Taewoo Park
Jiheon Woo
Jaeho Lee
Yongjune Kim
80
2
0
23 Feb 2024
Attention-Guided Masked Autoencoders For Learning Image Representations
Leon Sick
Dominik Engel
Pedro Hermosilla
Timo Ropinski
61
1
0
23 Feb 2024
Spatially-Aware Transformer for Embodied Agents
Junmo Cho
Jaesik Yoon
Sungjin Ahn
82
1
0
23 Feb 2024
Label-efficient multi-organ segmentation with a diffusion model
Yongzhi Huang
Jinxin Zhu
Haseeb Hassan
Liyilei Su
Jingyu Li
Binding Huang
Yun Peng
Jingyu Li
Jun Ma
Bingding Huang
DiffM, MedIm
93
0
0
23 Feb 2024
The Common Stability Mechanism behind most Self-Supervised Learning Approaches
Abhishek Jha
Matthew B. Blaschko
Yuki M. Asano
Tinne Tuytelaars
SSL
40
3
0
22 Feb 2024
CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation
Jun Wang
Yuzhe Qin
Kaiming Kuang
Yigit Korkmaz
Akhilan Gurumoorthy
Hao Su
Xiaolong Wang
95
20
0
22 Feb 2024
Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning
Johnathan Xie
Yoonho Lee
Annie S. Chen
Chelsea Finn
81
3
0
22 Feb 2024
Zero-Shot Pediatric Tuberculosis Detection in Chest X-Rays using Self-Supervised Learning
Daniel Capellán-Martín
Abhijeet Parida
Juan J. Gómez-Valverde
Ramon Sanchez-Jacob
Pooneh Roshanitabrizi
M. Linguraru
María J. Ledesma-Carbayo
S. M. Anwar
69
2
0
22 Feb 2024
Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot
Fabien Baradel
M. Armando
Salma Galaaoui
Romain Brégier
Philippe Weinzaepfel
Grégory Rogez
Thomas Lucas
3DH
104
21
0
22 Feb 2024
CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion
Zijun Long
George Killick
Lipeng Zhuang
Gerardo Aragon Camarasa
Zaiqiao Meng
R. McCreadie
VLM
94
2
0
22 Feb 2024
Reading Relevant Feature from Global Representation Memory for Visual Object Tracking
Xinyu Zhou
Pinxue Guo
Lingyi Hong
Jinglun Li
Wei Zhang
Weifeng Ge
Wenqiang Zhang
102
12
0
22 Feb 2024
MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding
Lirong Wu
Yijun Tian
Yufei Huang
Siyuan Li
Haitao Lin
Nitesh Chawla
Stan Z. Li
81
24
0
22 Feb 2024
A Simple Framework Uniting Visual In-context Learning with Masked Image Modeling to Improve Ultrasound Segmentation
Yuyue Zhou
B. Felfeliyan
Shrimanti Ghosh
Jessica Knight
Fatima Alves-Pereira
Christopher Keen
Jessica Küpper
A. Hareendranathan
Jacob L. Jaremko
84
0
0
22 Feb 2024
Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding
Yu-Qi Yang
Yufeng Guo
Yang Liu
3DPC
101
2
0
22 Feb 2024
Compression Robust Synthetic Speech Detection Using Patched Spectrogram Transformer
Amit Kumar Singh Yadav
Ziyue Xiang
Kratika Bhagtani
Paolo Bestagini
Stefano Tubaro
Edward J. Delp
ViT
72
2
0
22 Feb 2024
Subobject-level Image Tokenization
Delong Chen
Samuel Cahyawijaya
Jianfeng Liu
Baoyuan Wang
Pascale Fung
VLM, OCL
282
9
0
22 Feb 2024
Real-time 3D-aware Portrait Editing from a Single Image
Qingyan Bai
Zifan Shi
Yinghao Xu
Hao Ouyang
Qiuyu Wang
Ceyuan Yang
Xuan Wang
Gordon Wetzstein
Yujun Shen
Qifeng Chen
3DH, DiffM
122
10
0
21 Feb 2024
Multi-scale Spatio-temporal Transformer-based Imbalanced Longitudinal Learning for Glaucoma Forecasting from Irregular Time Series Images
Xikai Yang
Jian Wu
Xi Wang
Yuchen Yuan
N. Wang
Pheng-Ann Heng
AI4TS, MedIm
56
0
0
21 Feb 2024
Slot-VLM: SlowFast Slots for Video-Language Modeling
Jiaqi Xu
Cuiling Lan
Wenxuan Xie
Xuejin Chen
Yan Lu
MLLM, VLM
46
7
0
20 Feb 2024
VGMShield: Mitigating Misuse of Video Generative Models
Yan Pang
Yang Zhang
Yang Zhang
Tianhao Wang
119
3
0
20 Feb 2024
GOOD: Towards Domain Generalized Orientated Object Detection
Qi Bi
Beichen Zhou
Jingjun Yi
Wei Ji
Haolan Zhan
Gui-Song Xia
ObjD, OOD
141
2
0
20 Feb 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
127
36
0
20 Feb 2024
Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
Yuan Yuan
Chenyang Shao
Jingtao Ding
Depeng Jin
Yong Li
AI4TS, AI4CE
124
20
0
19 Feb 2024
PhySU-Net: Long Temporal Context Transformer for rPPG with Self-Supervised Pre-training
M. Savic
Guoying Zhao
ViT
77
5
0
19 Feb 2024
WildFake: A Large-scale Challenging Dataset for AI-Generated Images Detection
Yan Hong
Jianfu Zhang
141
13
0
19 Feb 2024
Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators
Benedikt Alkin
Andreas Fürst
Simon Schmid
Lukas Gruber
Markus Holzleitner
Johannes Brandstetter
PINN, AI4CE
292
13
0
19 Feb 2024