Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
Autoencoding Conditional Neural Processes for Representation Learning
Victor Prokhorov
Ivan Titov
N. Siddharth
BDL
70
0
0
29 May 2023
Explicit Visual Prompting for Universal Foreground Segmentations
Weihuang Liu
Xi Shen
Chi-Man Pun
Xiaodong Cun
VPVLM, VLM
80
14
0
29 May 2023
DiffRate : Differentiable Compression Rate for Efficient Vision Transformers
Mengzhao Chen
Wenqi Shao
Peng Xu
Mingbao Lin
Kaipeng Zhang
Yong Li
Rongrong Ji
Yu Qiao
Ping Luo
ViT
103
46
0
29 May 2023
Streaming Audio Transformers for Online Audio Tagging
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
89
4
0
29 May 2023
Reconstructing Sea Surface Temperature Images: A Masked Autoencoder Approach for Cloud Masking and Reconstruction
Angelina Agabin
J. L. U. O. California
15
3
0
28 May 2023
Self-attention Dual Embedding for Graphs with Heterophily
Yurui Lai
Taiyan Zhang
Rui Fan
GNN
108
0
0
28 May 2023
Caterpillar: A Pure-MLP Architecture with Shifted-Pillars-Concatenation
J. Sun
Xiaoshuang Shi
Zhiyuan Weng
Kaidi Xu
Jikang Cheng
Xiao-lan Zhu
MLLM
72
2
0
28 May 2023
A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining
Shengchao Liu
Weitao Du
Zhiming Ma
Hongyu Guo
Jian Tang
107
33
0
28 May 2023
Translatotron 3: Speech to Speech Translation with Monolingual Data
Eliya Nachmani
Alon Levkovitch
Yi-Yang Ding
Chulayutsh Asawaroengchai
Heiga Zen
Michelle Tadmor Ramanovich
91
15
0
27 May 2023
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
Hongjie Wang
Bhishma Dedhia
N. Jha
ViT, VLM
133
29
0
27 May 2023
On the Importance of Backbone to the Adversarial Robustness of Object Detectors
Xiao-Li Li
Hang Chen
Xiaolin Hu
AAML
134
4
0
27 May 2023
Robust Lane Detection through Self Pre-training with Masked Sequential Autoencoders and Fine-tuning with Customized PolyLoss
Ruohan Li
Yongqi Dong
78
4
0
26 May 2023
Im-Promptu: In-Context Composition from Image Prompts
Bhishma Dedhia
Michael Chang
Jake C. Snell
Thomas Griffiths
N. Jha
LRM, MLLM
105
2
0
26 May 2023
Contrast, Attend and Diffuse to Decode High-Resolution Images from Brain Activities
Jingyuan Sun
Mingxiao Li
Zijiao Chen
Yunhao Zhang
Shaonan Wang
Marie-Francine Moens
DiffM
127
33
0
26 May 2023
Detect Any Shadow: Segment Anything for Video Shadow Detection
Yonghui Wang
Wen-gang Zhou
Yunyao Mao
Houqiang Li
VLM
98
24
0
26 May 2023
Future-conditioned Unsupervised Pretraining for Decision Transformer
Zhihui Xie
Zichuan Lin
Deheng Ye
Qiang Fu
Wei Yang
Shuai Li
OffRL, OnRL
92
23
0
26 May 2023
EgoHumans: An Egocentric 3D Multi-Human Benchmark
Rawal Khirodkar
Aayush Bansal
Lingni Ma
Richard Newcombe
Minh Vo
Kris Kitani
EgoV
104
36
0
25 May 2023
Image as First-Order Norm+Linear Autoregression: Unveiling Mathematical Invariance
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Lu Yuan
Zicheng Liu
Youzuo Lin
101
2
0
25 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
113
79
0
25 May 2023
MixFormerV2: Efficient Fully Transformer Tracking
Yutao Cui
Tian-Shu Song
Gangshan Wu
Liming Wang
90
59
0
25 May 2023
Weakly Supervised Vision-and-Language Pre-training with Relative Representations
Chi Chen
Peng Li
Maosong Sun
Yang Liu
67
2
0
24 May 2023
RoMa: Robust Dense Feature Matching
Johan Edstedt
Qiyu Sun
Georg Bökman
Maarten Wadenback
Michael Felsberg
3DV
98
110
0
24 May 2023
Learning high-level visual representations from a child's perspective without strong inductive biases
A. Orhan
Brenden M. Lake
SSL
91
19
0
24 May 2023
ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers
J. Yao
Xinggang Wang
Shusheng Yang
Baoyuan Wang
ViT
108
64
0
24 May 2023
Delving Deeper into Data Scaling in Masked Image Modeling
Cheng Lu
Xiaojie Jin
Qibin Hou
Jun Hao Liew
Ming-Ming Cheng
Jiashi Feng
71
4
0
24 May 2023
A Joint Time-frequency Domain Transformer for Multivariate Time Series Forecasting
Yushu Chen
Shengzhuo Liu
Jinzhe Yang
Hao Jing
Wenlai Zhao
Guang-Wu Yang
AI4TS
50
22
0
24 May 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL, SSL
61
5
0
23 May 2023
Siamese Masked Autoencoders
Agrim Gupta
Jiajun Wu
Jia Deng
Li Fei-Fei
92
56
0
23 May 2023
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Emanuele Bugliarello
Aida Nematzadeh
Lisa Anne Hendricks
SSL
110
5
0
23 May 2023
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky
Neha Verma
Philipp Koehn
Matt Post
95
16
0
23 May 2023
Masked Path Modeling for Vision-and-Language Navigation
Zi-Yi Dou
Feng Gao
Nanyun Peng
LM&Ro
83
3
0
23 May 2023
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
64
2
0
23 May 2023
DUBLIN -- Document Understanding By Language-Image Network
Kriti Aggarwal
Aditi Khandelwal
Kumar Tanmay
Owais Mohammed Khan
Qiang Liu
Monojit Choudhury
Hardik Hansrajbhai Chauhan
Subhojit Som
Vishrav Chaudhary
Saurabh Tiwary
ObjD, VLM
117
0
0
23 May 2023
A multimodal method based on cross-attention and convolution for postoperative infection diagnosis
Xianjie Liu
Hon-Yi Shi
39
0
0
23 May 2023
Reparo: Loss-Resilient Generative Codec for Video Conferencing
Tianhong Li
Vibhaalakshmi Sivaraman
Pantea Karimi
Lijie Fan
M. Alizadeh
Dina Katabi
72
7
0
23 May 2023
Weakly Supervised 3D Open-vocabulary Segmentation
Kunhao Liu
Fangneng Zhan
Jiahui Zhang
Muyu Xu
Yingchen Yu
Abdulmotaleb El Saddik
Christian Theobalt
Eric P. Xing
Shijian Lu
125
70
0
23 May 2023
Decoupled Kullback-Leibler Divergence Loss
Jiequan Cui
Zhuotao Tian
Zhisheng Zhong
Xiaojuan Qi
Bei Yu
Hanwang Zhang
83
46
0
23 May 2023
NORM: Knowledge Distillation via N-to-One Representation Matching
Xiaolong Liu
Lujun Li
Chao Li
Anbang Yao
116
71
0
23 May 2023
Know Your Self-supervised Learning: A Survey on Image-based Generative and Discriminative Training
Utku Ozbulak
Hyun Jung Lee
Beril Boga
Esla Timothy Anzaku
Ho-min Park
Arnout Van Messem
W. D. Neve
J. Vankerschaver
DiffM
112
38
0
23 May 2023
A Dive into SAM Prior in Image Restoration
Zeyu Xiao
Jiawang Bai
Zhihe Lu
Zhiwei Xiong
73
17
0
23 May 2023
Generalizable Synthetic Image Detection via Language-guided Contrastive Learning
Haiwei Wu
Jiantao Zhou
Shile Zhang
212
30
0
23 May 2023
Tied-Augment: Controlling Representation Similarity Improves Data Augmentation
Emirhan Kurtuluş
Zichao Li
Yann N. Dauphin
E. D. Cubuk
75
3
0
22 May 2023
Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters
Isaac Corley
Caleb Robinson
Rahul Dodhia
J. L. Ferres
Peyman Najafirad
122
19
0
22 May 2023
Efficient Large-Scale Visual Representation Learning And Evaluation
Eden Dolev
A. Awad
Denisa Roberts
Zahra Ebrahimzadeh
Marcin Mejran
Vaibhav Malpani
Mahir Yavuz
96
0
0
22 May 2023
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
Yang Liu
Muzhi Zhu
Hengtao Li
Hao Chen
Xinlong Wang
Chunhua Shen
VLM, MLLM
181
90
0
22 May 2023
Restore Anything Pipeline: Segment Anything Meets Image Restoration
Jiaxi Jiang
Christian Holz
VLM
89
8
0
22 May 2023
Enhanced Meta Label Correction for Coping with Label Corruption
Mitchell Keren Taraday
Chaim Baskin
86
1
0
22 May 2023
Contrastive Predictive Autoencoders for Dynamic Point Cloud Self-Supervised Learning
Xiaoxiao Sheng
Zhiqiang Shen
Gang Xiao
3DPC, SSL
50
6
0
22 May 2023
Label Smarter, Not Harder: CleverLabel for Faster Annotation of Ambiguous Image Classification with Higher Quality
Lars Schmarje
Vasco Grossmann
Tim Michels
Jakob Nazarenus
M. Santarossa
Claudius Zelenka
Reinhard Koch
81
3
0
22 May 2023
Bi-ViT: Pushing the Limit of Vision Transformer Quantization
Yanjing Li
Sheng Xu
Mingbao Lin
Xianbin Cao
Chuanjian Liu
Xiao Sun
Baochang Zhang
ViT, MQ
97
11
0
21 May 2023