2111.06377
Cited By
v1
v2
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,777 papers shown
Title
Next state prediction gives rise to entangled, yet compositional representations of objects
Tankred Saanum
Luca M. Schulze Buschoff
Peter Dayan
Eric Schulz
OCL
CoGe
OOD
65
1
0
07 Oct 2024
A Simple Image Segmentation Framework via In-Context Examples
Yang Liu
Chenchen Jing
Hengtao Li
Muzhi Zhu
Hao Chen
Xinlong Wang
Chunhua Shen
99
8
0
07 Oct 2024
Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders
Kosta Dakic
Kanchana Thilakarathna
Rodrigo N. Calheiros
Teng Joon Lim
60
0
0
07 Oct 2024
Masked Autoencoder with Swin Transformer Network for Mitigating Electrode Shift in HD-EMG-based Gesture Recognition
Kasra Laamerad
Mehran Shabanpour
Md. Rabiul Islam
Arash Mohammadi
38
0
0
07 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun
Jing Liu
Jikang Cheng
Xiaofeng Zhu
Ping Hu
VLM
143
7
0
07 Oct 2024
Learning De-Biased Representations for Remote-Sensing Imagery
Zichen Tian
Zhaozheng Chen
Qianru Sun
62
0
0
06 Oct 2024
Self-Supervised Anomaly Detection in the Wild: Favor Joint Embeddings Methods
Daniel Otero
Rafael Mateus
Randall Balestriero
51
0
0
05 Oct 2024
Implicit to Explicit Entropy Regularization: Benchmarking ViT Fine-tuning under Noisy Labels
Maria Marrium
Arif Mahmood
Mohammed Bennamoun
NoLa
AAML
102
0
0
05 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David Harwath
126
8
0
05 Oct 2024
IT³: Idempotent Test-Time Training
Nikita Durasov
Assaf Shocher
Doruk Öner
Gal Chechik
Alexei A. Efros
Pascal Fua
OOD
VLM
117
1
0
05 Oct 2024
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features
Benyuan Meng
Qianqian Xu
Zitai Wang
Xiaochun Cao
Qingming Huang
82
7
0
04 Oct 2024
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
Han Lin
Tushar Nagarajan
Nicolas Ballas
Mido Assran
Mojtaba Komeili
Joey Tianyi Zhou
Koustuv Sinha
AI4TS
110
5
0
04 Oct 2024
Self-supervised Spatio-Temporal Graph Mask-Passing Attention Network for Perceptual Importance Prediction of Multi-point Tactility
Dazhong He
Qian Liu
28
0
0
04 Oct 2024
Adaptive Masking Enhances Visual Grounding
Sen Jia
Lei Li
75
0
0
04 Oct 2024
ECHOPulse: ECG controlled echocardio-grams video generation
Yiwei Li
Sekeun Kim
Zihao Wu
Hanqi Jiang
Yi Pan
...
Sifan Song
Yucheng Shi
Tianming Liu
Quanzheng Li
Xiang Li
VGen
69
1
0
04 Oct 2024
Predictive Coding for Decision Transformer
Tung M. Luu
Donghoon Lee
Chang D. Yoo
OffRL
129
2
0
04 Oct 2024
AirLetters: An Open Video Dataset of Characters Drawn in the Air
Rishit Dagli
Guillaume Berger
Joanna Materzynska
Ingo Bax
Roland Memisevic
VGen
67
1
0
03 Oct 2024
Task-Decoupled Image Inpainting Framework for Class-specific Object Remover
Changsuk Oh
H. J. Kim
97
0
0
03 Oct 2024
A Foundation Model for the Solar Dynamics Observatory
James Walsh
Daniel G. Gass
Raul Ramos Pollan
P. Wright
Richard Galvez
Noah Kasmanoff
Jason Naradowsky
Anne Spalding
James Parr
Atılım Güneş Baydin
3DGS
14
0
0
03 Oct 2024
Personalized Federated Learning for Generative AI-Assisted Semantic Communications
Yubo Peng
Feibo Jiang
Li Dong
Kezhi Wang
Kun Yang
80
2
0
03 Oct 2024
Unsupervised Meta-Learning via Dynamic Head and Heterogeneous Task Construction for Few-Shot Classification
Yunchuan Guan
Yu Liu
Ketong Liu
Ke Zhou
Zhiqi Shen
78
1
0
03 Oct 2024
EmbedLLM: Learning Compact Representations of Large Language Models
Richard Zhuang
Tianhao Wu
Zhaojin Wen
Andrew Li
Jiantao Jiao
Kannan Ramchandran
AIFin
69
6
0
03 Oct 2024
BiSSL: Enhancing the Alignment Between Self-Supervised Pretraining and Downstream Fine-Tuning via Bilevel Optimization
Gustav Wagner Zakarias
Lars Kai Hansen
Zheng-Hua Tan
81
0
0
03 Oct 2024
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
T. Pham
Tri Ton
Chang D. Yoo
105
3
0
03 Oct 2024
TAEGAN: Generating Synthetic Tabular Data For Data Augmentation
Jiayu Li
Zilong Zhao
Kevin Yee
Uzair Javaid
Biplab Sikdar
LMTD
73
1
0
02 Oct 2024
Forte : Finding Outliers with Representation Typicality Estimation
Debargha Ganguly
Warren Morningstar
A. Yu
Vipin Chaudhary
OODD
93
2
0
02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
172
3
0
02 Oct 2024
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Anthony Zhou
Zijie Li
Michael Schneier
John R Buchanan Jr
Amir Barati Farimani
AI4CE
DiffM
170
8
0
02 Oct 2024
Pre-training with Synthetic Patterns for Audio
Yuchi Ishikawa
Tatsuya Komatsu
Yoshimitsu Aoki
58
0
0
01 Oct 2024
Domain Aware Multi-Task Pretraining of 3D Swin Transformer for T1-weighted Brain MRI
Jonghun Kim
Mansu Kim
Hyunjin Park
MedIm
ViT
54
0
0
01 Oct 2024
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang
Fuling Wang
Yuehang Li
Qingchuan Ma
Shiao Wang
Bo Jiang
Chuanfu Li
Jin Tang
119
4
0
01 Oct 2024
MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining
Yunze Liu
Li Yi
Mamba
190
3
0
01 Oct 2024
Advancing Medical Radiograph Representation Learning: A Hybrid Pre-training Paradigm with Multilevel Semantic Granularity
Hanqi Jiang
Xixuan Hao
Yuzhou Huang
Chong Ma
Jiaxun Zhang
Yi Pan
Ruimao Zhang
MedIm
175
0
0
01 Oct 2024
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
Lirui Wang
Xinlei Chen
Jialiang Zhao
Kaiming He
73
44
0
30 Sep 2024
AI Foundation Model for Heliophysics: Applications, Design, and Implementation
Sujit Roy
Talwinder Singh
Marcus Freitag
J. Schmude
Rohit Lal
...
Berkay Aydin
Nikolai Pogorelov
Juan Bernabé-Moreno
M. Maskey
Rahul Ramachandran
MedIm
AI4CE
94
0
0
30 Sep 2024
Task-Oriented Pre-Training for Drivable Area Detection
Fulong Ma
Guoyang Zhao
Weiqing Qi
Ming Liu
Jun Ma
VLM
66
1
0
30 Sep 2024
Masked Autoregressive Model for Weather Forecasting
Doyi Kim
Minseok Seo
Hakjin Lee
Junghoon Seo
77
0
0
30 Sep 2024
SurgPETL: Parameter-Efficient Image-to-Surgical-Video Transfer Learning for Surgical Phase Recognition
Shu Yang
Zhiyuan Cai
Luyang Luo
Ning Ma
Shuchang Xu
Hao Chen
67
1
0
30 Sep 2024
Image Copy Detection for Diffusion Models
Wenhao Wang
Yifan Sun
Zhentao Tan
Yi Yang
76
1
0
30 Sep 2024
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
Wenchao Chen
Liqiang Niu
Ziyao Lu
Fandong Meng
Jie Zhou
Mamba
96
4
0
30 Sep 2024
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels
Heeseong Shin
Chaehyun Kim
Sunghwan Hong
Seokju Cho
Anurag Arnab
Paul Hongsuck Seo
Seungryong Kim
VLM
82
1
0
30 Sep 2024
Annotation-Free Curb Detection Leveraging Altitude Difference Image
Fulong Ma
Peng Hou
Yuxuan Liu
Yang Liu
Ming Liu
Jun Ma
53
0
0
30 Sep 2024
Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies
Ruiyu Wang
Zheyu Zhuang
Shutong Jin
Nils Ingelhag
Danica Kragic
Florian T. Pokorny
97
0
0
30 Sep 2024
Vision-Language Models are Strong Noisy Label Detectors
Tong Wei
Haoyang Li
Chun-Shu Li
Jiang-Xin Shi
Yu-Feng Li
Min-Ling Zhang
VLM
76
9
0
29 Sep 2024
Text-driven Human Motion Generation with Motion Masked Diffusion Model
Xingyu Chen
DiffM
VGen
57
2
0
29 Sep 2024
Self-supervised Auxiliary Learning for Texture and Model-based Hybrid Robust and Fair Featuring in Face Analysis
Shukesh Reddy
Nishit Poddar
Srijan Das
Abhijit Das
CVBM
74
0
0
29 Sep 2024
BiPC: Bidirectional Probability Calibration for Unsupervised Domain Adaption
Wenlve Zhou
Zhiheng Zhou
Junyuan Shang
Chang Niu
Mingyue Zhang
Xiyuan Tao
Tianlei Wang
78
0
0
29 Sep 2024
Contrastive ground-level image and remote sensing pre-training improves representation learning for natural world imagery
Andy V. Huynh
Lauren E. Gillespie
Jael Lopez-Saucedo
Claire Tang
Rohan Sikand
Moisés Expósito-Alonso
SSL
120
5
0
28 Sep 2024
Fast Encoding and Decoding for Implicit Video Representation
Hao Chen
Saining Xie
Ser-Nam Lim
Abhinav Shrivastava
83
1
0
28 Sep 2024
Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking
Zijian Dong
Ruilin Li
Yilei Wu
Thuan Tinh Nguyen
J. Chong
Fang Ji
Nathanael Ren Jie Tong
Christopher Li Hsian Chen
Juan Helen Zhou
60
9
0
28 Sep 2024