2111.06377
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Exploring Intrinsic Properties of Medical Images for Self-Supervised Binary Semantic Segmentation
P. Singh
Jacopo Cirrone
86
0
0
04 Feb 2024
Region-Based Representations Revisited
Michal Shlapentokh-Rothman
Ansel Blume
Yao Xiao
Yuqun Wu
TV Sethuraman
Heyi Tao
Jae Yong Lee
Wilfredo Torres
Yu-Xiong Wang
Derek Hoiem
124
12
0
04 Feb 2024
Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning
Li Ren
Chen Chen
Liqiang Wang
Kien Hua
77
5
0
04 Feb 2024
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li
Laurence T. Yang
Bocheng Ren
Xin Nie
Zhangyang Gao
Cheng Tan
Stan Z. Li
VLM
79
16
0
03 Feb 2024
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?
Hasan Hammoud
Hani Itani
Fabio Pizzati
Philip Torr
Adel Bibi
Guohao Li
CLIP
VLM
227
38
0
02 Feb 2024
Cross-view Masked Diffusion Transformers for Person Image Synthesis
T. Pham
Zhang Kang
Chang D. Yoo
108
6
0
02 Feb 2024
A Probabilistic Model behind Self-Supervised Learning
Alice Bizeul
Bernhard Schölkopf
Carl Allen
SSL
83
2
0
02 Feb 2024
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram
Yeongyeon Na
Minje Park
Yunwon Tae
S. Joo
92
31
0
02 Feb 2024
Scale Equalization for Multi-Level Feature Fusion
Bum Jun Kim
Sang Woo Kim
26
1
0
02 Feb 2024
Interpretation of Intracardiac Electrograms Through Textual Representations
William Jongwon Han
Diana Gomez
Avi Alok
Chaojing Duan
Michael A. Rosenberg
Douglas Weber
Emerson Liu
Ding Zhao
78
2
0
02 Feb 2024
Building Expressive and Tractable Probabilistic Generative Models: A Review
Sahil Sidheekh
S. Natarajan
TPM
76
5
0
01 Feb 2024
Spectrally Transformed Kernel Regression
Runtian Zhai
Rattana Pukdee
Roger Jin
Maria-Florina Balcan
Pradeep Ravikumar
BDL
74
2
0
01 Feb 2024
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang
Li Shen
Yong Luo
Nan Yin
Lefei Zhang
Dacheng Tao
MoMe
90
54
0
01 Feb 2024
Multi-scale Traffic Pattern Bank for Cross-city Few-shot Traffic Forecasting
Zhanyu Liu
Guanjie Zheng
Yanwei Yu
AI4TS
121
6
0
01 Feb 2024
Machine Unlearning for Image-to-Image Generative Models
Guihong Li
Hsiang Hsu
Chun-Fu Chen
R. Marculescu
MU
VLM
148
30
0
01 Feb 2024
Self-supervised learning of video representations from a child's perspective
A. Orhan
Wentao Wang
Alex N. Wang
Mengye Ren
Brenden M. Lake
62
4
0
01 Feb 2024
Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
Maoyuan Ye
Jing Zhang
Juhua Liu
Chenyu Liu
Baocai Yin
Cong Liu
Bo Du
Dacheng Tao
VLM
109
15
0
31 Jan 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
104
49
0
31 Jan 2024
SimAda: A Simple Unified Framework for Adapting Segment Anything Model in Underperformed Scenes
Yiran Song
Qianyu Zhou
Xuequan Lu
Zhiwen Shao
Lizhuang Ma
87
7
0
31 Jan 2024
A Medical Data-Effective Learning Benchmark for Highly Efficient Pre-training of Foundation Models
Wenxuan Yang
Weimin Tan
Yuqi Sun
Bo Yan
46
1
0
31 Jan 2024
Towards Visual Syntactical Understanding
Sayeed Shafayet Chowdhury
Soumyadeep Chandra
Kaushik Roy
NAI
155
0
0
30 Jan 2024
MouSi: Poly-Visual-Expert Vision-Language Models
Xiaoran Fan
Tao Ji
Changhao Jiang
Shuo Li
Senjie Jin
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yunchun Jiang
VLM
54
17
0
30 Jan 2024
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain
Wei Zhang
Miaoxin Cai
Tong Zhang
Zhuang Yin
Xuerui Mao
135
102
0
30 Jan 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
AI4CE
250
18
0
30 Jan 2024
OptiState: State Estimation of Legged Robots using Gated Networks with Transformer-based Vision and Kalman Filtering
Alexander Schperberg
Yusuke Tanaka
S. Mowlavi
Feng Xu
Bharathan Balaji
Dennis W. Hong
71
5
0
30 Jan 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
Florentin Wörgötter
Alexander S. Ecker
131
6
0
29 Jan 2024
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors
Shiyin Dong
Mingrui Zhu
Kun Cheng
Nannan Wang
Xinbo Gao
DiffM
45
3
0
29 Jan 2024
Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin
Xiulian Peng
Yan Lu
112
8
0
29 Jan 2024
MLEM: Generative and Contrastive Learning as Distinct Modalities for Event Sequences
Viktor Moskvoretskii
Dmitry Osin
Egor Shvetsov
Igor Udovichenko
Maxim Zhelnin
Andrey Dukhovny
Anna Zhimerikina
Evgeny Burnaev
AI4TS
102
2
0
29 Jan 2024
Importance-Aware Adaptive Dataset Distillation
Guang Li
Ren Togo
Takahiro Ogawa
Miki Haseyama
DD
116
9
0
29 Jan 2024
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
Shaofeng Zhang
Jinfa Huang
Qiang-feng Zhou
Zhibin Wang
Fan Wang
Jiebo Luo
Junchi Yan
DiffM
114
12
0
28 Jan 2024
Intriguing Equivalence Structures of the Embedding Space of Vision Transformers
Shaeke Salman
M. Shams
Xiuwen Liu
95
6
0
28 Jan 2024
GEM: Boost Simple Network for Glass Surface Segmentation via Segment Anything Model and Data Synthesis
Jing Hao
Moyun Liu
Kuo Feng Hung
DiffM
59
2
0
27 Jan 2024
SAM-based instance segmentation models for the automation of structural damage detection
Zehao Ye
Lucy Lovell
A. Faramarzi
Jelena Ninić
99
15
0
27 Jan 2024
Masked Pre-trained Model Enables Universal Zero-shot Denoiser
Xiaoxiao Ma
Zhixiang Wei
Yi Jin
Pengyang Ling
Tianle Liu
Ben Wang
Junkang Dai
H. Chen
Enhong Chen
VLM
85
0
0
26 Jan 2024
Revisiting Active Learning in the Era of Vision Foundation Models
S. Gupte
Josiah Aklilu
Jeffrey Nirschl
Serena Yeung-Levy
VLM
60
5
0
25 Jan 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
Yiyuan Zhang
Xiaohan Ding
Kaixiong Gong
Yixiao Ge
Ying Shan
Xiangyu Yue
ViT
139
7
0
25 Jan 2024
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Xinlei Chen
Zhuang Liu
Saining Xie
Kaiming He
DiffM
90
60
0
25 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
144
17
0
25 Jan 2024
Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces
Juan Hu
Xin Liao
Difei Gao
Satoshi Tsutsui
Qian Wang
Zheng Qin
Mike Zheng Shou
140
5
0
24 Jan 2024
Learning Representations for Clustering via Partial Information Discrimination and Cross-Level Interaction
Hai-Xin Zhang
Dong Huang
Hua-Bao Ling
Guang-Yu Zhang
Wei-jun Sun
Zi-hao Wen
52
0
0
24 Jan 2024
Segment Any Cell: A SAM-based Auto-prompting Fine-tuning Framework for Nuclei Segmentation
Saiyang Na
Yuzhi Guo
Feng Jiang
Hehuan Ma
Junzhou Huang
VLM
MedIm
86
16
0
24 Jan 2024
Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?
Cheng Han
Qifan Wang
Yiming Cui
Wenguan Wang
Lifu Huang
Siyuan Qi
Dongfang Liu
VLM
159
22
0
23 Jan 2024
FedRSU: Federated Learning for Scene Flow Estimation on Roadside Units
Shaoheng Fang
Rui Ye
Wenhao Wang
Zuhong Liu
Yuxiao Wang
Yafei Wang
Siheng Chen
Yanfeng Wang
112
1
0
23 Jan 2024
Correlation-Embedded Transformer Tracking: A Single-Branch Framework
Fei Xie
Wankou Yang
Chunyu Wang
Lei Chu
Yue Cao
Chao Ma
Wenjun Zeng
113
7
0
23 Jan 2024
Interpreting Equivariant Representations
Andreas Abildtrup Hansen
Anna Calissano
Aasa Feragen
161
1
0
23 Jan 2024
Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration
Yifan Zhang
Siyu Ren
Junhui Hou
Jinjian Wu
Guangming Shi
SSL
3DPC
304
3
0
23 Jan 2024
OCT-SelfNet: A Self-Supervised Framework with Multi-Modal Datasets for Generalized and Robust Retinal Disease Detection
Fatema Jannat
Sina Gholami
Minha Alam
Hamed Tabkhi
54
1
0
22 Jan 2024
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
VLM
74
0
0
22 Jan 2024
Less Could Be Better: Parameter-efficient Fine-tuning Advances Medical Vision Foundation Models
Chenyu Lian
Hong-Yu Zhou
Yizhou Yu
Liansheng Wang
MedIm
97
10
0
22 Jan 2024