2104.14294
Cited By
v1
v2 (latest)
Emerging Properties in Self-Supervised Vision Transformers
29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
Papers citing
"Emerging Properties in Self-Supervised Vision Transformers"
50 / 4,175 papers shown
Title
Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability
Yarden Bakish
Itamar Zimerman
Hila Chefer
Lior Wolf
17
0
0
02 Jun 2025
unMORE: Unsupervised Multi-Object Segmentation via Center-Boundary Reasoning
Yafei Yang
Zihui Zhang
Bo Yang
OCL
75
0
0
02 Jun 2025
MoCA: Multi-modal Cross-masked Autoencoder for Digital Health Measurements
Howon Ryu
Y. Chen
Yacun Wang
Andrea Z. LaCroix
Chongzhi Di
L. Natarajan
Yu Wang
Jingjing Zou
19
0
0
02 Jun 2025
Self-supervised ControlNet with Spatio-Temporal Mamba for Real-world Video Super-resolution
Shijun Shi
Jing Xu
Lijing Lu
Zhihang Li
Kai Hu
37
0
0
01 Jun 2025
ECP-Mamba: An Efficient Multi-scale Self-supervised Contrastive Learning Method with State Space Model for PolSAR Image Classification
Zuzheng Kuang
Haixia Bi
Chen Xu
Jian Sun
Mamba
57
0
0
01 Jun 2025
PCoreSet: Effective Active Learning through Knowledge Distillation from Vision-Language Models
Seongjae Kang
Dong Bok Lee
Hyungjoon Jang
Dongseop Kim
Sung Ju Hwang
VLM
40
0
0
01 Jun 2025
AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting
Yuyuan Liu
Yuanhong Chen
Chong Wang
Junlin Han
Junde Wu
Can Peng
Jingkun Chen
Yu Tian
Gustavo Carneiro
VLM
47
0
0
01 Jun 2025
Video Signature: In-generation Watermarking for Latent Video Diffusion Models
Yu Huang
Junhao Chen
Qi Zheng
Hanqian Li
Shuliang Liu
Xuming Hu
DiffM
WIGM
VGen
51
0
0
31 May 2025
SST: Self-training with Self-adaptive Thresholding for Semi-supervised Learning
Shuai Zhao
Heyan Huang
Xinge Li
Xiaokang Chen
Rui Wang
33
0
0
31 May 2025
Tackling View-Dependent Semantics in 3D Language Gaussian Splatting
Jiazhong Cen
Xudong Zhou
Jiemin Fang
Changsong Wen
Lingxi Xie
Xiaopeng Zhang
Wei Shen
Qi Tian
3DGS
36
0
0
30 May 2025
Understanding while Exploring: Semantics-driven Active Mapping
Liyan Chen
Huangying Zhan
Hairong Yin
Yi Tian Xu
Philippos Mordohai
18
0
0
30 May 2025
KairosAD: A SAM-Based Model for Industrial Anomaly Detection on Embedded Devices
Uzair Khan
Franco Fummi
Luigi Capogrosso
18
0
0
30 May 2025
Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors
Peiran Xu
Yadong Mu
65
2
0
30 May 2025
Benchmarking Foundation Models for Zero-Shot Biometric Tasks
Redwan Sony
Parisa Farmanifard
Hamzeh Alzwairy
Nitish Shukla
Arun Ross
CVBM
VLM
47
0
0
30 May 2025
Federated Unsupervised Semantic Segmentation
Evangelos Charalampakis
Vasileios Mygdalis
Ioannis Pitas
FedML
25
0
0
29 May 2025
Mobi-π: Mobilizing Your Robot Learning Policy
Jingyun Yang
Isabella Huang
Brandon Vu
Max Bajracharya
Rika Antonova
Jeannette Bohg
45
0
0
29 May 2025
Skin Lesion Phenotyping via Nested Multi-modal Contrastive Learning
Dionysis Christopoulos
Sotiris Spanos
Eirini Baltzi
Valsamis Ntouskos
Konstantinos Karantzalos
64
0
0
29 May 2025
SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images
Yu Sheng
Jiajun Deng
Xinran Zhang
Yu Zhang
Bei Hua
Yanyong Zhang
Jianmin Ji
3DGS
54
1
0
29 May 2025
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
Jianyang Gu
Samuel Stevens
Elizabeth G. Campolongo
Matthew J. Thompson
Net Zhang
...
Daniel Rubenstein
Hilmar Lapp
T. Berger-Wolf
Wei-Lun Chao
Yu-Chuan Su
VLM
54
2
0
29 May 2025
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis
H. Cao
Yutong Feng
Biao Gong
Yijing Tian
Yunhong Lu
Chuang Liu
Bin Wang
DiffM
VGen
35
1
0
29 May 2025
Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization
Cameron Gordon
Yiping Ji
Hemanth Saratchandran
Paul Albert
Simon Lucey
MQ
61
0
0
28 May 2025
AquaMonitor: A multimodal multi-view image sequence dataset for real-life aquatic invertebrate biodiversity monitoring
Mikko Impio
Philipp M. Rehsen
Tiina Laamanen
Arne J. Beermann
Florian Leese
Jenni Raitoharju
64
0
0
28 May 2025
A Survey on Training-free Open-Vocabulary Semantic Segmentation
Naomi Kombol
Ivan Martinović
Sinisa Segvic
ObjD
VLM
73
0
0
28 May 2025
Anomalies by Synthesis: Anomaly Detection using Generative Diffusion Models for Off-Road Navigation
Siddharth Ancha
Sunshine Jiang
Travis Manderson
Laura Brandt
Yilun Du
Philip R. Osteen
Nicholas Roy
264
0
0
28 May 2025
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
Eric Xing
Pranavi Kolouju
Robert Pless
Abby Stylianou
Nathan Jacobs
13
0
0
27 May 2025
Object Concepts Emerge from Motion
H. Liang
Xiaohui Wang
Zhichao Li
Y. Yang
Naiyan Wang
VOS
OCL
47
0
0
27 May 2025
Mentor3AD: Feature Reconstruction-based 3D Anomaly Detection via Multi-modality Mentor Learning
Jinbao Wang
Hanzhe Liang
C. Gao
Chenxi Hu
Jie Zhou
Yunkang Cao
Linlin Shen
Weiming Shen
72
0
0
27 May 2025
Multi-instance Learning as Downstream Task of Self-Supervised Learning-based Pre-trained Model
Koki Matsuishi
Tsuyoshi Okita
SSL
19
0
0
27 May 2025
SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Claudia Cuttano
Gabriele Trivigno
Giuseppe Averta
Carlo Masone
VLM
25
0
0
27 May 2025
Vision Transformers with Self-Distilled Registers
Yinjie Chen
Zipeng Yan
Chong Zhou
Bo Dai
Andrew F. Luo
54
0
0
27 May 2025
A Contrastive Learning Foundation Model Based on Perfectly Aligned Sample Pairs for Remote Sensing Images
Hengtong Shen
Haiyan Gu
Haitao Li
Yi Yang
Agen Qiu
SSL
168
0
0
26 May 2025
Exploring the Possibility of TypiClust for Low-Budget Federated Active Learning
Yuta Ono
Hiroshi Nakamura
Hideki Takase
29
0
0
26 May 2025
Regularized Personalization of Text-to-Image Diffusion Models without Distributional Drift
Gihoon Kim
Hyungjin Park
Taesup Kim
DiffM
VLM
190
0
0
26 May 2025
DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech
Deok-Hyeon Cho
Hyung-Seok Oh
Seung-Bin Kim
Seong-Whan Lee
48
0
0
26 May 2025
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi
Davide Bucciarelli
Federico Betti
Marcella Cornia
Lorenzo Baraldi
N. Sebe
Rita Cucchiara
225
0
0
26 May 2025
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation
Dong Liu
Jiayi Zhang
Yifan Li
Yanxuan Yu
Ben Lengerich
Ying Nian Wu
69
1
0
26 May 2025
The Missing Point in Vision Transformers for Universal Image Segmentation
Sajjad Shahabodini
Mobina Mansoori
Farnoush Bayatmakou
J. Abouei
Konstantinos N. Plataniotis
Arash Mohammadi
ViT
ISeg
31
0
0
26 May 2025
CDPDNet: Integrating Text Guidance with Hybrid Vision Encoders for Medical Image Segmentation
Jiong Wu
Yang Xing
Boxiao Yu
Wei Shao
Kuang Gong
MedIm
186
0
0
25 May 2025
AmorLIP: Efficient Language-Image Pretraining via Amortization
Haotian Sun
Yitong Li
Yuchen Zhuang
Niao He
Hanjun Dai
Bo Dai
VLM
82
0
0
25 May 2025
Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning
Xinyao Liao
Wei Wei
Xiaoye Qu
Yu Cheng
EGVM
62
0
0
25 May 2025
Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation
Daniel Csizmadia
Andrei Codreanu
Victor Sim
Vighnesh Prabhu
Michael Lu
Kevin Zhu
Sean O'Brien
Vasu Sharma
CLIP
VLM
71
0
0
25 May 2025
Test-Time Scaling of Diffusion Models via Noise Trajectory Search
Vignav Ramesh
Morteza Mardani
DiffM
27
0
0
24 May 2025
Grounding Bodily Awareness in Visual Representations for Efficient Policy Learning
Junlin Wang
Zhiyun Lin
1.3K
0
0
24 May 2025
Mitigating Context Bias in Domain Adaptation for Object Detection using Mask Pooling
Hojun Son
Asma Almutairi
Arpan Kusari
127
0
0
24 May 2025
C3R: Channel Conditioned Cell Representations for unified evaluation in microscopy imaging
Umar Marikkar
Syed Sameed Husain
Muhammad Awais
Sara Atito
39
0
0
24 May 2025
From Flight to Insight: Semantic 3D Reconstruction for Aerial Inspection via Gaussian Splatting and Language-Guided Segmentation
Mahmoud Chick Zaouali
Todd Charter
Homayoun Najjaran
3DGS
28
0
0
23 May 2025
SpikeGen: Generative Framework for Visual Spike Stream Processing
Gaole Dai
Menghang Dong
Rongyu Zhang
Ruichuan An
Shanghang Zhang
Tiejun Huang
DiffM
3DGS
44
0
0
23 May 2025
REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders
Savya Khosla
Sethuraman TV
Barnett Lee
Alexander Schwing
Derek Hoiem
VGen
167
0
0
23 May 2025
Learning Shared Representations from Unpaired Data
Amitai Yacobi
Nir Ben-Ari
Ronen Talmon
Uri Shaham
SSL
80
0
0
23 May 2025
Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning
Nicolas Castanet
Olivier Sigaud
Sylvain Lamprier
OffRL
108
0
0
23 May 2025