2104.14294
Cited By
v1
v2 (latest)
Emerging Properties in Self-Supervised Vision Transformers
29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
Papers citing
"Emerging Properties in Self-Supervised Vision Transformers"
50 / 4,175 papers shown
Title
ForesightNav: Learning Scene Imagination for Efficient Exploration
Hardik Shah
Jiaxu Xing
Nico Messikommer
Boyang Sun
Marc Pollefeys
Davide Scaramuzza
224
1
0
22 Apr 2025
Pose Optimization for Autonomous Driving Datasets using Neural Rendering Models
Quentin Herau
Nathan Piasco
Moussâb Bennehar
Luis Rolado
D. Tsishkou
Bingbing Liu
Cyrille Migniot
Pascal Vasseur
C. Demonceaux
71
0
0
22 Apr 2025
CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss
D. Pitawela
Gustavo Carneiro
Hsiang-Ting Chen
88
0
0
22 Apr 2025
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
94
0
0
22 Apr 2025
DINOv2-powered Few-Shot Semantic Segmentation: A Unified Framework via Cross-Model Distillation and 4D Correlation Mining
Wei Zhuo
Zhiyue Tang
Wufeng Xue
Hao Ding
Linlin Shen
111
0
0
22 Apr 2025
"I Know It When I See It": Mood Spaces for Connecting and Expressing Visual Concepts
Huzheng Yang
Katherine Xu
Michael D. Grossberg
Yutong Bai
Jianbo Shi
78
0
0
21 Apr 2025
Automated Measurement of Eczema Severity with Self-Supervised Learning
Neelesh Kumar
Oya Aran
66
0
0
21 Apr 2025
HyperFlow: Gradient-Free Emulation of Few-Shot Fine-Tuning
Donggyun Kim
Chanwoo Kim
Seunghoon Hong
61
0
0
21 Apr 2025
Exploring Modality Guidance to Enhance VFM-based Feature Fusion for UDA in 3D Semantic Segmentation
Johannes Spoecklberger
W. Lin
Pedro Hermosilla
Sivan Doveh
Horst Possegger
M. Jehanzeb Mirza
84
0
0
19 Apr 2025
Exploring Generalizable Pre-training for Real-world Change Detection via Geometric Estimation
Yitao Zhao
Sen Lei
Nanqing Liu
Heng-Chao Li
Turgay Celik
Qing Zhu
72
0
0
19 Apr 2025
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
Yang Yue
Yulin Wang
Chenxin Tao
Pan Liu
Shiji Song
Gao Huang
MedIm
75
0
0
18 Apr 2025
Image Editing with Diffusion Models: A Survey
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Xiaoming Wei
Enhua Wu
144
1
0
17 Apr 2025
Personalized Text-to-Image Generation with Auto-Regressive Models
Kaiyue Sun
Xian Liu
Yao Teng
Xihui Liu
81
1
0
17 Apr 2025
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
Yang Yue
Yulin Wang
Haojun Jiang
Pan Liu
S. Song
Gao Huang
VGen
114
0
0
17 Apr 2025
Self-Supervised Pre-training with Combined Datasets for 3D Perception in Autonomous Driving
Shumin Wang
Zhuoran Yang
Liwen Wang
ZhiPeng Tang
Heng Li
Lehan Pan
Sha Zhang
Jie Peng
Jianmin Ji
Y. Zhang
3DPC
100
0
0
17 Apr 2025
GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision
Zihui Zhang
Yafei Yang
Hongtao Wen
Bo Yang
3DPC
91
0
0
16 Apr 2025
CAGS: Open-Vocabulary 3D Scene Understanding with Context-Aware Gaussian Splatting
Wei Sun
Yanzhao Zhou
Jianbin Jiao
Yuan Li
3DGS
99
1
0
16 Apr 2025
Search is All You Need for Few-shot Anomaly Detection
Qishan Wang
Jia Guo
Shuyong Gao
Hongru Wang
Li Xiong
J. Hu
Hanqi Guo
Wenqiang Zhang
160
0
0
16 Apr 2025
Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections
Alireza Salehi
Mohammadreza Salehi
Reshad Hosseini
Cees G. M. Snoek
Makoto Yamada
Mohammad Sabokrou
VLM
86
0
0
15 Apr 2025
AFiRe: Anatomy-Driven Self-Supervised Learning for Fine-Grained Representation in Radiographic Images
Yihang Liu
Lianghua He
Y. Wen
Longzhen Yang
Hongzhou Chen
MedIm
135
0
0
15 Apr 2025
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
Yanrui Bin
Wenbo Hu
Haoyuan Wang
Xinya Chen
Bing Wang
DiffM
92
1
0
15 Apr 2025
ViMo: A Generative Visual GUI World Model for App Agents
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Haifeng Zhang
Jun Wang
Kun Shao
LM&Ro
VGen
181
1
0
15 Apr 2025
GFT: Gradient Focal Transformer
Boris Kriuk
Simranjit Kaur Gill
Shoaib Aslam
Amir Fakhrutdinov
94
0
0
14 Apr 2025
An Image is Worth K Topics: A Visual Structural Topic Model with Pretrained Image Embeddings
Matías Piqueras
Alexandra Segerberg
Matteo Magnani
Måns Magnusson
Nataša Sladoje
101
0
0
14 Apr 2025
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
Jiansheng Li
Xingxuan Zhang
Hao Zou
Yige Guo
Renzhe Xu
Yilong Liu
Chuzhao Zhu
Yue He
Peng Cui
VLM
93
0
0
14 Apr 2025
Pay Attention to What and Where? Interpretable Feature Extractor in Vision-based Deep Reinforcement Learning
Tien Pham
Angelo Cangelosi
67
1
0
14 Apr 2025
FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation
Yasser Benigmim
Mohammad Fahes
Tuan-Hung Vu
Andrei Bursuc
Raoul de Charette
VLM
143
0
0
14 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
136
0
0
14 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
72
0
0
14 Apr 2025
Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs
Pengkun Jiao
Bin Zhu
Jingjing Chen
Chong-Wah Ngo
Yu Jiang
91
1
0
13 Apr 2025
Causal integration of chemical structures improves representations of microscopy images for morphological profiling
Yemin Yu
Neil A. Tenenholtz
Lester W. Mackey
Ying Wei
David Alvarez-Melis
Ava P. Amini
Alex X. Lu
75
1
0
13 Apr 2025
Flux Already Knows -- Activating Subject-Driven Image Generation without Training
Hao Kang
Stathi Fotiadis
Liming Jiang
Qing Yan
Yumin Jia
Zichuan Liu
Min Jin Chong
Xin Lu
75
1
0
12 Apr 2025
Evolved Hierarchical Masking for Self-Supervised Learning
Zhanzhou Feng
Shiliang Zhang
141
0
0
12 Apr 2025
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
Cheng-Yu Hsieh
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Chun-Liang Li
Ranjay Krishna
Oncel Tuzel
Hadi Pouransari
VLM
472
0
0
11 Apr 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Tianwei Xiong
Jun Hao Liew
Zilong Huang
Jiashi Feng
Xihui Liu
89
1
0
11 Apr 2025
Enhancing knowledge retention for continual learning with domain-specific adapters and features gating
Mohamed Abbas Hedjazi
O. Hadjerci
Adel Hafiane
CLL
47
0
0
11 Apr 2025
Boosting multi-demographic federated learning for chest radiograph analysis using general-purpose self-supervised representations
Mahshad Lotfinia
Arash Tayebiarasteh
Samaneh Samiei
Mehdi Joodaki
Soroosh Tayebi Arasteh
74
0
0
11 Apr 2025
Steering CLIP's vision transformer with sparse autoencoders
Sonia Joseph
Praneet Suresh
Ethan Goldfarb
Lorenz Hufe
Yossi Gandelsman
Robert Graham
Danilo Bzdok
Wojciech Samek
Blake A. Richards
108
4
0
11 Apr 2025
Hypergraph Vision Transformers: Images are More than Nodes, More than Edges
Joshua Fixelle
ViT
84
2
0
11 Apr 2025
Revisiting Likelihood-Based Out-of-Distribution Detection by Modeling Representations
Yifan Ding
Arturas Aleksandrauskas
Amirhossein Ahmadian
Jonas Unger
Fredrik Lindsten
Gabriel Eilertsen
OODD
91
1
0
10 Apr 2025
MARS: a Multimodal Alignment and Ranking System for Few-Shot Segmentation
Nico Catalano
Stefano Samele
Paolo Pertino
Matteo Matteucci
3DPC
94
0
0
10 Apr 2025
GenEAva: Generating Cartoon Avatars with Fine-Grained Facial Expressions from Realistic Diffusion-based Faces
Hao Yu
Rupayan Mallick
Margrit Betke
Sarah Adel Bargal
DiffM
91
0
0
10 Apr 2025
WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer
Huilin Yin
Pengyu Wang
Senmao Li
Jun Yan
Daniel Watzenig
143
0
0
10 Apr 2025
Impact of Language Guidance: A Reproducibility Study
Cherish Puniani
Advika Sinha
Shree Singhi
Aayan Yadav
VLM
206
0
0
10 Apr 2025
Benchmarking Image Embeddings for E-Commerce: Evaluating Off-the Shelf Foundation Models, Fine-Tuning Strategies and Practical Trade-offs
Urszula Czerwinska
Cenk Bircanoglu
Jeremy Chamoux
66
0
0
10 Apr 2025
Self-Bootstrapping for Versatile Test-Time Adaptation
Shuaicheng Niu
Guohao Chen
P. Zhao
Tianyi Wang
Pengcheng Wu
Zhiqi Shen
ViT
TTA
133
0
0
10 Apr 2025
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Dibyadip Chatterjee
Edoardo Remelli
Yale Song
Bugra Tekin
Abhay Mittal
...
Shreyas Hampali
Eric Sauser
Shugao Ma
Angela Yao
Fadime Sener
VLM
99
0
0
10 Apr 2025
MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
Ylli Sadikaj
Hongkuan Zhou
Lavdim Halilaj
Stefan Schmid
Steffen Staab
Claudia Plant
77
0
0
09 Apr 2025
EDIT: Enhancing Vision Transformers by Mitigating Attention Sink through an Encoder-Decoder Architecture
Wenfeng Feng
Guoying Sun
83
0
0
09 Apr 2025
Perception in Reflection
Yana Wei
Liang Zhao
Kangheng Lin
En Yu
Yuang Peng
...
Jianjian Sun
Haoran Wei
Zheng Ge
Xiangyu Zhang
Vishal M. Patel
129
1
0
09 Apr 2025