2304.07193
Cited By
v1
v2 (latest)
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing "DINOv2: Learning Robust Visual Features without Supervision"
50 / 826 papers shown
Title
Anomalies by Synthesis: Anomaly Detection using Generative Diffusion Models for Off-Road Navigation
Siddharth Ancha
Sunshine Jiang
Travis Manderson
Laura Brandt
Yilun Du
Philip R. Osteen
Nicholas Roy
264
0
0
28 May 2025
CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation
Pardis Taghavi
Tian Liu
Renjie Li
Reza Langari
Zhengzhong Tu
ISeg
84
0
0
28 May 2025
Scan-and-Print: Patch-level Data Summarization and Augmentation for Content-aware Layout Generation in Poster Design
HsiaoYuan Hsu
Yuxin Peng
DiffM
19
0
0
27 May 2025
Object Concepts Emerge from Motion
H. Liang
Xiaohui Wang
Zhichao Li
Y. Yang
Naiyan Wang
VOS
OCL
47
0
0
27 May 2025
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang
Xuweiyi Chen
Matheus Gadelha
Zezhou Cheng
DiffM
VGen
74
0
0
27 May 2025
Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance
Badr Moufad
Yazid Janati
Alain Durmus
Ahmed Ghorbel
Eric Moulines
Jimmy Olsson
DiffM
74
0
0
27 May 2025
QuARI: Query Adaptive Retrieval Improvement
Eric Xing
Abby Stylianou
Robert Pless
Nathan Jacobs
VLM
24
0
0
27 May 2025
MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning
Hongjia Liu
Rongzhen Zhao
Haohan Chen
Joni Pajarinen
OCL
VLM
117
0
0
27 May 2025
DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
Ruiqi Wu
Xinjie Wang
Liu Liu
Chunle Guo
Jiaxiong Qiu
Chongyi Li
Lichao Huang
Zhizhong Su
Ming-Ming Cheng
VGen
96
1
0
26 May 2025
Exploring the Possibility of TypiClust for Low-Budget Federated Active Learning
Yuta Ono
Hiroshi Nakamura
Hideki Takase
34
0
0
26 May 2025
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi
Davide Bucciarelli
Federico Betti
Marcella Cornia
Lorenzo Baraldi
N. Sebe
Rita Cucchiara
225
0
0
26 May 2025
CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting
Lei Tian
Xiaomin Li
Liqian Ma
Hefei Huang
Zirui Zheng
Hao Yin
Taiqing Li
Huchuan Lu
Xu Jia
17
0
0
26 May 2025
ReDDiT: Rehashing Noise for Discrete Visual Generation
Tianren Ma
Xiaosong Zhang
Boyu Yang
Junlan Feng
QiXiang Ye
DiffM
120
0
0
26 May 2025
A Contrastive Learning Foundation Model Based on Perfectly Aligned Sample Pairs for Remote Sensing Images
Hengtong Shen
Haiyan Gu
Haitao Li
Yi Yang
Agen Qiu
SSL
168
0
0
26 May 2025
Absolute Coordinates Make Motion Generation Easy
Zichong Meng
Zeyu Han
Xiaogang Peng
Yiming Xie
Huaizu Jiang
202
0
0
26 May 2025
SuperAD: A Training-free Anomaly Classification and Segmentation Method for CVPR 2025 VAND 3.0 Workshop Challenge Track 1: Adapt & Detect
Huaiyuan Zhang
H. Chen
Yu Cheng
Shunyi Wu
Linghao Sun
Linao Han
Zeyu Shi
Lei Qi
49
0
0
26 May 2025
What Can RL Bring to VLA Generalization? An Empirical Study
Jijia Liu
Feng Gao
Bingwen Wei
Xinlei Chen
Qingmin Liao
Yi Wu
Chao Yu
Yu Wang
OffRL
292
0
0
26 May 2025
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
Hang Hua
Ziyun Zeng
Yizhi Song
Yunlong Tang
Liu He
Daniel G. Aliaga
Wei Xiong
Jiebo Luo
EGVM
84
0
0
26 May 2025
Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models
Rui Cai
Bangzheng Li
Xiaofei Wen
Muhao Chen
Zhe Zhao
19
0
0
26 May 2025
What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation
Jianghang Lin
Yue Hu
Jiangtao Shen
Yunhang Shen
Liujuan Cao
Shengchuan Zhang
Chia-Wen Lin
ObjD
VLM
206
0
0
26 May 2025
CDPDNet: Integrating Text Guidance with Hybrid Vision Encoders for Medical Image Segmentation
Jiong Wu
Yang Xing
Boxiao Yu
Wei Shao
Kuang Gong
MedIm
186
0
0
25 May 2025
Advancing Video Self-Supervised Learning via Image Foundation Models
Jingwei Wu
Zhewei Huang
Chang Liu
39
0
0
25 May 2025
CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design
H. Zhang
Dexiang Hong
Maoke Yang
Yutao Chen
Zhao Zhang
Jie Shao
Xinglong Wu
Zuxuan Wu
Yu Jiang
DiffM
AI4CE
168
0
0
25 May 2025
VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning
Guanxing Lu
Wenkai Guo
Chubin Zhang
Yuheng Zhou
Haonan Jiang
Zifeng Gao
Yansong Tang
Ziwei Wang
OffRL
112
0
0
24 May 2025
REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders
Savya Khosla
Sethuraman TV
Barnett Lee
Alexander Schwing
Derek Hoiem
VGen
167
0
0
23 May 2025
VIBE: Vector Index Benchmark for Embeddings
Elias Jääsaari
Ville Hyvönen
Matteo Ceccarello
Teemu Roos
Martin Aumüller
VLM
84
0
0
23 May 2025
LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision
A. Fuller
Yousef Yassin
Junfeng Wen
Daniel G. Kyrollos
Tarek Ibrahim
James R. Green
Evan Shelhamer
ViT
187
0
0
23 May 2025
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqing Ye
Chao Fan
Zhanbo Huang
Chengwen Luo
Jianqiang Li
Shiqi Yu
Xiaoming Liu
CVBM
VLM
54
0
0
23 May 2025
Learning Shared Representations from Unpaired Data
Amitai Yacobi
Nir Ben-Ari
Ronen Talmon
Uri Shaham
SSL
80
0
0
23 May 2025
SpikeGen: Generative Framework for Visual Spike Stream Processing
Gaole Dai
Menghang Dong
Rongyu Zhang
Ruichuan An
Shanghang Zhang
Tiejun Huang
DiffM
3DGS
44
0
0
23 May 2025
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
Shuang Wu
Youtian Lin
Feihu Zhang
Yifei Zeng
Yikang Yang
...
Jiachen Qian
Siyu Zhu
Xun Cao
Philip Torr
Yao Yao
3DGS
121
1
0
23 May 2025
Is Single-View Mesh Reconstruction Ready for Robotics?
Frederik Nolte
Bernhard Schölkopf
Ingmar Posner
139
0
0
23 May 2025
Self-Organizing Visual Prototypes for Non-Parametric Representation Learning
T. Silva
Hélio Pedrini
Adín Ramirez Rivera
35
0
0
23 May 2025
The Third Pillar of Causal Analysis? A Measurement Perspective on Causal Representations
Dingling Yao
Shimeng Huang
Riccardo Cadei
Kun Zhang
Francesco Locatello
CML
140
0
0
23 May 2025
Semantic Correspondence: Unified Benchmarking and a Strong Baseline
Kaiyan Zhang
Xinghui Li
Jingyi Lu
Kai Han
3DV
87
1
0
23 May 2025
Slot-MLLM: Object-Centric Visual Tokenization for Multimodal LLM
Donghwan Chi
Hyomin Kim
Yoonjin Oh
Yongjin Kim
Donghoon Lee
DaeJin Jo
Jongmin Kim
Junyeob Baek
Sungjin Ahn
Sungwoong Kim
MLLM
VLM
476
0
0
23 May 2025
Panoptic Captioning: Seeking An Equivalency Bridge for Image and Text
Kun-Yu Lin
Hongjun Wang
Weining Ren
Kai Han
291
0
0
22 May 2025
Bootstrapping your behavior: a new pretraining strategy for user behavior sequence data
Weichang Wu
Xiaolu Zhang
Jun Zhou
Yuchen Li
Wenwen Xia
20
0
0
22 May 2025
CDST: Color Disentangled Style Transfer for Universal Style Reference Customization
Shiwen Zhang
Zhuowei Chen
Lang Chen
Yanze Wu
15
0
0
22 May 2025
Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Sangwoo Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
192
0
0
21 May 2025
ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search
Hyunseok Lee
Jeonghoon Kim
Beomjun Kim
Jihoon Tack
Chansong Jo
Jaehong Lee
Cheonbok Park
Sookyo In
Jinwoo Shin
Kang Min Yoo
124
0
0
21 May 2025
Stronger ViTs With Octic Equivariance
David Nordström
Johan Edstedt
Fredrik Kahl
Georg Bökman
ViT
225
0
0
21 May 2025
OmniStyle: Filtering High Quality Style Transfer Data at Scale
Ye Wang
Ruiqi Liu
Jiang Lin
Fei Liu
Zili Yi
Yilin Wang
Rui Ma
74
0
0
20 May 2025
Place Recognition Meet Multiple Modalities: A Comprehensive Review, Current Challenges and Future Directions
Zhenyu Li
Tianyi Shang
Pengjie Xu
ZhaoJun Deng
125
0
0
20 May 2025
Generalizable Multispectral Land Cover Classification via Frequency-Aware Mixture of Low-Rank Token Experts
Xi Chen
Shen Yan
Juelin Zhu
Chen Chen
Yu Liu
Maojun Zhang
80
0
0
20 May 2025
Unlocking the Power of SAM 2 for Few-Shot Segmentation
Qianxiong Xu
Lanyun Zhu
Xuanyi Liu
Guosheng Lin
Cheng Long
Ziyue Li
Rui Zhao
VLM
72
0
0
20 May 2025
Policy Contrastive Decoding for Robotic Foundation Models
Shihan Wu
Ji Zhang
Xu Luo
Junlin Xie
Jingkuan Song
Heng Tao Shen
Lianli Gao
OffRL
266
0
0
19 May 2025
VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold
Dominic Maggio
Hyungtae Lim
Luca Carlone
129
1
0
18 May 2025
No Free Lunch in Active Learning: LLM Embedding Quality Dictates Query Strategy Success
Lukas Rauch
Moritz Wirth
Denis Huseljic
M. Herde
Bernhard Sick
Matthias Aßenmacher
14
0
0
18 May 2025
CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
Mingyu Lu
Ethan Weinberger
Chanwoo Kim
Su-In Lee
25
0
0
16 May 2025