2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 826 papers shown
Title
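Detecção da Psoríase Utilizando Visão Computacional: Uma Abordagem Comparativa Entre CNNs e Vision Transformers (Psoriasis Detection Using Computer Vision: A Comparative Approach Between CNNs and Vision Transformers)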
Natanael Lucena
Fábio S. da Silva
Ricardo Rios
ViT
MedIm
60
0
0
11 Jun 2025
AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models
Zheda Mai
A. Chowdhury
Zihe Wang
Sooyoung Jeon
Lemeng Wang
Jiacheng Hou
Jihyung Kil
Wei-Lun Chao
CoGe
50
0
0
10 Jun 2025
TGRPO: Fine-tuning Vision-Language-Action Model via Trajectory-wise Group Relative Policy Optimization
Zengjue Chen
Runliang Niu
He Kong
Qi Wang
50
0
0
10 Jun 2025
Adapting Vision-Language Foundation Model for Next Generation Medical Ultrasound Image Analysis
Jingguo Qu
Xinyang Han
Tonghuan Xiao
Jia Ai
Juan Wu
...
Jing Qin
Ann Dorothy King
Winnie Chiu-Wing Chu
J. Cai
Michael Tin-Cheung Ying
MedIm
49
0
0
10 Jun 2025
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
Yihe Tang
Wenlong Huang
Yingke Wang
Chengshu Li
Roy Yuan
Ruohan Zhang
Jiajun Wu
Li Fei-Fei
41
0
0
10 Jun 2025
Diffuse and Disperse: Image Generation with Representation Regularization
Runqian Wang
Kaiming He
DiffM
41
0
0
10 Jun 2025
Effective Data Pruning through Score Extrapolation
Sebastian Schmidt
Prasanga Dhungel
Christoffer Löffler
Bjorn Nieth
Stephan Günnemann
Leo Schwinn
SyDa
31
0
0
10 Jun 2025
Foundation Models in Medical Imaging -- A Review and Outlook
Vivien van Veldhuizen
Vanessa Botha
C. Lu
Melis Erdal Cesur
Kevin Groot Lipman
...
Cees Snoek
Lodewyk Wessels
Ritse Mann
Eric Marcus
Jonas Teuwen
MedIm
VLM
AI4CE
60
0
0
10 Jun 2025
Time Series Representations for Classification Lie Hidden in Pretrained Vision Transformers
Simon Roschmann
Quentin Bouniot
Vasilii Feofanov
I. Redko
Zeynep Akata
AI4TS
34
0
0
10 Jun 2025
Inherently Faithful Attention Maps for Vision Transformers
Ananthu Aniraj
C. Dantas
Dino Ienco
Diego Marcos
OOD
OCL
23
0
0
10 Jun 2025
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams
Zike Wu
Qi Yan
Xuanyu Yi
Lele Wang
Renjie Liao
3DGS
21
0
0
10 Jun 2025
Fusing Cross-modal and Uni-modal Representations: A Kronecker Product Approach
Youqi Wu
Jingwei Zhang
Farzan Farnia
23
0
0
10 Jun 2025
Bias Analysis in Unconditional Image Generative Models
Xiaofeng Zhang
Michelle Lin
Simon Lacoste-Julien
Aaron Courville
Yash Goyal
23
0
0
10 Jun 2025
Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation
H. Kim
Donghyun Kim
Suhyun Kim
DiffM
29
1
0
09 Jun 2025
Hierarchical Scoring with 3D Gaussian Splatting for Instance Image-Goal Navigation
Yijie Deng
Shuaihang Yuan
Geeta Chandra Raju Bethala
Anthony Tzes
Yu-Shen Liu
Yi Fang
3DGS
13
0
0
09 Jun 2025
Vision Transformers Don't Need Trained Registers
Nick Jiang
Amil Dravid
Alexei A. Efros
Yossi Gandelsman
24
0
0
09 Jun 2025
OpenSplat3D: Open-Vocabulary 3D Instance Segmentation using Gaussian Splatting
Jens Piekenbrinck
Christian Schmidt
Alexander Hermans
Narunas Vaskevicius
Timm Linder
Bastian Leibe
3DGS
VLM
19
0
0
09 Jun 2025
CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems
Aniket Rege
Zinnia Nie
Mahesh Ramesh
Unmesh Raskar
Zhuoran Yu
Aditya Kusupati
Yong Jae Lee
Ramya Korlakai Vinayak
20
0
0
09 Jun 2025
DINO-CoDT: Multi-class Collaborative Detection and Tracking with Vision Foundation Models
Xunjie He
Christina Dao Wen Lee
Meiling Wang
Chengran Yuan
Zefan Huang
Yufeng Yue
Marcelo H. Ang Jr
23
0
0
09 Jun 2025
Compressed Feature Quality Assessment: Dataset and Baselines
Changsheng Gao
Wei Zhou
Guosheng Lin
Weisi Lin
18
0
0
09 Jun 2025
CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray
Mingquan Lin
G. Holste
Song Wang
Yiliang Zhou
Yishu Wei
...
Hao Chen
Adam Flanders
George Shih
Zhangyang Wang
Yifan Peng
LM&MA
23
0
0
09 Jun 2025
4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
Zhen Xu
Zhengqin Li
Zhao Dong
Xiaowei Zhou
Richard Newcombe
Zhaoyang Lv
3DGS
ViT
20
0
0
09 Jun 2025
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Teng Hu
Zhentao Yu
Zhengguang Zhou
Jiangning Zhang
Yuan Zhou
Qinglin Lu
Ran Yi
VGen
15
0
0
09 Jun 2025
GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra
Mateusz Michalkiewicz
Anekha Sokhal
Tadeusz Michalkiewicz
Piotr Pawlikowski
Mahsa Baktashmotlagh
Varun Jampani
Guha Balakrishnan
15
0
0
09 Jun 2025
Image Reconstruction as a Tool for Feature Analysis
Eduard Allakhverdov
Dmitrii Tarasov
Elizaveta Goncharova
Andrey Kuznetsov
12
0
0
09 Jun 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li
Yutong Chen
Yiqian Wu
Kaifeng Zhao
Marc Pollefeys
Siyu Tang
EgoV
VLM
38
0
0
09 Jun 2025
LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds
Zihui Zhang
Weisheng Dai
Hongtao Wen
Bo Yang
3DPC
26
0
0
09 Jun 2025
GoTrack: Generic 6DoF Object Pose Refinement and Tracking
Van Nguyen Nguyen
Christian Forster
Sindi Shkodrani
Vincent Lepetit
Bugra Tekin
Cem Keskin
Tomás Hodan
VOT
28
0
0
08 Jun 2025
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Pengfei Zhao
Rongbo Luan
Wei Zhang
Peng Wu
Sifeng He
25
0
0
08 Jun 2025
TV-LiVE: Training-Free, Text-Guided Video Editing via Layer Informed Vitality Exploitation
M. Kim
Dongjin Kim
Seokju Yun
Jaegul Choo
DiffM
VGen
21
0
0
08 Jun 2025
Technical Report for ICRA 2025 GOOSE 3D Semantic Segmentation Challenge: Adaptive Point Cloud Understanding for Heterogeneous Robotic Systems
Xiaoya Zhang
10
0
0
08 Jun 2025
Experimental Evaluation of Static Image Sub-Region-Based Search Models Using CLIP
Bastian Jäckl
Vojtěch Kloda
Daniel A. Keim
Jakub Lokoč
10
1
0
07 Jun 2025
NeSyPack: A Neuro-Symbolic Framework for Bimanual Logistics Packing
Bowei Li
Peiqi Yu
Zhenran Tang
Han Zhou
Yifan Sun
Ruixuan Liu
Changliu Liu
19
0
0
06 Jun 2025
Rethinking Semi-supervised Segmentation Beyond Accuracy: Reliability and Robustness
S. Landgraf
Markus Hillemann
Markus Ulrich
UQCV
62
0
0
06 Jun 2025
Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning
Yuheng Lei
Sitong Mao
Shunbo Zhou
Hongyuan Zhang
Xuelong Li
Ping Luo
CLL
39
0
0
06 Jun 2025
O-MaMa @ EgoExo4D Correspondence Challenge: Learning Object Mask Matching between Egocentric and Exocentric Views
Lorenzo Mur-Labadia
Maria Santos-Villafranca
Alejandro Pérez-Yus
J. Bermudez-Cameo
Ruben Martinez-Cantin
Jose J. Guerrero
VLM
45
0
0
06 Jun 2025
TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation
M. S. Danish
Muhammad Akhtar Munir
Syed Aziz Shah
M. H. Khan
Rao Muhammad Anwer
Jorma T. Laaksonen
Fahad Shahbaz Khan
Salman Khan
61
0
0
06 Jun 2025
Deep Learning Reforms Image Matching: A Survey and Outlook
Shihua Zhang
Zizhuo Li
Kaining Zhang
Yifan Lu
Yuxin Deng
Linfeng Tang
Xingyu Jiang
Jiayi Ma
3DV
108
0
0
05 Jun 2025
SAM-aware Test-time Adaptation for Universal Medical Image Segmentation
Jianghao Wu
Yicheng Wu
Yutong Xie
Wenjia Bai
You Zhang
Feilong Tang
Yulong Li
Yasmeen George
Imran Razzak
MedIm
154
0
0
05 Jun 2025
Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets
Mikhail Kennerley
Angelica E. Avilés-Rivero
Carola-Bibiane Schonlieb
R. Tan
109
0
0
05 Jun 2025
Contrastive Flow Matching
George Stoica
Vivek Ramanujan
Xiang Fan
Ali Farhadi
Ranjay Krishna
Judy Hoffman
82
0
0
05 Jun 2025
Interpretable Few-Shot Image Classification via Prototypical Concept-Guided Mixture of LoRA Experts
Zhong Ji
Rongshuai Wei
Jingren Liu
Yanwei Pang
Jungong Han
95
0
0
05 Jun 2025
Refer to Anything with Vision-Language Prompts
Shengcao Cao
Zijun Wei
Jason Kuen
Kangning Liu
Lingzhi Zhang
Jiuxiang Gu
HyunJoon Jung
Liang-Yan Gui
Yu Wang
VLM
117
0
0
05 Jun 2025
Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations
Gaia Di Lorenzo
F. Tombari
Marc Pollefeys
Daniel Barath
3DPC
101
0
0
05 Jun 2025
Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System
Haokun Liu
Zhaoqi Ma
Yunong Li
Junichiro Sugihara
Yicheng Chen
Jinjie Li
Moju Zhao
137
0
0
05 Jun 2025
Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit
Valérie Costa
Thomas Fel
Ekdeep Singh Lubana
Bahareh Tolooshams
Demba Ba
100
0
0
05 Jun 2025
GP-MoLFormer-Sim: Test Time Molecular Optimization through Contextual Similarity Guidance
Jirí Navrátil
Jarret Ross
Payel Das
Youssef Mroueh
Samuel C. Hoffman
Vijil Chenthamarakshan
Brian M. Belgodere
25
0
0
05 Jun 2025
Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels
Olaf Dünkel
Thomas Wimmer
Christian Theobalt
Christian Rupprecht
Adam Kortylewski
3DPC
100
0
0
05 Jun 2025
Towards Reliable Identification of Diffusion-based Image Manipulations
Alex Costanzino
Woody Bayliss
Juil Sock
Marc Gorriz Blanch
Danijela Horak
Ivan Laptev
Philip Torr
Fabio Pizzati
DiffM
39
0
0
05 Jun 2025
Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis
Neeraj Kumar
Swaraj Nanda
Siddharth Singi
Jamal Benhamida
David Kim
Jie-Fu Chen
Amir Momeni-Boroujeni
Gregory M. Goldgof
Gabriele Campanella
Chad M. Vanderbilt
MedIm
92
0
0
05 Jun 2025