arXiv:2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,220 papers shown
Tell me why: Visual foundation models as self-explainable classifiers
Hugues Turbé
Mina Bjelogrlic
G. Mengaldo
Christian Lovis
69
0
0
26 Feb 2025
GONet: A Generalizable Deep Learning Model for Glaucoma Detection
Or Abramovich
Hadas Pizem
Jonathan Fhima
Eran Berkowitz
Ben Gofrit
...
Meital Baskin
Jan Van Eijgen
Ingeborg Stalmans
E. Blumenthal
Joachim A. Behar
64
1
0
26 Feb 2025
What are Foundation Models Cooking in the Post-Soviet World?
Anton Lavrouk
Tarek Naous
Alan Ritter
Wei Xu
70
0
0
25 Feb 2025
Escaping The Big Data Paradigm in Self-Supervised Representation Learning
Carlos Vélez García
Miguel Cazorla
Jorge Pomares
54
0
0
25 Feb 2025
From underwater to aerial: a novel multi-scale knowledge distillation approach for coral reef monitoring
Matteo Contini
Victor Illien
Julien Barde
Sylvain Poulain
Serge Bernard
Alexis Joly
Sylvain Bonhommeau
78
0
0
25 Feb 2025
LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He
Xiaodong Gu
Xiaodan Ye
Chao Xu
Zhengyi Zhao
Yuan Dong
Weihao Yuan
Zilong Dong
Liefeng Bo
3DGS
90
0
0
25 Feb 2025
Enhancing Reusability of Learned Skills for Robot Manipulation via Gaze and Bottleneck
Ryo Takizawa
Izumi Karino
Koki Nakagawa
Yoshiyuki Ohmura
Yasuo Kuniyoshi
80
1
0
25 Feb 2025
PromptMID: Modal Invariant Descriptors Based on Diffusion and Vision Foundation Models for Optical-SAR Image Matching
Han Nie
B. Luo
Jun Liu
Z. Fu
Huan Zhou
Shuo Zhang
Weixing Liu
DiffM
VLM
79
0
0
25 Feb 2025
DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
Zhengrong Xue
Shuying Deng
Zhenyang Chen
Yixuan Wang
Zhecheng Yuan
Huazhe Xu
52
5
0
24 Feb 2025
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
78
8
0
24 Feb 2025
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
Liangtao Shi
Ting Liu
Xiantao Hu
Yue Hu
Quanjun Yin
Richang Hong
ObjD
54
0
0
24 Feb 2025
A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis
Yuli Wu
Fucheng Liu
Rüveyda Yilmaz
Henning Konermann
Peter Walter
Johannes Stegmaier
EGVM
MedIm
55
1
0
24 Feb 2025
Introducing Visual Perception Token into Multimodal Large Language Model
Runpeng Yu
Xinyin Ma
Xinchao Wang
MLLM
LRM
86
0
0
24 Feb 2025
FUNCTO: Function-Centric One-Shot Imitation Learning for Tool Manipulation
Chao Tang
Anxing Xiao
Yuhong Deng
Tianrun Hu
Wenlong Dong
Hanbo Zhang
David Hsu
Hong Zhang
73
2
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
88
1
0
24 Feb 2025
Enhancing Image Matting in Real-World Scenes with Mask-Guided Iterative Refinement
Rui Liu
39
0
0
24 Feb 2025
Unveiling Institution-Specific Bias in Pathology Foundation Models: Detriments, Causes, and Potential Solutions
Weiping Lin
Shen Liu
Runchen Zhu
Liansheng Wang
46
1
0
24 Feb 2025
Continuous Wrist Control on the Hannes Prosthesis: a Vision-based Shared Autonomy Framework
Federico Vasile
Elisa Maiettini
Giulia Pasquale
Nicoló Boccardo
Lorenzo Natale
36
0
0
24 Feb 2025
Fair Foundation Models for Medical Image Analysis: Challenges and Perspectives
Dilermando Queiroz
Anderson Carlos
André Anjos
Lilian Berton
52
0
0
24 Feb 2025
Few-shot Species Range Estimation
Christian Lange
Max Hamilton
Elijah Cole
Alexander Shepard
Samuel Heinrich
Angela Zhu
Subhransu Maji
Grant Van Horn
Oisin Mac Aodha
81
0
0
24 Feb 2025
Disentangling Visual Transformers: Patch-level Interpretability for Image Classification
Guillaume Jeanneret
Loïc Simon
F. Jurie
ViT
66
0
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
69
44
0
24 Feb 2025
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu Jiang
61
0
0
23 Feb 2025
SelaVPR++: Towards Seamless Adaptation of Foundation Models for Efficient Place Recognition
Feng Lu
Tong Jin
X. Lan
Lijun Zhang
Yunpeng Liu
Yaowei Wang
Chun Yuan
44
0
0
23 Feb 2025
Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control
Jinbo Yan
Alan Zhao
Yixin Hu
3DGS
261
0
0
23 Feb 2025
Understanding the Emergence of Multimodal Representation Alignment
Megan Tjandrasuwita
Chanakya Ekbote
Liu Ziyin
Paul Pu Liang
52
1
0
22 Feb 2025
Textured 3D Regenerative Morphing with 3D Diffusion Prior
Songlin Yang
Yushi Lan
Honghua Chen
Xingang Pan
DiffM
71
0
0
21 Feb 2025
Structurally Disentangled Feature Fields Distillation for 3D Understanding and Editing
Yoel Levy
David Shavin
Itai Lang
Sagie Benaim
88
0
0
21 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
130
2
0
21 Feb 2025
DynamicGSG: Dynamic 3D Gaussian Scene Graphs for Environment Adaptation
Luzhou Ge
Xiangyu Zhu
Zhuo Yang
Xuesong Li
3DGS
72
0
0
21 Feb 2025
Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning
Weitai Kang
Haifeng Huang
Yuzhang Shang
Mubarak Shah
Yan Yan
53
7
0
21 Feb 2025
Contrastive Localized Language-Image Pre-Training
Hong-You Chen
Zhengfeng Lai
Han Zhang
Xuben Wang
Marcin Eichner
Keen You
Meng Cao
Bowen Zhang
Yue Yang
Zhe Gan
CLIP
VLM
68
7
0
20 Feb 2025
Continually Learning Structured Visual Representations via Network Refinement with Rerelation
Zeki Doruk Erden
Boi Faltings
CLL
77
0
0
20 Feb 2025
UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
T. Lentsch
Holger Caesar
D. Gavrila
3DPC
97
8
0
20 Feb 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti
Roberto Bigazzi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
101
1
0
20 Feb 2025
Computational Safety for Generative AI: A Signal Processing Perspective
Pin-Yu Chen
81
1
0
18 Feb 2025
L4P: Low-Level 4D Vision Perception Unified
Abhishek Badki
Hang Su
Bowen Wen
Orazio Gallo
VLM
92
1
0
18 Feb 2025
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image
Kaixin Yao
Longwen Zhang
Xinhao Yan
Yan Zeng
Qixuan Zhang
Wei Yang
Lan Xu
Jiayuan Gu
Jingyi Yu
34
3
0
18 Feb 2025
On the Statistical Complexity of Estimating Vendi Scores from Empirical Data
Azim Ospanov
Farzan Farnia
43
1
0
17 Feb 2025
GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder
Seunghyuk Cho
Zhenyue Qin
Yang Liu
Youngbin Choi
Seungbeom Lee
Dongwoo Kim
49
0
0
17 Feb 2025
Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning
Aurian Quélennec
Pierre Chouteau
Geoffroy Peeters
S. Essid
SSL
59
0
0
17 Feb 2025
SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection
Yun Peng
Xiao Lin
Nachuan Ma
Jiayuan Du
Chuangwei Liu
Chengju Liu
Qi Chen
46
3
0
17 Feb 2025
Hyperspherical Energy Transformer with Recurrent Depth
Yunzhe Hu
Difan Zou
Dong Xu
50
0
0
17 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
69
0
0
17 Feb 2025
Object-Centric Image to Video Generation with Language Guidance
Angel Villar-Corrales
Gjergj Plepi
Sven Behnke
DiffM
VGen
OCL
80
1
0
17 Feb 2025
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering
Yanpeng Zhao
Yiwei Hao
Siyu Gao
Yunbo Wang
Xiaokang Yang
OCL
132
1
0
17 Feb 2025
Without Paired Labeled Data: An End-to-End Self-Supervised Paradigm for UAV-View Geo-Localization
Zhongwei Chen
Zhao-Xu Yang
Hai-Jun Rong
SSL
61
1
0
17 Feb 2025
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding
Kung-Hsiang Huang
Can Qin
Haoyi Qiu
Philippe Laban
Chenyu You
Caiming Xiong
C. Wu
VLM
150
3
0
17 Feb 2025
Differentially Private Prototypes for Imbalanced Transfer Learning
Dariush Wahdany
Matthew Jagielski
Adam Dziedzic
Franziska Boenisch
90
0
0
17 Feb 2025
Simplifying DINO via Coding Rate Regularization
Ziyang Wu
Jingyuan Zhang
Druv Pai
Junfeng Fang
Chandan Singh
Jianwei Yang
Jianfeng Gao
Yi Ma
244
1
0
17 Feb 2025