2304.07193
Cited By
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,220 papers shown
Title
DECOR: Decomposition and Projection of Text Embeddings for Text-to-Image Customization
Geonhui Jang
Jin-Hwa Kim
Yong-Hyun Park
Junho Kim
Gayoung Lee
Yonghyun Jeong
DiffM
98
0
0
12 Dec 2024
Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction
Rafayel Mkrtchyan
Edvard Ghukasyan
Khoren Petrosyan
Hrant Khachatrian
Theofanis P. Raptis
81
0
0
12 Dec 2024
Efficient and Comprehensive Feature Extraction in Large Vision-Language Model for Pathology Analysis
Shengxuming Zhang
Weihan Li
Tianhong Gao
Jiacong Hu
Haoming Luo
Xiuming Zhang
Jing Zhang
Mingli Song
Zunlei Feng
LM&MA
108
0
0
12 Dec 2024
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
99
2
0
10 Dec 2024
Is Self-Supervision Enough? Benchmarking Foundation Models Against End-to-End Training for Mitotic Figure Classification
J. Ganz
Jonas Ammeling
Emely Rosbach
Ludwig Lausser
C. Bertram
Katharina Breininger
Marc Aubreville
85
0
0
09 Dec 2024
Detecting Discrepancies Between AI-Generated and Natural Images Using Uncertainty
Jun Nie
Yonggang Zhang
Tongliang Liu
Y. Cheung
Bo Han
Xinmei Tian
UQCV
100
1
0
08 Dec 2024
Global and Dense Embeddings of Earth: Major TOM Floating in the Latent Space
Mikolaj Czerkawski
Marcin Kluczek
Jędrzej S. Bojanowski
87
1
0
07 Dec 2024
Slicing Vision Transformer for Flexible Inference
Yitian Zhang
Huseyin Coskun
Xu Ma
Huan Wang
Ke Ma
Xi Chen
Derek Hao Hu
Y. Fu
ViT
81
0
0
06 Dec 2024
ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage
D. Ivanova
Marco Aversa
Paul Henderson
John Williamson
99
0
0
05 Dec 2024
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
Jiuhai Chen
Jianwei Yang
Haiping Wu
Dianqi Li
Jianfeng Gao
Tianyi Zhou
Bin Xiao
VLM
64
5
0
05 Dec 2024
Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts
Chenyang Zhu
Bin Xiao
Lin Shi
Shoukun Xu
Xu Zheng
MoE
101
11
0
05 Dec 2024
Coordinate In and Value Out: Training Flow Transformers in Ambient Space
Yuyang Wang
Anurag Ranjan
J. Susskind
Miguel Angel Bautista
3DPC
86
0
0
05 Dec 2024
HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting
Jingyu Lin
Jiaqi Gu
Lubin Fan
Bojian Wu
Yujing Lou
Renjie Chen
Ligang Liu
Jieping Ye
3DGS
119
0
0
05 Dec 2024
DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction
Ben Kaye
Tomas Jakab
Shangzhe Wu
Christian Rupprecht
Andrea Vedaldi
3DPC
3DH
110
1
0
05 Dec 2024
Distillation of Diffusion Features for Semantic Correspondence
Frank Fundel
Johannes Schusterbauer
Vincent Tao Hu
Bjorn Ommer
DiffM
96
3
0
04 Dec 2024
DIVE: Taming DINO for Subject-Driven Video Editing
Yi Huang
Wei Xiong
He Zhang
Chaoqi Chen
Jianzhuang Liu
Mingfu Yan
Shifeng Chen
VGen
DiffM
86
0
0
04 Dec 2024
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan
Hanqin Liu
Yao Huang
Xiaoqi Wang
Caixin Kang
Hang Su
Yinpeng Dong
Xingxing Wei
VGen
105
0
0
04 Dec 2024
Beyond [cls]: Exploring the true potential of Masked Image Modeling representations
Marcin Przewięźlikowski
Randall Balestriero
Wojciech Jasiński
Marek Śmieja
Bartosz Zieliński
79
0
0
04 Dec 2024
RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations
Savya Khosla
S. Vallecorsa
Alex Schwing
Derek Hoiem
69
0
0
02 Dec 2024
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
Byung-Kwan Lee
Ryo Hachiuma
Yu-Chiang Frank Wang
Y. Ro
Yueh-Hua Wu
VLM
86
0
0
02 Dec 2024
Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning
Varun Belagali
Srikar Yellapragada
Alexandros Graikos
S. Kapse
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
98
1
0
02 Dec 2024
I Spy With My Little Eye: A Minimum Cost Multicut Investigation of Dataset Frames
Katharina Prasse
Isaac Bravo
Stefanie Walter
Margret Keuper
82
1
0
02 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
98
1
0
02 Dec 2024
GFreeDet: Exploiting Gaussian Splatting and Foundation Models for Model-free Unseen Object Detection in the BOP Challenge 2024
Xingyu Liu
Yingyue Li
Chengxi Li
Gu Wang
Chenyangguang Zhang
Ziqin Huang
Xiangyang Ji
3DGS
93
2
0
02 Dec 2024
Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data
Ivan Deandres-Tame
Ruben Tolosana
Pietro Melzi
R. Vera-Rodríguez
Minchul Kim
...
Bernardo Biesseck
Pedro Vidal
Luiz Coelho
Roger Granada
David Menotti
89
2
0
02 Dec 2024
Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
Qizhe Zhang
Aosong Cheng
Ming Lu
Zhiyong Zhuo
Minqi Wang
Jiajun Cao
Shaobo Guo
Qi She
Shanghang Zhang
VLM
105
11
0
02 Dec 2024
EDTformer: An Efficient Decoder Transformer for Visual Place Recognition
Tong Jin
Feng Lu
Shuyu Hu
Chun Yuan
Yunpeng Liu
ViT
82
0
0
01 Dec 2024
FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation
Yunpeng Bai
Qixing Huang
DiffM
99
0
0
01 Dec 2024
Rethinking Generalizability and Discriminability of Self-Supervised Learning from Evolutionary Game Theory Perspective
Jiangmeng Li
Zehua Zang
Qirui Ji
Chuxiong Sun
Jingyao Wang
Junge Zhang
Changwen Zheng
Gang Hua
Hui Xiong
SSL
76
0
0
30 Nov 2024
TAROT: Targeted Data Selection via Optimal Transport
Lan Feng
Fan Nie
Yuejiang Liu
Alexandre Alahi
OT
146
1
0
30 Nov 2024
FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation
Chang Won Lee
Selina Leveugle
Svetlana Stolpner
Chris Langley
Paul Grouchy
Jonathan Kelly
Steven Waslander
80
0
0
29 Nov 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
83
25
0
29 Nov 2024
FairDD: Fair Dataset Distillation via Synchronized Matching
Qihang Zhou
Shenhao Fang
Shibo He
Wenchao Meng
Jiming Chen
FedML
DD
98
1
0
29 Nov 2024
Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise
Yeonguk Yu
Minhwan Ko
Sungho Shin
Kangmin Kim
K. Lee
NoLa
84
1
0
29 Nov 2024
Effective Fine-Tuning of Vision-Language Models for Accurate Galaxy Morphology Analysis
Ruoqi Wang
Haitao Wang
Qiong Luo
79
1
0
29 Nov 2024
T-3DGS: Removing Transient Objects for 3D Scene Reconstruction
Vadim Pryadilshchikov
Alexander Markin
Artem Komarichev
Ruslan Rakhimov
Peter Wonka
Evgeny Burnaev
3DGS
89
1
0
29 Nov 2024
Explaining the Impact of Training on Vision Models via Activation Clustering
Ahcène Boubekki
Samuel G. Fadel
Sebastian Mair
96
0
0
29 Nov 2024
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Luca Barsellotti
Lorenzo Bianchi
Nicola Messina
F. Carrara
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Rita Cucchiara
VLM
77
2
0
28 Nov 2024
OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration
Yiming Zuo
Willow Yang
Zeyu Ma
Jia Deng
MDE
95
2
0
28 Nov 2024
Unleashing the Power of Data Synthesis in Visual Localization
Sihang Li
Siqi Tan
Bowen Chang
Jing Zhang
Chen Feng
Yiming Li
95
0
0
28 Nov 2024
Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation
Finlay G. C. Hudson
W. Smith
VOS
VLM
83
0
0
28 Nov 2024
ETSM: Automating Dissection Trajectory Suggestion and Confidence Map-Based Safety Margin Prediction for Robot-assisted Endoscopic Submucosal Dissection
Mengya Xu
Wenjin Mo
Guankun Wang
Huxin Gao
An-Chi Wang
Long Bai
Chaoyang Lyu
Xiaoxiao Yang
Zhiyu Li
Hongliang Ren
84
0
0
28 Nov 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
106
0
0
28 Nov 2024
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
Yilong Wang
Zilin Gao
Qilong Wang
Zhaofeng Chen
P. Li
Q. Hu
97
1
0
28 Nov 2024
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Yueru Jia
Jiaming Liu
Sixiang Chen
Chenyang Gu
Zihan Wang
...
Lily Lee
Pengwei Wang
Zhongyuan Wang
Renrui Zhang
Shanghang Zhang
94
11
0
27 Nov 2024
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
Yueqian Wang
Xiaojun Meng
Yufei Wang
Jianxin Liang
Jiansheng Wei
Huishuai Zhang
Dongyan Zhao
VGen
85
8
0
27 Nov 2024
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
Tianxing Chen
Yao Mu
Zhixuan Liang
Z. Chen
Shijia Peng
...
Mingkun Xu
R. Hu
Han Zhang
Xuelong Li
Ping Luo
AI4CE
110
8
0
27 Nov 2024
Flaws of ImageNet, Computer Vision's Favourite Dataset
Nikita Kisel
Illia Volkov
Katerina Hanzelkova
Klara Janouskova
Jiří Matas
VLM
91
2
0
26 Nov 2024
A Distractor-Aware Memory for Visual Object Tracking with SAM2
Jovana Videnovic
A. Lukežič
Matej Kristan
VLM
91
2
0
26 Nov 2024
Spatially Visual Perception for End-to-End Robotic Learning
Travis Davies
Jiahuan Yan
Xiang Chen
Yu Tian
Yueting Zhuang
Yiqi Huang
Luhui Hu
78
0
0
26 Nov 2024