2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,220 papers shown
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
Chunlong Xia
Xinliang Wang
Feng Lv
Xin Hao
Yifeng Shi
ViT
34
47
0
12 Mar 2024
You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval
Subhadeep Koley
A. Bhunia
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
3DV
56
11
0
12 Mar 2024
QUASAR: QUality and Aesthetics Scoring with Advanced Representations
Sergey Kastryulin
Denis Prokopenko
Artem Babenko
Dmitry V. Dylov
41
0
0
11 Mar 2024
EarthLoc: Astronaut Photography Localization by Indexing Earth from Space
Gabriele Berton
Alex Stoken
Barbara Caputo
Carlo Masone
39
3
0
11 Mar 2024
Transferring Relative Monocular Depth to Surgical Vision with Temporal Consistency
C. Budd
Tom Kamiel Magda Vercauteren
MDE
MedIm
54
4
0
11 Mar 2024
Leveraging Foundation Models for Content-Based Medical Image Retrieval in Radiology
Stefan Denner
David Zimmerer
Dimitrios Bounias
Markus Bujotzek
Shuhan Xiao
Lisa Kausch
Philipp Schader
Tobias Penzkofer
Paul F. Jäger
Klaus Maier-Hein
VLM
MedIm
34
8
0
11 Mar 2024
PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models
Qingdong He
Jinlong Peng
Zhengkai Jiang
Xiaobin Hu
Jiangning Zhang
Qiang Nie
Yabiao Wang
Chengjie Wang
3DPC
VLM
51
5
0
11 Mar 2024
Pre-Trained Model Recommendation for Downstream Fine-tuning
Jiameng Bai
Sai Wu
Mingli Song
Junbo Zhao
Gang Chen
47
0
0
11 Mar 2024
Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning
Zijun Long
Lipeng Zhuang
George Killick
R. McCreadie
Gerardo Aragon Camarasa
Paul Henderson
NoLa
36
1
0
10 Mar 2024
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu
Yichen Zhu
Xin Liu
Ning Liu
Zhiyuan Xu
Yaxin Peng
Chaomin Shen
Zhicai Ou
Feifei Feng
Jian Tang
VLM
57
20
0
10 Mar 2024
Can Generative Models Improve Self-Supervised Representation Learning?
Sana Ayromlou
Arash Afkanpour
Vahid Reza Khazaie
Fereshteh Forghani
45
3
0
09 Mar 2024
Augmentations vs Algorithms: What Works in Self-Supervised Learning
Warren Morningstar
Alex Bijamov
Chris Duvarney
Luke Friedman
Neha Kalibhat
...
Philip Mansfield
Renan A. Rojas-Gomez
Karan Singhal
Bradley Green
Sushant Prakash
SSL
38
10
0
08 Mar 2024
Part-aware Personalized Segment Anything Model for Patient-Specific Segmentation
Chenhui Zhao
Liyue Shen
VLM
50
3
0
08 Mar 2024
HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction
Zhengrui Guo
Jiabo Ma
Ying Xu
Yihui Wang
Liansheng Wang
Hao Chen
58
18
0
08 Mar 2024
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery
Xavier Bou
Gabriele Facciolo
R. G. V. Gioi
Jean-Michel Morel
T. Ehret
ObjD
49
2
0
08 Mar 2024
Spatiotemporal Predictive Pre-training for Robotic Motor Control
Jiange Yang
Bei Liu
Jianlong Fu
Bocheng Pan
Gangshan Wu
Limin Wang
53
10
0
08 Mar 2024
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
Liting Lin
Heng Fan
Zhipeng Zhang
Yaowei Wang
Yong-mei Xu
Haibin Ling
57
26
0
08 Mar 2024
Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation
Yifan Mao
Jian Liu
Xianming Liu
DiffM
MDE
42
2
0
08 Mar 2024
ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes
H. Malik
Muhammad Huzaifa
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
DiffM
47
2
0
07 Mar 2024
DNAct: Diffusion Guided Multi-Task 3D Policy Learning
Ge Yan
Yueh-hua Wu
Xiaolong Wang
VGen
42
20
0
07 Mar 2024
ComFe: An Interpretable Head for Vision Transformers
Evelyn J. Mannix
H. Bondell
Howard Bondell
VLM
ViT
39
1
0
07 Mar 2024
Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery
Wei Zhang
Miaoxin Cai
Tong Zhang
Guoqiang Lei
Zhuang Yin
Xuerui Mao
35
7
0
06 Mar 2024
DINOv2 based Self Supervised Learning For Few Shot Medical Image Segmentation
Lev Ayzenberg
Raja Giryes
H. Greenspan
26
4
0
05 Mar 2024
UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
Xuweiyi Chen
Tian Xia
Sihan Xu
VGen
DiffM
40
7
0
04 Mar 2024
Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations
Sangmin Lee
Bolin Lai
Fiona Ryan
Bikram Boote
James M. Rehg
33
8
0
04 Mar 2024
A Simple-but-effective Baseline for Training-free Class-Agnostic Counting
Yuhao Lin
Hai-Ming Xu
Lingqiao Liu
Javen Qinfeng Shi
33
1
0
03 Mar 2024
Feature Alignment: Rethinking Efficient Active Learning via Proxy in the Context of Pre-trained Models
Ziting Wen
Oscar Pizarro
Stefan B. Williams
31
0
0
02 Mar 2024
Tree-Regularized Tabular Embeddings
Xuan Li
Yunhe Wang
Boqian Li
LMTD
51
3
0
01 Mar 2024
Rethinking cluster-conditioned diffusion models
Nikolas Adaloglou
Tim Kaiser
Félix D. P. Michels
M. Kollmann
VLM
42
3
0
01 Mar 2024
Learning and Leveraging World Models in Visual Representation Learning
Q. Garrido
Mahmoud Assran
Nicolas Ballas
Adrien Bardes
Laurent Najman
Yann LeCun
SSL
49
24
0
01 Mar 2024
Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
Ruiqian Nai
Zixin Wen
Ji Li
Yuanzhi Li
Yang Gao
52
2
0
01 Mar 2024
Large Convolutional Model Tuning via Filter Subspace
Wei Chen
Zichen Miao
Qiang Qiu
62
3
0
01 Mar 2024
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance
Huakun Shen
Boyue Caroline Hu
Krzysztof Czarnecki
Lina Marsso
Marsha Chechik
48
0
0
29 Feb 2024
CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition
Feng Lu
Xiangyuan Lan
Lijun Zhang
Dongmei Jiang
Yaowei Wang
Chun Yuan
47
29
0
29 Feb 2024
A SAM-guided Two-stream Lightweight Model for Anomaly Detection
Chenghao Li
Lei Qi
Xin Geng
40
5
0
29 Feb 2024
BigGait: Learning Gait Representation You Want by Large Vision Models
Dingqiang Ye
Chao Fan
Jingzhe Ma
Xiaoming Liu
Shiqi Yu
CVBM
SLR
49
18
0
29 Feb 2024
Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation
Xinyu Yang
Hossein Rahmani
Sue Black
Bryan M. Williams
42
2
0
27 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
76
67
0
27 Feb 2024
VRP-SAM: SAM with Visual Reference Prompt
Yanpeng Sun
Jiahui Chen
Shan Zhang
Xinyu Zhang
Qiang Chen
Gang Zhang
Errui Ding
Jingdong Wang
Zechao Li
54
32
0
27 Feb 2024
Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling
David S. W. Williams
Matthew Gadd
Paul Newman
Daniele De Martini
UQCV
30
1
0
27 Feb 2024
NocPlace: Nocturnal Visual Place Recognition via Generative and Inherited Knowledge Transfer
Bingxi Liu
Yiqun Wang
Huaqi Tao
Tingjun Huang
Fulin Tang
Yihong Wu
Jinqiang Cui
Hong Zhang
47
1
0
27 Feb 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
55
39
0
26 Feb 2024
Key Design Choices in Source-Free Unsupervised Domain Adaptation: An In-depth Empirical Analysis
Andrea Maracani
Raffaello Camoriano
Elisa Maiettini
Davide Talon
Lorenzo Rosasco
Lorenzo Natale
47
1
0
25 Feb 2024
Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation
X. Lei
Min Wang
Wen-gang Zhou
Li Li
Houqiang Li
55
5
0
25 Feb 2024
General Purpose Image Encoder DINOv2 for Medical Image Registration
Xin Song
Xuanang Xu
Pingkun Yan
MedIm
48
5
0
24 Feb 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktaschel
VGen
VLM
74
149
0
23 Feb 2024
Unsupervised Domain Adaptation within Deep Foundation Latent Spaces
D. Kangin
Plamen Angelov
18
1
0
22 Feb 2024
Cameras as Rays: Pose Estimation via Ray Diffusion
Jason Y. Zhang
Amy Lin
Moneish Kumar
Tzu-Hsuan Yang
Deva Ramanan
Shubham Tulsiani
DiffM
47
55
0
22 Feb 2024
Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning
Johnathan Xie
Yoonho Lee
Annie S. Chen
Chelsea Finn
27
3
0
22 Feb 2024
Visual Hallucinations of Multi-modal Large Language Models
Wen Huang
Hongbin Liu
Minxin Guo
Neil Zhenqiang Gong
MLLM
VLM
32
24
0
22 Feb 2024