2304.07193
Cited By
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing "DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,220 papers shown
Title
Affine steerers for structured keypoint description
Georg Bökman
Johan Edstedt
Michael Felsberg
Fredrik Kahl
LLMSV
44
2
0
26 Aug 2024
Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning
Joey Hejna
Chethan Bhateja
Yichen Jian
Karl Pertsch
Dorsa Sadigh
30
16
0
26 Aug 2024
An Embedding is Worth a Thousand Noisy Labels
Francesco Di Salvo
Sebastian Doerrich
Ines Rieger
Christian Ledig
NoLa
75
0
0
26 Aug 2024
Can Visual Foundation Models Achieve Long-term Point Tracking?
Görkay Aydemir
Weidi Xie
Fatma Guney
45
7
0
24 Aug 2024
Segment Any Mesh
George Tang
William Zhao
Logan Ford
David Benhaim
Paul Zhang
44
8
0
24 Aug 2024
FungiTastic: A multi-modal dataset and benchmark for image categorization
Lukáš Picek
Klara Janouskova
Milan Šulc
Jiří Matas
83
1
0
24 Aug 2024
A New Era in Computational Pathology: A Survey on Foundation and Vision-Language Models
Dibaloke Chanda
Milan Aryal
Nasim Yahya Soltani
Masoud Ganji
AI4CE
VLM
49
7
0
23 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
45
5
0
23 Aug 2024
State-of-the-Art Fails in the Art of Damage Detection
D. Ivanova
Marco Aversa
Paul Henderson
John Williamson
26
0
0
23 Aug 2024
WildFusion: Individual Animal Identification with Calibrated Similarity Fusion
Vojtěch Čermák
Lukáš Picek
Lukáš Adam
Lukáš Neumann
Jiří Matas
FedML
41
2
0
23 Aug 2024
Animal Identification with Independent Foreground and Background Modeling
Lukáš Picek
Lukáš Neumann
Jiří Matas
VLM
43
2
0
23 Aug 2024
Atlas Gaussians Diffusion for 3D Generation
Haitao Yang
Yuan Dong
Hanwen Jiang
Dejia Xu
Georgios Pavlakos
Qixing Huang
3DGS
81
3
0
23 Aug 2024
Building and better understanding vision-language models: insights and future directions
Hugo Laurençon
Andrés Marafioti
Victor Sanh
Léo Tronchon
VLM
53
63
0
22 Aug 2024
Sapiens: Foundation for Human Vision Models
Rawal Khirodkar
Timur M. Bagautdinov
Julieta Martinez
Su Zhaoen
Austin James
Peter Selednik
Stuart Anderson
Shunsuke Saito
VLM
52
63
0
22 Aug 2024
Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets
Wolfgang Boettcher
Lukas Hoyer
Ozan Unal
J. E. Lenssen
Bernt Schiele
36
0
0
22 Aug 2024
Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification
Sudi Murindanyi
Joyce Nakatumba-Nabende
Rahman Sanya
Rose Nakibuule
Andrew Katumba
VLM
33
0
0
22 Aug 2024
Cross-Domain Foundation Model Adaptation: Pioneering Computer Vision Models for Geophysical Data Analysis
Zhixiang Guo
Xinming Wu
Luming Liang
Hanlin Sheng
Nuo Chen
Zhengfa Bi
AI4CE
59
1
0
22 Aug 2024
VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding
Yujie Liang
Xiaobin Hu
Boyuan Jiang
Donghao Luo
Kai Wu
Wenhui Han
Taisong Jin
Chengjie Wang
DiffM
39
2
0
22 Aug 2024
SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs
Yuanyang Yin
Yaqi Zhao
Yajie Zhang
Ke Lin
Jiahao Wang
Xin Tao
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
LRM
50
6
0
21 Aug 2024
E-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
Shangkun Sun
Xiaoyu Liang
S. Fan
Wenxu Gao
Wei-Nan Gao
DiffM
63
0
0
21 Aug 2024
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Xiuwei Xu
Huangxing Chen
Linqing Zhao
Ziwei Wang
Jie Zhou
Jiwen Lu
47
15
0
21 Aug 2024
Large Point-to-Gaussian Model for Image-to-3D Generation
Longfei Lu
Huachen Gao
Tao Dai
Yaohua Zha
Zhi Hou
Junta Wu
Shu-Tao Xia
3DGS
DiffM
45
4
0
20 Aug 2024
PooDLe: Pooled and dense self-supervised learning from naturalistic videos
Alex N. Wang
Christopher Hoang
Yuwen Xiong
Yann LeCun
Mengye Ren
78
0
0
20 Aug 2024
Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
Gen Li
Nikolaos Tsagkas
Jifei Song
Ruaridh Mon-Williams
S. Vijayakumar
Kun Shao
Laura Sevilla-Lara
43
8
0
19 Aug 2024
3D-Aware Instance Segmentation and Tracking in Egocentric Videos
Yash Bhalgat
Vadim Tschernezki
Iro Laina
João F. Henriques
Andrea Vedaldi
Andrew Zisserman
VOS
44
1
0
19 Aug 2024
Zero-Shot Object-Centric Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Anirudh Goyal
Mike Mozer
Yoshua Bengio
Georg Martius
Maximilian Seitzer
VLM
OCL
42
4
0
17 Aug 2024
Are CLIP features all you need for Universal Synthetic Image Origin Attribution?
Dario Cioni
Christos Tzelepis
Lorenzo Seidenari
Ioannis Patras
48
2
0
17 Aug 2024
Segment Anything with Multiple Modalities
Aoran Xiao
Weihao Xuan
Heli Qi
Yun Xing
Naoto Yokoya
Shijian Lu
VLM
38
7
0
17 Aug 2024
Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models
Lin Zhao
Xiao Chen
Eric Z. Chen
Yikang Liu
Terrence Chen
Shanhui Sun
VLM
57
5
0
16 Aug 2024
SpectralEarth: Training Hyperspectral Foundation Models at Scale
Nassim Ait Ali Braham
C. Albrecht
Julien Mairal
J. Chanussot
Yi Wang
X. Zhu
43
13
0
15 Aug 2024
Towards flexible perception with visual memory
Robert Geirhos
P. Jaini
Austin Stone
Sourabh Medapati
Xi Yi
G. Toderici
Abhijit Ogale
Jonathon Shlens
44
1
0
15 Aug 2024
Navigating Data Scarcity using Foundation Models: A Benchmark of Few-Shot and Zero-Shot Learning Approaches in Medical Imaging
S. Woerner
Christian F. Baumgartner
VLM
MedIm
35
0
0
15 Aug 2024
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
Chenjie Cao
Chaohui Yu
Yanwei Fu
Fan Wang
Xiangyang Xue
VGen
55
7
0
15 Aug 2024
General-purpose Clothes Manipulation with Semantic Keypoints
Yuhong Deng
David Hsu
64
2
0
15 Aug 2024
Connecting Dreams with Visual Brainstorming Instruction
Yasheng Sun
Bohan Li
Mingchen Zhuge
Deng-Ping Fan
Salman Khan
Fahad Shahbaz Khan
Hideki Koike
DiffM
44
0
0
14 Aug 2024
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields
Yu Liu
Baoxiong Jia
Yixin Chen
Siyuan Huang
OCL
50
4
0
13 Aug 2024
Towards Robust Monocular Depth Estimation in Non-Lambertian Surfaces
Junrui Zhang
Jiaqi Li
Yachuan Huang
Yiran Wang
Jinghong Zheng
Liao Shen
Z. Cao
MDE
39
3
0
12 Aug 2024
BooW-VTON: Boosting In-the-Wild Virtual Try-On via Mask-Free Pseudo Data Training
Xuanpu Zhang
Dan Song
Pengxin Zhan
Qingguo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
Anan Liu
DiffM
43
4
0
12 Aug 2024
BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation
Hee Suk Yoon
Eunseop Yoon
Joshua Tian Jin Tee
Kang Zhang
Yu-Jung Heo
Du-Seong Chang
Chang D. Yoo
41
3
0
12 Aug 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Yifan Pu
Zhuofan Xia
Jiayi Guo
Dongchen Han
Qixiu Li
...
Ji Li
Yizeng Han
Shiji Song
Gao Huang
Xiu Li
69
12
0
11 Aug 2024
PS-TTL: Prototype-based Soft-labels and Test-Time Learning for Few-shot Object Detection
Yingjie Gao
Yanan Zhang
Ziyue Huang
Nanqing Liu
Di Huang
ObjD
53
1
0
11 Aug 2024
UNIC: Universal Classification Models via Multi-teacher Distillation
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
47
7
0
09 Aug 2024
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
Dahyun Kang
Minsu Cho
ObjD
VLM
45
9
0
09 Aug 2024
Depth Any Canopy: Leveraging Depth Foundation Models for Canopy Height Estimation
Daniele Rege Cambrin
Isaac Corley
Paolo Garza
34
2
0
08 Aug 2024
SegXAL: Explainable Active Learning for Semantic Segmentation in Driving Scene Scenarios
Sriram Mandalika
Athira Nambiar
35
1
0
08 Aug 2024
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling
Zilyu Ye
Yu Lei
Ruotian Peng
Jinjin Cao
Zhiyang Chen
...
Mingyuan Zhou
Xiaoqian Shen
Mohamed Elhoseiny
Nan Zhuang
Guo-Jun Qi
VGen
VLM
42
1
0
07 Aug 2024
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis
Zebin Yao
Fangxiang Feng
Ruifan Li
Xiaojie Wang
DiffM
44
1
0
07 Aug 2024
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
Pavel Suma
Giorgos Kordopatis-Zilos
Ahmet Iscen
Giorgos Tolias
VLM
50
3
0
06 Aug 2024
Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment
Shijie Lian
Hua Li
VLM
43
5
0
06 Aug 2024
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
43
1
0
05 Aug 2024