2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,220 papers shown
Title
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
45
1
0
05 Aug 2024
Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow
Philip Wiese
Gamze İslamoğlu
Moritz Scherer
Luka Macan
Victor J. B. Jung
Luca Bompani
Francesco Conti
Luca Benini
47
0
0
05 Aug 2024
Unsupervised Representation Learning by Balanced Self Attention Matching
Daniel Shalam
Simon Korman
SSL
43
0
0
04 Aug 2024
What Happens Without Background? Constructing Foreground-Only Data for Fine-Grained Tasks
Yuetian Wang
W. Hou
Qinmu Peng
Xinge You
47
0
0
04 Aug 2024
Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers
Weijie Zheng
Xingjun Ma
Hanxun Huang
Zuxuan Wu
Yu-Gang Jiang
AAML
45
0
0
03 Aug 2024
POA: Pre-training Once for Models of All Sizes
Yingying Zhang
Xin Guo
Jiangwei Lao
Lei Yu
Lixiang Ru
Jian Wang
Guo Ye
Huimei He
Jingdong Chen
Ming Yang
78
1
0
02 Aug 2024
Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology
Eric Zimmermann
Eugene Vorontsov
Julian Viret
Adam Casson
Michal Zelechowski
...
Razik Yousfi
Thomas J. Fuchs
Nicolò Fusi
Siqi Liu
Kristen Severson
MedIm
51
30
0
01 Aug 2024
Privacy-preserving datasets by capturing feature distributions with Conditional VAEs
Francesco Di Salvo
David Tafler
Sebastian Doerrich
Christian Ledig
CML
45
0
0
01 Aug 2024
Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation
Cephas Mpungu
Qiyuan Chen
Xiaoye Qu
Jiashuo Sun
G. Mapp
VLM
RALM
LRM
46
16
0
01 Aug 2024
IN-Sight: Interactive Navigation through Sight
Philipp Schoch
Fan Yang
Yuntao Ma
Stefan Leutenegger
Marco Hutter
Quentin Leboutet
49
3
0
01 Aug 2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren
Steven Basart
Adam Khoja
Alice Gatti
Long Phan
...
Alexander Pan
Gabriel Mukobi
Ryan H. Kim
Stephen Fitz
Dan Hendrycks
ELM
36
22
0
31 Jul 2024
EZSR: Event-based Zero-Shot Recognition
Yan Yang
Sehwan Kim
Dongxu Li
Y. Sun
43
0
0
31 Jul 2024
Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2
Lv Tang
Bo Li
VLM
40
7
0
31 Jul 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Ming-Kuan Wu
Xinyue Cai
Jiayi Ji
Jiale Li
Oucheng Huang
Gen Luo
Hao Fei
Xiaoshuai Sun
Rongrong Ji
MLLM
65
7
0
31 Jul 2024
StreetSurfaceVis: a dataset of crowdsourced street-level imagery with semi-automated annotations of road surface type and quality
Alexandra Kapp
Edith Hoffmann
Esther Weigmann
Helena Mihaljević
35
1
0
31 Jul 2024
Small Object Few-shot Segmentation for Vision-based Industrial Inspection
Zilong Zhang
Chang Niu
Yi Lin
Jingchi Jiang
Xuefeng Chen
49
1
0
31 Jul 2024
Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM
Can Wang
Hongliang Zhong
Menglei Chai
Mingming He
DongDong Chen
Jing Liao
LM&Ro
3DV
LRM
40
4
0
31 Jul 2024
Segment Anything for Videos: A Systematic Survey
Chunhui Zhang
Yawen Cui
Weilin Lin
Guanjie Huang
Yan Rong
Li Liu
Shiguang Shan
VLM
52
6
0
31 Jul 2024
CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning
Yuexi Du
Brian Chang
Nicha Dvornek
MedIm
VLM
50
2
0
30 Jul 2024
dopanim: A Dataset of Doppelganger Animals with Noisy Annotations from Multiple Humans
M. Herde
Denis Huseljic
Lukas Rauch
Bernhard Sick
49
1
0
30 Jul 2024
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi
Federico Cocchi
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
43
8
0
29 Jul 2024
Improving 2D Feature Representations by 3D-Aware Fine-Tuning
Yuanwen Yue
Anurag Das
Francis Engelmann
Siyu Tang
J. E. Lenssen
57
24
0
29 Jul 2024
SANGRIA: Surgical Video Scene Graph Optimization for Surgical Workflow Prediction
Çağhan Köksal
Ghazal Ghazaei
Felix Holm
Azade Farshad
Nassir Navab
MedIm
51
2
0
29 Jul 2024
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Jinghuan Shang
Karl Schmeckpeper
Brandon B. May
M. Minniti
Tarik Kelestemur
David Watkins
Laura Herlant
VLM
41
23
0
29 Jul 2024
ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2
Wenjun Huang
Jiakai Pan
Jiahao Tang
Yanyu Ding
Yifei Xing
Yuhe Wang
Zhengzhuo Wang
Jianguo Hu
Mamba
58
5
0
29 Jul 2024
Rethinking RGB-D Fusion for Semantic Segmentation in Surgical Datasets
Muhammad Abdullah Jamal
Omid Mohareri
49
1
0
29 Jul 2024
Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions
Ashkan Taghipour
Morteza Ghahremani
Bennamoun
Aref Miri Rekavandi
Zinuo Li
Hamid Laga
F. Boussaïd
VGen
84
2
0
27 Jul 2024
PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
Fernando Julio Cendra
Bingchen Zhao
Kai Han
VLM
CLL
56
6
0
26 Jul 2024
SHIC: Shape-Image Correspondences with no Keypoint Supervision
Aleksandar Shtedritski
Christian Rupprecht
Andrea Vedaldi
3DPC
3DH
3DV
35
3
0
26 Jul 2024
QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning
Mostafa Kotb
C. Weber
Muhammad Burhan Hafez
Stefan Wermter
46
1
0
26 Jul 2024
From 2D to 3D: AISG-SLA Visual Localization Challenge
Jialin Gao
Bill Ong
Darld Lwi
Zhen Hao Ng
Xun Wei Yee
...
Johan Edstedt
Kirill Brodt
Clémentin Boittiaux
Maxime Ferrera
S. Konev
22
0
0
26 Jul 2024
Learning Spectral-Decomposed Tokens for Domain Generalized Semantic Segmentation
Jingjun Yi
Qi Bi
Hao Zheng
Haolan Zhan
Wei Ji
Yawen Huang
Yuexiang Li
Yefeng Zheng
43
12
0
26 Jul 2024
Trajectory-aligned Space-time Tokens for Few-shot Action Recognition
Pulkit Kumar
Namitha Padmanabhan
Luke Luo
Sai Saketh Rambhatla
Abhinav Shrivastava
50
4
0
25 Jul 2024
Automated Ensemble Multimodal Machine Learning for Healthcare
F. Imrie
Stefan Denner
Lucas S. Brunschwig
Klaus H. Maier-Hein
M. Schaar
29
2
1
25 Jul 2024
IRIS: Wireless Ring for Vision-based Smart Home Interaction
Maruchi Kim
Antonio Glenn
Bandhav Veluri
Yunseo Lee
Eyoel Gebre
Aditya Bagaria
Shwetak Patel
Shyamnath Gollakota
31
3
0
25 Jul 2024
The Curious Case of Representational Alignment: Unravelling Visio-Linguistic Tasks in Emergent Communication
Tom Kouwenhoven
Max Peeperkorn
Bram van Dijk
Tessa Verhoef
34
3
0
25 Jul 2024
Unified Lexical Representation for Interpretable Visual-Language Alignment
Yifan Li
Yikai Wang
Yanwei Fu
Dongyu Ru
Zheng Zhang
Tong He
VLM
42
4
0
25 Jul 2024
Unsqueeze [CLS] Bottleneck to Learn Rich Representations
Qing Su
Shihao Ji
36
0
0
24 Jul 2024
Pretrained Visual Representations in Reinforcement Learning
Emlyn Williams
Athanasios Polydoros
SSL
20
1
0
24 Jul 2024
Graph Neural Networks: A suitable Alternative to MLPs in Latent 3D Medical Image Classification?
Johannes Kiechle
Daniel M. Lang
Stefan M. Fischer
Lina Felsner
J. Peeken
Julia A. Schnabel
MedIm
51
0
0
24 Jul 2024
Nonverbal Immediacy Analysis in Education: A Multimodal Computational Model
Uroš Petković
Jonas Frenkel
Olaf Hellwich
Rebecca Lazarides
41
1
0
24 Jul 2024
PlantTrack: Task-Driven Plant Keypoint Tracking with Zero-Shot Sim2Real Transfer
Samhita Marri
A. N. Sivakumar
N. Uppalapati
Girish Chowdhary
26
0
0
23 Jul 2024
SINDER: Repairing the Singular Defects of DINOv2
Haoqian Wang
Tong Zhang
Mathieu Salzmann
39
1
0
23 Jul 2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
Yiwei Ma
Zhibin Wang
Xiaoshuai Sun
Weihuang Lin
Qiang-feng Zhou
Jiayi Ji
Rongrong Ji
MLLM
VLM
59
1
0
23 Jul 2024
Reconstructing Training Data From Real World Models Trained with Transfer Learning
Yakir Oz
Gilad Yehudai
Gal Vardi
Itai Antebi
Michal Irani
Niv Haim
43
2
0
22 Jul 2024
MILAN: Milli-Annotations for Lidar Semantic Segmentation
Nermin Samet
Gilles Puy
Oriane Siméoni
Renaud Marlet
3DPC
47
0
0
22 Jul 2024
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection
Yunkang Cao
Jiangning Zhang
Luca Frittoli
Yuqi Cheng
Weiming Shen
Giacomo Boracchi
VLM
61
29
0
22 Jul 2024
MSSPlace: Multi-Sensor Place Recognition with Visual and Text Semantics
Alexander Melekhin
Dmitry Yudin
Ilia Petryashin
Vitaly Bezuglyj
53
1
0
22 Jul 2024
Probing Fine-Grained Action Understanding and Cross-View Generalization of Foundation Models
Thinesh Thiyakesan Ponbagavathi
Kunyu Peng
Alina Roitberg
56
1
0
22 Jul 2024
Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
Amir Mohammad Karimi Mamaghan
Samuele Papa
Karl Henrik Johansson
Stefan Bauer
Andrea Dittadi
OCL
56
5
0
22 Jul 2024