ResearchTrend.AI
DINOv2: Learning Robust Visual Features without Supervision

14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM, CLIP, SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 826 papers shown
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
99
27
0
15 Apr 2024
Flatness Improves Backbone Generalisation in Few-shot Classification
Rui Li
Martin Trapp
Talal Alrawajfeh
Arno Solin
118
0
0
11 Apr 2024
A Lightweight Measure of Classification Difficulty from Application Dataset Characteristics
Bryan Bo Cao
Abhinav Sharma
Lawrence O'Gorman
Michael J. Coss
Shubham Jain
87
2
0
09 Apr 2024
Spatial Cognition from Egocentric Video: Out of Sight, Not Out of Mind
Chiara Plizzari
Shubham Goel
Toby Perrett
Jacob Chalk
Angjoo Kanazawa
Dima Damen
97
12
0
07 Apr 2024
Dissecting Query-Key Interaction in Vision Transformers
Xu Pan
Aaron Philip
Ziqian Xie
Odelia Schwartz
115
1
0
04 Apr 2024
3D Congealing: 3D-Aware Image Alignment in the Wild
Yunzhi Zhang
Zizhang Li
Amit Raj
Andreas Engelhardt
Yuanzhen Li
Tingbo Hou
Jiajun Wu
Varun Jampani
3DV
69
0
0
02 Apr 2024
Large Language Models for Orchestrating Bimanual Robots
Kun-Mo Chu
Xufeng Zhao
C. Weber
Mengdi Li
Wenhao Lu
Stefan Wermter
LM&Ro, LLMAG
97
8
0
02 Apr 2024
CHOSEN: Contrastive Hypothesis Selection for Multi-View Depth Refinement
Di Qiu
Yinda Zhang
Thabo Beeler
V. Tankovich
Christian Häne
S. Fanello
Christoph Rhemann
S. Orts-Escolano
42
1
0
02 Apr 2024
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin
Xinyu Wei
Ruichuan An
Peng Gao
Bocheng Zou
Yulin Luo
Siyuan Huang
Shanghang Zhang
Hongsheng Li
VLM
184
47
0
29 Mar 2024
Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
Yutong He
Alexander Robey
Naoki Murata
Yiding Jiang
J. Williams
George Pappas
Hamed Hassani
Yuki Mitsufuji
Ruslan Salakhutdinov
J. Zico Kolter
DiffM
147
5
0
28 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
119
99
0
26 Mar 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
243
12
0
25 Mar 2024
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
S. A. Baumann
Felix Krause
Michael Neumayr
Nick Stracke
Vincent Tao Hu
Björn Ommer
DiffM, LM&Ro
121
12
0
25 Mar 2024
Segment Anything Model for Road Network Graph Extraction
Congrui Hetang
Haoru Xue
Cindy X. Le
Tianwei Yue
Wenping Wang
Yihui He
141
17
0
24 Mar 2024
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
267
143
0
22 Mar 2024
DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video
Narek Tumanyan
Assaf Singer
Shai Bagon
Tali Dekel
MQ
100
32
0
21 Mar 2024
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Han Zhao
Min Zhang
Wei Zhao
Pengxiang Ding
Siteng Huang
Donglin Wang
Mamba
119
74
0
21 Mar 2024
VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition
Yun-Jin Li
M. Gladkova
Yan Xia
Rui Wang
Daniel Cremers
112
5
0
21 Mar 2024
Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias
Wenyu Zhang
Qingmu Liu
Felix Ong Wei Cong
Mohamed Ragab
Chuan-Sheng Foo
76
0
0
17 Mar 2024
MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation
Yasufumi Kawano
Yoshimitsu Aoki
DiffM
85
8
0
17 Mar 2024
3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models
Yongtao Ge
Wenjia Wang
Yongfan Chen
Hao Chen
Chunhua Shen
3DH
72
8
0
17 Mar 2024
StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining
Tushar Kataria
Beatrice Knudsen
Shireen Y. Elhabian
DiffM, MedIm
105
10
0
17 Mar 2024
Annotation Free Semantic Segmentation with Vision Foundation Models
Soroush Seifi
Daniel Olmeda Reino
Fabien Despinoy
Rahaf Aljundi
VLM
101
1
0
14 Mar 2024
CART: Caltech Aerial RGB-Thermal Dataset in the Wild
Connor T. Lee
Matthew O. Anderson
Nikhil Raganathan
Xingxing Zuo
Kevin Do
Georgia Gkioxari
Soon-Jo Chung
80
8
0
13 Mar 2024
SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM
Siting Zhu
Renjie Qin
Guangming Wang
Jiuming Liu
Hesheng Wang
86
31
0
12 Mar 2024
QUASAR: QUality and Aesthetics Scoring with Advanced Representations
Sergey Kastryulin
Denis Prokopenko
Artem Babenko
Dmitry V. Dylov
61
0
0
11 Mar 2024
Leveraging Foundation Models for Content-Based Image Retrieval in Radiology
Stefan Denner
David Zimmerer
Dimitrios Bounias
Markus Bujotzek
Shuhan Xiao
...
Lisa Kausch
Philipp Schader
Tobias Penzkofer
Paul F. Jäger
Klaus H. Maier-Hein
MedIm, VLM
59
8
0
11 Mar 2024
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
Liting Lin
Heng Fan
Zhipeng Zhang
Yaowei Wang
Yong-mei Xu
Haibin Ling
129
35
0
08 Mar 2024
Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations
Sangmin Lee
Bolin Lai
Fiona Ryan
Bikram Boote
James M. Rehg
91
9
0
04 Mar 2024
A Simple-but-effective Baseline for Training-free Class-Agnostic Counting
Yuhao Lin
Hai-Ming Xu
Lingqiao Liu
Javen Qinfeng Shi
112
1
0
03 Mar 2024
Rethinking cluster-conditioned diffusion models
Nikolas Adaloglou
Tim Kaiser
Félix D. P. Michels
M. Kollmann
VLM
75
3
0
01 Mar 2024
Large Convolutional Model Tuning via Filter Subspace
Wei Chen
Zichen Miao
Qiang Qiu
227
4
0
01 Mar 2024
NocPlace: Nocturnal Visual Place Recognition via Generative and Inherited Knowledge Transfer
Bingxi Liu
Yiqun Wang
Huaqi Tao
Tingjun Huang
Fulin Tang
Yihong Wu
Jinqiang Cui
Hong Zhang
84
1
0
27 Feb 2024
Cameras as Rays: Pose Estimation via Ray Diffusion
Jason Y. Zhang
Amy Lin
Moneish Kumar
Tzu-Hsuan Yang
Deva Ramanan
Shubham Tulsiani
DiffM
134
63
0
22 Feb 2024
Visual Hallucinations of Multi-modal Large Language Models
Wen Huang
Hongbin Liu
Minxin Guo
Neil Zhenqiang Gong
MLLM, VLM
116
35
0
22 Feb 2024
Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot
Fabien Baradel
M. Armando
Salma Galaaoui
Romain Brégier
Philippe Weinzaepfel
Grégory Rogez
Thomas Lucas
3DH
104
21
0
22 Feb 2024
CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion
Zijun Long
George Killick
Lipeng Zhuang
Gerardo Aragon Camarasa
Zaiqiao Meng
R. McCreadie
VLM
94
2
0
22 Feb 2024
Subobject-level Image Tokenization
Delong Chen
Samuel Cahyawijaya
Jianfeng Liu
Baoyuan Wang
Pascale Fung
VLM, OCL
278
9
0
22 Feb 2024
How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey
Fabio Tosi
Youming Zhang
Ziren Gong
Erik Sandström
S. Mattoccia
Martin R. Oswald
Matteo Poggi
3DGS
205
63
0
20 Feb 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
123
36
0
20 Feb 2024
Unsupervised Discovery of Object-Centric Neural Fields
Rundong Luo
Hong-Xing Yu
Jiajun Wu
3DPC, OCL
171
5
0
12 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
237
116
0
08 Feb 2024
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
Pablo Lemos
Sammy N. Sharief
Nikolay Malkin
Laurence Perreault-Levasseur
Yashar Hezaveh
86
3
0
06 Feb 2024
Careful with that Scalpel: Improving Gradient Surgery with an EMA
Yu-Guan Hsieh
James Thornton
Eugène Ndiaye
Michal Klein
Marco Cuturi
Pierre Ablin
MedIm
96
0
0
05 Feb 2024
Segment Any Change
Zhuo Zheng
Yanfei Zhong
Liangpei Zhang
Stefano Ermon
VLM
88
13
0
02 Feb 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
142
16
0
25 Jan 2024
Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration
Yifan Zhang
Siyu Ren
Junhui Hou
Jinjian Wu
Guangming Shi
SSL, 3DPC
293
3
0
23 Jan 2024
A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models
Reda Bensaid
Vincent Gripon
François Leduc-Primeau
Lukas Mauch
G. B. Hacene
Fabien Cardinaux
VLM
95
7
0
20 Jan 2024
Dense 3D Reconstruction Through Lidar: A Comparative Study on Ex-vivo Porcine Tissue
Guido Caccianiga
Julian Nubert
Marco Hutter
Katherine J. Kuchenbecker
60
1
0
19 Jan 2024
Visual Robotic Manipulation with Depth-Aware Pretraining
Wanying Wang
Jinming Li
Yichen Zhu
Zhiyuan Xu
Zhengping Che
Chaomin Shen
Yaxin Peng
Dong Liu
Feifei Feng
Jian Tang
MDE
87
4
0
17 Jan 2024