ResearchTrend.AI
DINOv2: Learning Robust Visual Features without Supervision

14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
    VLM
    CLIP
    SSL
arXiv: 2304.07193

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 2,193 papers shown
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Chau Pham
Juan C. Caicedo
Bryan A. Plummer
47
0
0
25 Mar 2025
The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs
Jonathan Sauder
Viktor Domazetoski
G. Banc-Prandi
Gabriela Perna
Anders Meibom
D. Tuia
58
0
0
25 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. García-Peraza-Herrera
MedIm
46
0
0
25 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Yunfan LU
Sifei Liu
...
Jan Kautz
Enze Xie
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
166
0
0
25 Mar 2025
AvatarArtist: Open-Domain 4D Avatarization
Hongyu Liu
Xuan Wang
Bo Liu
Yue Ma
Jingye Chen
Yanbo Fan
Yujun Shen
Yibing Song
Qifeng Chen
41
0
0
25 Mar 2025
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text
Weizhi Chen
Jingbo Chen
Yupeng Deng
Jiansheng Chen
Yuman Feng
Zhihao Xi
Diyou Liu
Kai Li
Yu Meng
VLM
51
0
0
25 Mar 2025
Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors
Yuke Lou
Yiming Wang
Zhen Wu
Rui Zhao
Wenjia Wang
Mingyi Shi
Taku Komura
44
0
0
25 Mar 2025
FG$^2$: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching
Zimin Xia
Alexandre Alahi
63
0
0
24 Mar 2025
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang
Fadime Sener
Angela Yao
OffRL
62
1
0
24 Mar 2025
Revisiting Automatic Data Curation for Vision Foundation Models in Digital Pathology
Boqi Chen
Cédric Vincent-Cuaz
Lydia A. Schoenpflug
Manuel Madeira
Lisa Fournier
...
D. Thanou
V. Koelzer
Pascal Frossard
Gabriele Campanella
Gunnar Rätsch
51
1
0
24 Mar 2025
U-REPA: Aligning Diffusion U-Nets to ViTs
Yuchuan Tian
Hanting Chen
Mengyu Zheng
Yuchen Liang
Chao Xu
Yunhe Wang
56
0
0
24 Mar 2025
HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
Zunnan Xu
Zhentao Yu
Zixiang Zhou
Jun Zhou
Xiaoyu Jin
...
Chengfei Cai
Shiyu Tang
Qin Lin
Xiu Li
Qinglin Lu
DiffM
VGen
91
8
0
24 Mar 2025
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies
Niccolò Cavagnero
Alexander Hermans
Narges Norouzi
Giuseppe Averta
Bastian Leibe
Gijs Dubbelman
Daan de Geus
ViT
VLM
67
1
0
24 Mar 2025
Training-Free Personalization via Retrieval and Reasoning on Fingerprints
Deepayan Das
Davide Talon
Yiming Wang
Massimiliano Mancini
Elisa Ricci
VLM
LRM
50
0
0
24 Mar 2025
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking
Wenrui Cai
Qingjie Liu
Yansen Wang
MoE
65
0
0
24 Mar 2025
Surface-Aware Distilled 3D Semantic Features
Lukas Uzolas
E. Eisemann
Petr Kellnhofer
3DPC
3DH
83
0
0
24 Mar 2025
Self-Supervised Learning based on Transformed Image Reconstruction for Equivariance-Coherent Feature Representation
Qin Wang
Benjamin Bruns
Hanno Scharr
Kai Krajsek
58
0
0
24 Mar 2025
Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models
Zichen Miao
Wei Chen
Qiang Qiu
92
1
0
24 Mar 2025
RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation
Chengbo Yuan
Suraj Joshi
Shaoting Zhu
Hang Su
Hang Zhao
Yang Gao
VGen
48
4
0
24 Mar 2025
Towards Training-free Anomaly Detection with Vision and Language Foundation Models
Jinjin Zhang
Guodong Wang
Yizhou Jin
Di Huang
42
1
0
24 Mar 2025
PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models
Tadeusz Dziarmaga
Marcin Kądziołka
Artur Kasymov
Marcin Mazur
EGVM
105
0
0
24 Mar 2025
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen
Misha Sra
Pradeep Sen
55
0
0
24 Mar 2025
Foundation Model for Whole-Heart Segmentation: Leveraging Student-Teacher Learning in Multi-Modal Medical Imaging
Abdul Qayyum
Moona Mazher
Devran Ugurlu
J. Solís-Lemus
C. Rodero
Steven A Niederer
45
0
0
24 Mar 2025
Out-of-distribution evaluations of channel agnostic masked autoencoders in fluorescence microscopy
Christian John Hurry
Jinjie Zhang
Olubukola Ishola
Emma Slade
Cuong Q. Nguyen
OOD
OODD
60
0
0
24 Mar 2025
Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models
Qiao Liang
Yanjiang Liu
Xianpei Han
Yunfan LU
Hongyu Lin
Jia Zheng
Xianpei Han
Le Sun
Yingfei Sun
39
0
0
23 Mar 2025
FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation
Dong Zhao
Jinlong Li
Shuang Wang
Mengyao Wu
Qi Zang
N. Sebe
Zhun Zhong
182
0
0
23 Mar 2025
Histomorphology-driven multi-instance learning for breast cancer WSI classification
Baizhi Wang
Rui Yan
Wenxin Ma
Xu Zhang
Yuhao Wang
X. Li
Yunjie Gu
Zihang Jiang
Shuoling Zhou
51
0
0
23 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS
VLM
88
0
0
23 Mar 2025
Co-op: Correspondence-based Novel Object Pose Estimation
Sungphill Moon
Hyeontae Son
Dongcheol Hur
Sangwook Kim
3DH
66
1
0
22 Mar 2025
BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors
Yu Wang
Junxian Mu
Hongzhi Huang
Qilong Wang
Pengfei Zhu
Q. Hu
57
0
0
22 Mar 2025
EMPLACE: Self-Supervised Urban Scene Change Detection
Tim Alpherts
Sennay Ghebreab
Nanne van Noord
43
0
0
22 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
57
0
0
22 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
62
0
0
21 Mar 2025
ModalTune: Fine-Tuning Slide-Level Foundation Models with Multi-Modal Information for Multi-task Learning in Digital Pathology
Vishwesh Ramanathan
Tony Xu
Pushpak Pati
Faruk Ahmed
Maged Goubran
Anne L. Martel
48
0
0
21 Mar 2025
MagicColor: Multi-Instance Sketch Colorization
Yuyao Zhang
Yue Ma
Bingyuan Wang
Qifeng Chen
Zeyu Wang
DiffM
73
0
0
21 Mar 2025
Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors
Wonbong Jang
Philippe Weinzaepfel
Vincent Leroy
Lourdes Agapito
Jérôme Revaud
51
1
0
21 Mar 2025
Exploring Few-Shot Object Detection on Blood Smear Images: A Case Study of Leukocytes and Schistocytes
Davide Antonio Mura
Michela Pinna
Lorenzo Putzu
A. Loddo
Alessandra Perniciano
Olga Mulas
Cecilia Di Ruberto
42
0
0
21 Mar 2025
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia
David Bourgin
Krishna Kumar Singh
Yuheng Li
Yan Kang
Zhan Xu
N. Jha
Y. Liu
DiffM
VGen
72
0
0
21 Mar 2025
Is there anything left? Measuring semantic residuals of objects removed from 3D Gaussian Splatting
Simona Kocour
Assia Benbihi
Aikaterini Adam
Torsten Sattler
3DPC
41
0
0
21 Mar 2025
Single Image Iterative Subject-driven Generation and Editing
Yair Shpitzer
Gal Chechik
Idan Schwartz
53
0
0
20 Mar 2025
Animating the Uncaptured: Humanoid Mesh Animation with Video Diffusion Models
Marc Benedí San Millán
Angela Dai
Matthias Nießner
DiffM
72
0
0
20 Mar 2025
TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
Rohit Kundu
Athula Balachandran
A. Roy-Chowdhury
45
0
0
20 Mar 2025
A Vision Centric Remote Sensing Benchmark
Abduljaleel Adejumo
Faegheh Yeganli
Clifford Broni-bediako
Aoran Xiao
Naoto Yokoya
Mennatullah Siam
67
0
0
20 Mar 2025
Learning 3D Scene Analogies with Neural Contextual Scene Maps
Junho Kim
Gwangtak Bae
E. Lee
Young Min Kim
3DPC
3DV
62
0
0
20 Mar 2025
M3: 3D-Spatial MultiModal Memory
Xueyan Zou
Yuchen Song
Ri-Zhao Qiu
Xuanbin Peng
Jianglong Ye
Sifei Liu
Xiaolong Wang
3DGS
62
0
0
20 Mar 2025
GAIR: Improving Multimodal Geo-Foundation Model with Geo-Aligned Implicit Representations
Ziqiang Liu
Fan Zhang
Junfeng Jiao
Ni Lao
Gengchen Mai
55
2
0
20 Mar 2025
MapGlue: Multimodal Remote Sensing Image Matching
Peihao Wu
Yongxiang Yao
Wenfei Zhang
Dong Wei
Y. Wan
Yansheng Li
Yongjun Zhang
44
0
0
20 Mar 2025
Learning to Efficiently Adapt Foundation Models for Self-Supervised Endoscopic 3D Scene Reconstruction from Any Cameras
Beilei Cui
Long Bai
Mobarakol Islam
An-Chi Wang
Z. Ma
...
Feng Li
Zhen Chen
Zhongliang Jiang
Nassir Navab
Hongliang Ren
MedIm
65
0
0
20 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
201
0
0
20 Mar 2025
UniK3D: Universal Camera Monocular 3D Estimation
Luigi Piccinelli
Daniel Gehrig
Mattia Segu
Yifan Yang
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
47
0
0
20 Mar 2025