ResearchTrend.AI
DINOv2: Learning Robust Visual Features without Supervision

14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM, CLIP, SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 826 papers shown
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
125
2
0
01 Jul 2025
LatentMove: Towards Complex Human Movement Video Generation
Ashkan Taghipour
Morteza Ghahremani
Mohammed Bennamoun
F. Boussaïd
Aref Miri Rekavandi
Zinuo Li
Qiuhong Ke
Hamid Laga
3DH, VGen
74
0
0
01 Jul 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
257
1
0
01 Jul 2025
RGBTrack: Fast, Robust Depth-Free 6D Pose Estimation and Tracking
Teng Guo
Jingjin Yu
3DPC, 3DV
30
0
0
20 Jun 2025
Class Agnostic Instance-level Descriptor for Visual Instance Search
Qi-Ying Sun
Wan-Lei Zhao
Yi-Bo Miao
Chong-Wah Ngo
OCL
27
0
0
20 Jun 2025
Emergent Temporal Correspondences from Video Diffusion Transformers
Jisu Nam
Soowon Son
Dahyun Chung
Jiyoung Kim
Siyoon Jin
Junhwa Hur
Seungryong Kim
VGen
23
0
0
20 Jun 2025
Loupe: A Generalizable and Adaptive Framework for Image Forgery Detection
Yuchu Jiang
Jiaming Chu
Jian Zhao
Xin Zhang
Xu Yang
Lei Jin
C. Zhang
Xuelong Li
14
0
0
20 Jun 2025
LunarLoc: Segment-Based Global Localization on the Moon
Annika Thomas
Robaire Galliath
Aleksander Garbuz
Luke Anger
Cormac O'Neill
Trevor Johst
Dami Thomas
George Lordos
Jonathan P. How
10
0
0
20 Jun 2025
With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You
Fabian Gröger
Shuo Wen
Huyen Le
Maria Brbic
17
0
0
20 Jun 2025
Assembler: Scalable 3D Part Assembly via Anchor Point Diffusion
Wang Zhao
Yan-Pei Cao
Jiale Xu
Yuejiang Dong
Ying Shan
15
0
0
20 Jun 2025
Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation
Riccardo Corvi
D. Cozzolino
Ekta Prashnani
Shalini De Mello
Koki Nagano
L. Verdoliva
ViT
7
0
0
20 Jun 2025
AutoV: Learning to Retrieve Visual Prompt for Large Vision-Language Models
Yuan Zhang
Chun-Kai Fan
Tao Huang
Ming Lu
Sicheng Yu
Junwen Pan
Kuan Cheng
Qi She
Shanghang Zhang
VLM, LRM
17
0
0
19 Jun 2025
CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity
Guang Yin
Yitong Li
Yixuan Wang
D. Mcconachie
Paarth Shah
Kunimatsu Hashimoto
Huan Zhang
Katherine Liu
Yunzhu Li
LM&Ro
5
0
0
19 Jun 2025
LBMamba: Locally Bi-directional Mamba
Jingwei Zhang
Xi Han
Hong Qin
Mahdi S. Hosseini
Dimitris Samaras
Mamba
33
0
0
19 Jun 2025
Reimagination with Test-time Observation Interventions: Distractor-Robust World Model Predictions for Visual Model Predictive Control
Yuxin Chen
Jianglan Wei
Chenfeng Xu
Boyi Li
Masayoshi Tomizuka
Andrea V. Bajcsy
Ran Tian
10
0
0
19 Jun 2025
DT-UFC: Universal Large Model Feature Coding via Peaky-to-Balanced Distribution Transformation
Changsheng Gao
Zijie Liu
L. Li
Dong Liu
Xiaoyan Sun
Weisi Lin
OffRL
7
0
0
19 Jun 2025
MapFM: Foundation Model-Driven HD Mapping with Multi-Task Contextual Learning
Leonid Ivanov
Vasily Yuryev
Dmitry Yudin
12
0
0
18 Jun 2025
Vision in Action: Learning Active Perception from Human Demonstrations
Haoyu Xiong
Xiaomeng Xu
Jimmy Wu
Yifan Hou
Jeannette Bohg
Shuran Song
36
0
0
18 Jun 2025
SynPo: Boosting Training-Free Few-Shot Medical Segmentation via High-Quality Negative Prompts
Yufei Liu
Haoke Xiao
Jiaxing Chai
Yongcun Zhang
Rong Wang
Zijie Meng
Zhiming Luo
MedIm, VLM
13
0
0
18 Jun 2025
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
Byung-Kwan Lee
Ryo Hachiuma
Yong Man Ro
Yu-Chun Wang
Yueh-Hua Wu
VLM
38
0
0
18 Jun 2025
Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material
Team Hunyuan3D
Shuhui Yang
M. Yang
Yifei Feng
Xin Huang
...
Yuhong Liu
Linus
Jie Jiang
J. Huang
Chunchao Guo
3DH
34
1
0
18 Jun 2025
Discrete JEPA: Learning Discrete Token Representations without Reconstruction
Junyeob Baek
Hosung Lee
Christopher Hoang
Mengye Ren
Sungjin Ahn
22
0
0
17 Jun 2025
DepthSeg: Depth prompting in remote sensing semantic segmentation
Ning Zhou
Shanxiong Chen
Mingting Zhou
Haigang Sui
Lieyun Hu
Han Li
Li Hua
Qiming Zhou
VLM, MDE
33
0
0
17 Jun 2025
Foundation Model Insights and a Multi-Model Approach for Superior Fine-Grained One-shot Subset Selection
Zhijing Wan
Zhixiang Wang
Zheng Wang
Xin Xu
Shin'ichi Satoh
29
0
0
17 Jun 2025
DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding
Thomas Kreutz
M. Mühlhäuser
Alejandro Sánchez Guinea
34
0
0
16 Jun 2025
TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Scale-Oriented Contrast
Beilei Cui
Yiming Huang
Long Bai
Hongliang Ren
31
0
0
16 Jun 2025
Scaling Algorithm Distillation for Continuous Control with Mamba
Samuel Beaussant
Mehdi Mounsif
22
0
0
16 Jun 2025
EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition
Bingxi Liu
Hao Chen
Shiyi Guo
Yihong Wu
Jinqiang Cui
Hong Zhang
12
0
0
16 Jun 2025
DynaGuide: Steering Diffusion Polices with Active Dynamic Guidance
Maximilian Du
Shuran Song
25
0
0
16 Jun 2025
Bridging Unsupervised and Semi-Supervised Anomaly Detection: A Theoretically-Grounded and Practical Framework with Synthetic Anomalies
Matthew Lau
Tian-Yi Zhou
Xiangchi Yuan
Jizhou Chen
Wenke Lee
Xiaoming Huo
20
0
0
16 Jun 2025
Adapting by Analogy: OOD Generalization of Visuomotor Policies via Functional Correspondence
Pranay Gupta
H. Admoni
Andrea Bajcsy
12
0
0
15 Jun 2025
Comparative Analysis of Deep Learning Strategies for Hypertensive Retinopathy Detection from Fundus Images: From Scratch and Pre-trained Models
Yanqiao Zhu
10
0
0
14 Jun 2025
Manager: Aggregating Insights from Unimodal Experts in Two-Tower VLMs and MLLMs
Xiao Xu
L. Qin
Wanxiang Che
Min-Yen Kan
MoE, VLM
30
0
0
13 Jun 2025
PiPViT: Patch-based Visual Interpretable Prototypes for Retinal Image Analysis
Marzieh Oghbaie
Teresa Araújo
Hrvoje Bogunović
ViT, MedIm
122
0
0
12 Jun 2025
HyBiomass: Global Hyperspectral Imagery Benchmark Dataset for Evaluating Geospatial Foundation Models in Forest Aboveground Biomass Estimation
Aaron Banze
Timothée Stassin
Nassim Ait Ali Braham
Rıdvan Salih Kuzu
Simon Besnard
Michael Schmitt
13
0
0
12 Jun 2025
BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP
Thomas Sounack
Joshua Davis
Brigitte N Durieux
Antoine Chaffin
Tom Pollard
Eric P. Lehman
Alistair E. W. Johnson
Matthew B. A. McDermott
Tristan Naumann
Charlotta Lindvall
MedIm
114
0
0
12 Jun 2025
SNR and Resource Adaptive Deep JSCC for Distributed IoT Image Classification
Ali Waqas
Sinem Coleri
104
0
0
12 Jun 2025
AIR: Zero-shot Generative Model Adaptation with Iterative Refinement
Guimeng Liu
Milad Abdollahzadeh
Ngai-Man Cheung
VLM
117
0
0
12 Jun 2025
A theoretical framework for self-supervised contrastive learning for continuous dependent data
Alexander Marusov
Alexander Yuhay
Alexey Zaytsev
72
0
0
11 Jun 2025
Efficient Part-level 3D Object Generation via Dual Volume Packing
Jiaxiang Tang
Ruijie Lu
Zhaoshuo Li
Zekun Hao
Xuan Li
Fangyin Wei
Shuran Song
Gang Zeng
Ming-Yu Liu
Tsung-Yi Lin
OCL
90
0
0
11 Jun 2025
Detecção da Psoríase Utilizando Visão Computacional: Uma Abordagem Comparativa Entre CNNs e Vision Transformers
Natanael Lucena
Fábio S. da Silva
Ricardo Rios
ViT, MedIm
60
0
0
11 Jun 2025
The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge
Haoru Wang
Kai Ye
Yangyan Li
Wenzheng Chen
Baoquan Chen
69
0
0
11 Jun 2025
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Yantai Yang
Yuhao Wang
Zichen Wen
Luo Zhongwei
Chang Zou
Zhipeng Zhang
Chuan Wen
Linfeng Zhang
VLM
64
0
0
11 Jun 2025
Urban1960SatSeg: Unsupervised Semantic Segmentation of Mid-20$^{th}$ century Urban Landscapes with Satellite Imageries
Tianxiang Hao
Lixian Zhang
Yingjia Zhang
Mengxuan Chen
Jinxiao Zhang
Haohuan Fu
72
0
0
11 Jun 2025
3DGeoDet: General-purpose Geometry-aware Image-based 3D Object Detection
Yi Zhang
Y. X. R. Wang
Yawen Cui
Lap-Pui Chau
3DPC
67
0
0
11 Jun 2025
Accurate and efficient zero-shot 6D pose estimation with frozen foundation models
Andrea Caraffa
Davide Boscaini
Fabio Poiesi
85
0
0
11 Jun 2025
From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models
Irving Fang
Juexiao Zhang
Shengbang Tong
Chen Feng
LM&Ro
56
1
0
11 Jun 2025
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
Mido Assran
Adrien Bardes
David Fan
Q. Garrido
Russell Howes
...
Sarath Chandar
Franziska Meier
Yann LeCun
Michael G. Rabbat
Nicolas Ballas
68
0
0
11 Jun 2025
Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation
Siyu Chen
Ting Han
Chengzheng Fu
Changshe Zhang
Chaolei Wang
Jinhe Su
Guorong Cai
Meiliu Wu
ObjD, VLM
93
0
0
11 Jun 2025
GLD-Road: A global-local decoding road network extraction model for remote sensing images
Ligao Deng
Yupeng Deng
Yu Meng
Jingbo Chen
Zhihao Xi
Diyou Liu
Qifeng Chu
61
0
0
11 Jun 2025