ResearchTrend.AI
DINOv2: Learning Robust Visual Features without Supervision

14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
    VLM
    CLIP
    SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 2,189 papers shown
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao
Isaac Chung
Imene Kerboua
Jamie Stirling
Xin Zhang
Márton Kardos
Roman Solomatin
Noura Al Moubayed
K. Enevoldsen
Niklas Muennighoff
VLM
37
0
0
14 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
30
0
0
14 Apr 2025
Negate or Embrace: On How Misalignment Shapes Multimodal Representation Learning
Yichao Cai
Yuhang Liu
Erdun Gao
T. Jiang
Zhen Zhang
Anton van den Hengel
J. Shi
62
0
0
14 Apr 2025
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
Xingjian Leng
Jaskirat Singh
Yunzhong Hou
Zhenchang Xing
Saining Xie
Liang Zheng
39
0
0
14 Apr 2025
BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
Shengao Wang
Arjun Chandra
Aoming Liu
Venkatesh Saligrama
Boqing Gong
MLLM
VLM
47
0
0
13 Apr 2025
CamMimic: Zero-Shot Image To Camera Motion Personalized Video Generation Using Diffusion Models
P. Guhan
D. Kothandaraman
Tsung-Wei Huang
Guan-Ming Su
Dinesh Manocha
DiffM
VGen
34
0
0
13 Apr 2025
Evolved Hierarchical Masking for Self-Supervised Learning
Zhanzhou Feng
Shiliang Zhang
42
0
0
12 Apr 2025
crowd-hpo: Realistic Hyperparameter Optimization and Benchmarking for Learning from Crowds with Noisy Labels
M. Herde
Lukas Lührs
Denis Huseljic
Bernhard Sick
22
0
0
12 Apr 2025
SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow
Qingyuan Wang
Rui Song
Jiaojiao Li
Kerui Cheng
David Ferstl
Yinlin Hu
3DPC
45
0
0
12 Apr 2025
MASH: Masked Anchored SpHerical Distances for 3D Shape Representation and Generation
Changhao Li
Yu Xin
Xiaowei Zhou
Ariel Shamir
Hao Zhang
Ligang Liu
R. Hu
48
0
0
12 Apr 2025
VideoAds for Fast-Paced Video Understanding: Where Opensource Foundation Models Beat GPT-4o & Gemini-1.5 Pro
Zheyuan Zhang
Monica Dou
Linkai Peng
Hongyi Pan
Ulas Bagci
Boqing Gong
VLM
56
0
0
12 Apr 2025
Hypergraph Vision Transformers: Images are More than Nodes, More than Edges
Joshua Fixelle
ViT
27
0
0
11 Apr 2025
SARFormer -- An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data
Jonathan Prexl
M. Recla
M. Schmitt
34
0
0
11 Apr 2025
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
Cheng-Yu Hsieh
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Chun-Liang Li
Ranjay Krishna
Oncel Tuzel
Hadi Pouransari
VLM
146
0
0
11 Apr 2025
DSM: Building A Diverse Semantic Map for 3D Visual Grounding
Qinghongbing Xie
Zijian Liang
Long Zeng
29
0
0
11 Apr 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Tianwei Xiong
Jun Hao Liew
Zilong Huang
Jiashi Feng
Xihui Liu
36
0
0
11 Apr 2025
Boosting multi-demographic federated learning for chest x-ray analysis using general-purpose self-supervised representations
Mahshad Lotfinia
Arash Tayebiarasteh
Samaneh Samiei
Mehdi Joodaki
Soroosh Tayebi Arasteh
30
0
0
11 Apr 2025
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
51
1
0
11 Apr 2025
Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models
Jiahuan Long
Tingsong Jiang
Wen Yao
Yizhe Xiong
Zhengqin Xu
Shuai Jia
Chao Ma
24
0
0
11 Apr 2025
Gen3DEval: Using vLLMs for Automatic Evaluation of Generated 3D Objects
Shalini Maiti
Lourdes Agapito
Filippos Kokkinos
40
0
0
10 Apr 2025
Leveraging LLMs for Multimodal Retrieval-Augmented Radiology Report Generation via Key Phrase Extraction
Kyoyun Choi
Byungmu Yoon
Soobum Kim
Jonggwon Park
33
0
0
10 Apr 2025
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Dibyadip Chatterjee
Edoardo Remelli
Yale Song
Bugra Tekin
Abhay Mittal
...
Shreyas Hampali
Eric Sauser
Shugao Ma
Angela Yao
Fadime Sener
VLM
44
0
0
10 Apr 2025
Revisiting Likelihood-Based Out-of-Distribution Detection by Modeling Representations
Yifan Ding
Arturas Aleksandrauskas
Amirhossein Ahmadian
Jonas Unger
Fredrik Lindsten
Gabriel Eilertsen
OODD
38
1
0
10 Apr 2025
MARS: a Multimodal Alignment and Ranking System for Few-Shot Segmentation
Nico Catalano
Stefano Samele
Paolo Pertino
Matteo Matteucci
3DPC
48
0
0
10 Apr 2025
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation
Linyan Huang
Haonan Lin
Yanning Zhou
Kaiwen Xiao
42
0
0
10 Apr 2025
RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Radiology with Zero-Shot Multi-Task Capability
Jonggwon Park
Soobum Kim
Byungmu Yoon
Kyoyun Choi
MedIm
33
0
0
10 Apr 2025
On Model and Data Scaling for Skeleton-based Self-Supervised Gait Recognition
Adrian Cosma
Andy Catruna
Emilian Radoi
31
0
0
10 Apr 2025
Exploring a Patch-Wise Approach for Privacy-Preserving Fake ID Detection
Javier Muñoz-Haro
Ruben Tolosana
R. Vera-Rodríguez
Aythami Morales
Julian Fierrez
50
0
0
10 Apr 2025
ID-Booth: Identity-consistent Face Generation with Diffusion Models
Darian Tomašević
Fadi Boutros
Chenhao Lin
Naser Damer
Vitomir Štruc
Peter Peer
DiffM
55
1
0
10 Apr 2025
How Can Objects Help Video-Language Understanding?
Zitian Tang
Shijie Wang
Junho Cho
Jaewook Yoo
Chen Sun
42
0
0
10 Apr 2025
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
Zhong-Yu Li
Ruoyi Du
Juncheng Yan
Le Zhuo
Zhen Li
Peng Gao
Zhanyu Ma
Ming-Ming Cheng
VLM
68
2
0
10 Apr 2025
GenEAva: Generating Cartoon Avatars with Fine-Grained Facial Expressions from Realistic Diffusion-based Faces
Hao Yu
Rupayan Mallick
Margrit Betke
Sarah Adel Bargal
DiffM
45
0
0
10 Apr 2025
Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies
Jonas Loos
Lorenz Linhardt
26
0
0
09 Apr 2025
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection
Ruoyu Chen
Hua Zhang
Jingzhi Li
Li Liu
Zhen Huang
Xiaochun Cao
37
0
0
09 Apr 2025
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration
Omar Alama
A. Bhattacharya
Haoyang He
Seungchan Kim
Yuheng Qiu
Wenshan Wang
Cherie Ho
Nikhil Varma Keetha
Sebastian A. Scherer
26
0
0
09 Apr 2025
Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation
Thomas Kerdreux
A. Tuel
Quentin Febvre
A. Mouche
Bertrand Chapron
73
0
0
09 Apr 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
Y. Gao
Zihang Lin
Chuanbin Liu
Min Zhou
T. Ge
Bo Zheng
Hongtao Xie
DiffM
35
0
0
09 Apr 2025
RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism
E. Peruzzo
Dejia Xu
Xingqian Xu
Humphrey Shi
N. Sebe
DiffM
VGen
54
0
0
09 Apr 2025
SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang
Fengqi Liu
Yixing Lu
Qin Zhao
Pingyu Wu
...
Ran Yi
Yang Cao
Lizhuang Ma
Zheng-jun Zha
Junting Dong
3DGS
42
0
0
09 Apr 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
39
0
0
09 Apr 2025
Domain Generalization through Attenuation of Domain-Specific Information
Reiji Saito
Kazuhiro Hotta
26
0
0
09 Apr 2025
Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding
Pedro Hermosilla
Christian Stippel
Leon Sick
SSL
3DPC
79
0
0
09 Apr 2025
Prototype-Based Continual Learning with Label-free Replay Buffer and Cluster Preservation Loss
Agil Aghasanli
Yi Li
Plamen Angelov
CLL
VLM
50
0
0
09 Apr 2025
MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
Ylli Sadikaj
Hongkuan Zhou
Lavdim Halilaj
Stefan Schmid
Steffen Staab
Claudia Plant
21
0
0
09 Apr 2025
Analyzing the Impact of Low-Rank Adaptation for Cross-Domain Few-Shot Object Detection in Aerial Images
Hicham Talaoubrid
Anissa Mokraoui
Ismail Ben Ayed
Axel Prouvost
Sonimith Hang
Monit Korn
Rémi Harvey
ObjD
52
1
0
08 Apr 2025
Hyperbolic Category Discovery
Yuanpei Liu
Zhenqi He
Kai Han
26
0
0
08 Apr 2025
Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation
Xiaoxing Hu
Ziyang Gong
Y. Wang
Yuru Jia
Gen Luo
Xue Yang
121
0
0
08 Apr 2025
To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition
Davide Sferrazza
Gabriele Berton
Gabriele Trivigno
Carlo Masone
25
0
0
08 Apr 2025
On the Importance of Conditioning for Privacy-Preserving Data Augmentation
Julian Lorenz
K. Ludwig
Valentin Haug
Rainer Lienhart
DiffM
38
0
0
08 Apr 2025
OmniSVG: A Unified Scalable Vector Graphics Generation Model
Yiying Yang
Wei Cheng
Sijin Chen
Xianfang Zeng
Jiaxu Zhang
Liao Wang
Gang Yu
Xingjun Ma
Yu Jiang
VLM
40
0
0
08 Apr 2025