DINOv2: Learning Robust Visual Features without Supervision

14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
    VLM
    CLIP
    SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 2,220 papers shown
Title
Affine steerers for structured keypoint description
Georg Bökman
Johan Edstedt
Michael Felsberg
Fredrik Kahl
LLMSV
44
2
0
26 Aug 2024
Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning
Joey Hejna
Chethan Bhateja
Yichen Jian
Karl Pertsch
Dorsa Sadigh
30
16
0
26 Aug 2024
An Embedding is Worth a Thousand Noisy Labels
Francesco Di Salvo
Sebastian Doerrich
Ines Rieger
Christian Ledig
NoLa
75
0
0
26 Aug 2024
Can Visual Foundation Models Achieve Long-term Point Tracking?
Görkay Aydemir
Weidi Xie
Fatma Guney
45
7
0
24 Aug 2024
Segment Any Mesh
George Tang
William Zhao
Logan Ford
David Benhaim
Paul Zhang
44
8
0
24 Aug 2024
FungiTastic: A multi-modal dataset and benchmark for image categorization
Lukáš Picek
Klara Janouskova
Milan Šulc
Jiří Matas
83
1
0
24 Aug 2024
A New Era in Computational Pathology: A Survey on Foundation and Vision-Language Models
Dibaloke Chanda
Milan Aryal
Nasim Yahya Soltani
Masoud Ganji
AI4CE
VLM
49
7
0
23 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
45
5
0
23 Aug 2024
State-of-the-Art Fails in the Art of Damage Detection
D. Ivanova
Marco Aversa
Paul Henderson
John Williamson
26
0
0
23 Aug 2024
WildFusion: Individual Animal Identification with Calibrated Similarity Fusion
Vojtěch Čermák
Lukáš Picek
Lukáš Adam
Lukáš Neumann
Jiří Matas
FedML
41
2
0
23 Aug 2024
Animal Identification with Independent Foreground and Background Modeling
Lukáš Picek
Lukáš Neumann
Jiří Matas
VLM
43
2
0
23 Aug 2024
Atlas Gaussians Diffusion for 3D Generation
Haitao Yang
Yuan Dong
Hanwen Jiang
Dejia Xu
Georgios Pavlakos
Qixing Huang
3DGS
81
3
0
23 Aug 2024
Building and better understanding vision-language models: insights and future directions
Hugo Laurençon
Andrés Marafioti
Victor Sanh
Léo Tronchon
VLM
53
63
0
22 Aug 2024
Sapiens: Foundation for Human Vision Models
Rawal Khirodkar
Timur M. Bagautdinov
Julieta Martinez
Su Zhaoen
Austin James
Peter Selednik
Stuart Anderson
Shunsuke Saito
VLM
52
63
0
22 Aug 2024
Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets
Wolfgang Boettcher
Lukas Hoyer
Ozan Unal
J. E. Lenssen
Bernt Schiele
36
0
0
22 Aug 2024
Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification
Sudi Murindanyi
Joyce Nakatumba-Nabende
Rahman Sanya
Rose Nakibuule
Andrew Katumba
VLM
33
0
0
22 Aug 2024
Cross-Domain Foundation Model Adaptation: Pioneering Computer Vision Models for Geophysical Data Analysis
Zhixiang Guo
Xinming Wu
Luming Liang
Hanlin Sheng
Nuo Chen
Zhengfa Bi
AI4CE
59
1
0
22 Aug 2024
VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding
Yujie Liang
Xiaobin Hu
Boyuan Jiang
Donghao Luo
Kai Wu
Wenhui Han
Taisong Jin
Chengjie Wang
DiffM
39
2
0
22 Aug 2024
SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs
Yuanyang Yin
Yaqi Zhao
Yajie Zhang
Ke Lin
Jiahao Wang
Xin Tao
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
LRM
50
6
0
21 Aug 2024
E-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
Shangkun Sun
Xiaoyu Liang
S. Fan
Wenxu Gao
Wei-Nan Gao
DiffM
63
0
0
21 Aug 2024
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Xiuwei Xu
Huangxing Chen
Linqing Zhao
Ziwei Wang
Jie Zhou
Jiwen Lu
47
15
0
21 Aug 2024
Large Point-to-Gaussian Model for Image-to-3D Generation
Longfei Lu
Huachen Gao
Tao Dai
Yaohua Zha
Zhi Hou
Junta Wu
Shu-Tao Xia
3DGS
DiffM
45
4
0
20 Aug 2024
PooDLe: Pooled and dense self-supervised learning from naturalistic videos
Alex N. Wang
Christopher Hoang
Yuwen Xiong
Yann LeCun
Mengye Ren
78
0
0
20 Aug 2024
Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
Gen Li
Nikolaos Tsagkas
Jifei Song
Ruaridh Mon-Williams
S. Vijayakumar
Kun Shao
Laura Sevilla-Lara
43
8
0
19 Aug 2024
3D-Aware Instance Segmentation and Tracking in Egocentric Videos
Yash Bhalgat
Vadim Tschernezki
Iro Laina
João F. Henriques
Andrea Vedaldi
Andrew Zisserman
VOS
44
1
0
19 Aug 2024
Zero-Shot Object-Centric Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Anirudh Goyal
Mike Mozer
Yoshua Bengio
Georg Martius
Maximilian Seitzer
VLM
OCL
42
4
0
17 Aug 2024
Are CLIP features all you need for Universal Synthetic Image Origin Attribution?
Dario Cioni
Christos Tzelepis
Lorenzo Seidenari
Ioannis Patras
48
2
0
17 Aug 2024
Segment Anything with Multiple Modalities
Aoran Xiao
Weihao Xuan
Heli Qi
Yun Xing
Naoto Yokoya
Shijian Lu
VLM
38
7
0
17 Aug 2024
Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models
Lin Zhao
Xiao Chen
Eric Z. Chen
Yikang Liu
Terrence Chen
Shanhui Sun
VLM
57
5
0
16 Aug 2024
SpectralEarth: Training Hyperspectral Foundation Models at Scale
Nassim Ait Ali Braham
C. Albrecht
Julien Mairal
J. Chanussot
Yi Wang
X. Zhu
43
13
0
15 Aug 2024
Towards flexible perception with visual memory
Robert Geirhos
P. Jaini
Austin Stone
Sourabh Medapati
Xi Yi
G. Toderici
Abhijit Ogale
Jonathon Shlens
44
1
0
15 Aug 2024
Navigating Data Scarcity using Foundation Models: A Benchmark of Few-Shot and Zero-Shot Learning Approaches in Medical Imaging
S. Woerner
Christian F. Baumgartner
VLM
MedIm
35
0
0
15 Aug 2024
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
Chenjie Cao
Chaohui Yu
Yanwei Fu
Fan Wang
Xiangyang Xue
VGen
55
7
0
15 Aug 2024
General-purpose Clothes Manipulation with Semantic Keypoints
Yuhong Deng
David Hsu
64
2
0
15 Aug 2024
Connecting Dreams with Visual Brainstorming Instruction
Yasheng Sun
Bohan Li
Mingchen Zhuge
Deng-Ping Fan
Salman Khan
Fahad Shahbaz Khan
Hideki Koike
DiffM
44
0
0
14 Aug 2024
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields
Yu Liu
Baoxiong Jia
Yixin Chen
Siyuan Huang
OCL
50
4
0
13 Aug 2024
Towards Robust Monocular Depth Estimation in Non-Lambertian Surfaces
Junrui Zhang
Jiaqi Li
Yachuan Huang
Yiran Wang
Jinghong Zheng
Liao Shen
Z. Cao
MDE
39
3
0
12 Aug 2024
BooW-VTON: Boosting In-the-Wild Virtual Try-On via Mask-Free Pseudo Data Training
Xuanpu Zhang
Dan Song
Pengxin Zhan
Qingguo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
Anan Liu
DiffM
43
4
0
12 Aug 2024
BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation
Hee Suk Yoon
Eunseop Yoon
Joshua Tian Jin Tee
Kang Zhang
Yu-Jung Heo
Du-Seong Chang
Chang D. Yoo
41
3
0
12 Aug 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Yifan Pu
Zhuofan Xia
Jiayi Guo
Dongchen Han
Qixiu Li
...
Ji Li
Yizeng Han
Shiji Song
Gao Huang
Xiu Li
69
12
0
11 Aug 2024
PS-TTL: Prototype-based Soft-labels and Test-Time Learning for Few-shot Object Detection
Yingjie Gao
Yanan Zhang
Ziyue Huang
Nanqing Liu
Di Huang
ObjD
53
1
0
11 Aug 2024
UNIC: Universal Classification Models via Multi-teacher Distillation
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
47
7
0
09 Aug 2024
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
Dahyun Kang
Minsu Cho
ObjD
VLM
45
9
0
09 Aug 2024
Depth Any Canopy: Leveraging Depth Foundation Models for Canopy Height Estimation
Daniele Rege Cambrin
Isaac Corley
Paolo Garza
34
2
0
08 Aug 2024
SegXAL: Explainable Active Learning for Semantic Segmentation in Driving Scene Scenarios
Sriram Mandalika
Athira Nambiar
35
1
0
08 Aug 2024
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling
Zilyu Ye
Yu Lei
Ruotian Peng
Jinjin Cao
Zhiyang Chen
...
Mingyuan Zhou
Xiaoqian Shen
Mohamed Elhoseiny
Nan Zhuang
Guo-Jun Qi
VGen
VLM
42
1
0
07 Aug 2024
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis
Zebin Yao
Fangxiang Feng
Ruifan Li
Xiaojie Wang
DiffM
44
1
0
07 Aug 2024
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
Pavel Suma
Giorgos Kordopatis-Zilos
Ahmet Iscen
Giorgos Tolias
VLM
50
3
0
06 Aug 2024
Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment
Shijie Lian
Hua Li
VLM
43
5
0
06 Aug 2024
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
43
1
0
05 Aug 2024