DINOv2: Learning Robust Visual Features without Supervision


14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
    VLM
    CLIP
    SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 2,220 papers shown
On Pretraining Data Diversity for Self-Supervised Learning
Hasan Hammoud
Tuhin Das
Fabio Pizzati
Philip Torr
Adel Bibi
Guohao Li
103
2
0
20 Mar 2024
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Giannis Daras
Alexandros G. Dimakis
Constantinos Daskalakis
49
23
0
20 Mar 2024
Improved Baselines for Data-efficient Perceptual Augmentation of LLMs
Théophane Vallaeys
Mustafa Shukor
Matthieu Cord
Jakob Verbeek
59
12
0
20 Mar 2024
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
Jing Zhang
Irving Fang
Juexiao Zhang
Hao Wu
Akshat Kaushik
Alice Rodriguez
Hanwen Zhao
Zhuo Zheng
Radu Iovita
Chen Feng
30
3
0
19 Mar 2024
Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos
Hadi Alzayer
Zhihao Xia
Xuaner Zhang
Eli Shechtman
Jia-Bin Huang
Michael Gharbi
DiffM
VGen
37
19
0
19 Mar 2024
Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment
Mengting Chen
Xi Chen
Zhonghua Zhai
Chen Ju
Xuewen Hong
Jinsong Lan
Shuai Xiao
OOD
DiffM
53
21
0
19 Mar 2024
When Do We Not Need Larger Vision Models?
Baifeng Shi
Ziyang Wu
Maolin Mao
Xin Wang
Trevor Darrell
VLM
LRM
59
42
0
19 Mar 2024
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
Yihong Luo
Xiaolong Chen
Xinghua Qu
Jing Tang
61
6
0
19 Mar 2024
ViTGaze: Gaze Following with Interaction Features in Vision Transformers
Yuehao Song
Xinggang Wang
Jingfeng Yao
Wenyu Liu
Jinglin Zhang
Xiangmin Xu
ViT
54
2
0
19 Mar 2024
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
Filip Ilic
Henghui Zhao
Thomas Pock
Richard P. Wildes
PICV
AAML
44
2
0
19 Mar 2024
Learning Cross-view Visual Geo-localization without Ground Truth
Haoyuan Li
Chang Xu
Wen Yang
Huai Yu
Gui-Song Xia
48
9
0
19 Mar 2024
IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model
M. Bortolon
T. Tsesmelis
Stuart James
Fabio Poiesi
Alessio Del Bue
35
5
0
19 Mar 2024
OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
Junhao Cai
Yisheng He
Weihao Yuan
Siyu Zhu
Zilong Dong
Liefeng Bo
Qifeng Chen
DiffM
40
8
0
19 Mar 2024
Fusion Transformer with Object Mask Guidance for Image Forgery Analysis
Dimitrios Karageorgiou
Giorgos Kordopatis-Zilos
Symeon Papadopoulos
ViT
28
5
0
18 Mar 2024
Zero-Shot Image Feature Consensus with Deep Functional Maps
Xinle Cheng
Congyue Deng
Adam W. Harley
Yixin Zhu
Leonidas J. Guibas
49
2
0
18 Mar 2024
VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models
Junlin Han
Filippos Kokkinos
Philip Torr
VGen
83
40
0
18 Mar 2024
LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation
Yushi Lan
Fangzhou Hong
Shuai Yang
Shangchen Zhou
Xuyi Meng
Bo Dai
Xingang Pan
Chen Change Loy
42
39
0
18 Mar 2024
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Axel Sauer
Frederic Boesel
Tim Dockhorn
A. Blattmann
Patrick Esser
Robin Rombach
DiffM
55
109
0
18 Mar 2024
SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion
Vikram S. Voleti
Chun-Han Yao
Mark Boss
Adam Letts
David Pankratz
Dmitry Tochilkin
Christian Laforte
Robin Rombach
Varun Jampani
DiffM
VGen
49
170
0
18 Mar 2024
BEVCar: Camera-Radar Fusion for BEV Map and Object Segmentation
Jonas Schramm
Niclas Vodisch
Kürsat Petek
Ravi Kiran
S. Yogamani
Wolfram Burgard
Abhinav Valada
42
12
0
18 Mar 2024
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
Ruyi Xu
Yuan Yao
Zonghao Guo
Junbo Cui
Zanlin Ni
Chunjiang Ge
Tat-Seng Chua
Zhiyuan Liu
Maosong Sun
Gao Huang
VLM
MLLM
37
105
0
18 Mar 2024
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models
Lisa Weijler
Muhammad Jehanzeb Mirza
Leon Sick
Can Ekkazan
Pedro Hermosilla
TTA
46
0
0
18 Mar 2024
Better (pseudo-)labels for semi-supervised instance segmentation
François Porcher
Camille Couprie
Marc Szafraniec
Jakob Verbeek
ISeg
32
0
0
18 Mar 2024
End-to-end multi-modal product matching in fashion e-commerce
Sándor Tóth
Stephen Wilson
Alexia Tsoukara
Enric Moreu
Anton Masalovich
Lars Roemheld
35
0
0
18 Mar 2024
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan
Xiaojian Ma
Rujie Wu
Yuntao Du
Jiaqi Li
Zhi Gao
Qing Li
VLM
LLMAG
51
57
0
18 Mar 2024
Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias
Wenyu Zhang
Qingmu Liu
Felix Ong Wei Cong
Mohamed Ragab
Chuan-Sheng Foo
37
0
0
17 Mar 2024
TAG: Guidance-free Open-Vocabulary Semantic Segmentation
Yasufumi Kawano
Yoshimitsu Aoki
VLM
30
2
0
17 Mar 2024
MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation
Yasufumi Kawano
Yoshimitsu Aoki
DiffM
37
5
0
17 Mar 2024
3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models
Yongtao Ge
Wenjia Wang
Yongfan Chen
Hao Chen
Chunhua Shen
3DH
45
8
0
17 Mar 2024
StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining
Tushar Kataria
Beatrice Knudsen
Shireen Y. Elhabian
DiffM
MedIm
37
9
0
17 Mar 2024
Frozen Feature Augmentation for Few-Shot Image Classification
Andreas Bär
N. Houlsby
Mostafa Dehghani
Manoj Kumar
VLM
39
4
0
15 Mar 2024
Autonomous Monitoring of Pharmaceutical R&D Laboratories with 6 Axis Arm Equipped Quadruped Robot and Generative AI: A Preliminary Study
Shunichi Hato
Nozomi Ogawa
36
1
0
15 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
45
5
0
14 Mar 2024
BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects
Tomás Hodan
M. Sundermeyer
Yann Labbé
Van Nguyen Nguyen
Gu Wang
Eric Brachmann
Bertram Drost
Vincent Lepetit
Carsten Rother
Jiri Matas
3DPC
37
48
0
14 Mar 2024
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding
Chengyao Wang
Li Jiang
Xiaoyang Wu
Zhuotao Tian
Bohao Peng
Hengshuang Zhao
Jiaya Jia
3DPC
SSL
83
15
0
14 Mar 2024
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
Zeyu Liu
Weicong Liang
Zhanhao Liang
Chong Luo
Ji Li
Gao Huang
Yuhui Yuan
DiffM
72
26
0
14 Mar 2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Brandon McKinzie
Zhe Gan
J. Fauconnier
Sam Dodge
Bowen Zhang
...
Zirui Wang
Ruoming Pang
Peter Grasch
Alexander Toshev
Yinfei Yang
MLLM
43
189
0
14 Mar 2024
Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing
Wonjun Kang
Kevin Galim
Hyung Il Koo
DiffM
39
5
0
14 Mar 2024
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
Frank Zhang
Yibo Zhang
Quan Zheng
R. Ma
W. Hua
Hujun Bao
Weiwei Xu
Changqing Zou
64
10
0
14 Mar 2024
Video Editing via Factorized Diffusion Distillation
Uriel Singer
Amit Zohar
Yuval Kirstain
Shelly Sheynin
Adam Polyak
Devi Parikh
Yaniv Taigman
DiffM
VGen
51
12
0
14 Mar 2024
Annotation Free Semantic Segmentation with Vision Foundation Models
Soroush Seifi
Daniel Olmeda Reino
Fabien Despinoy
Rahaf Aljundi
VLM
39
1
0
14 Mar 2024
VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Chris Kelly
Luhui Hu
Bang Yang
Yu Tian
Deshun Yang
Cindy Yang
Zaoshan Huang
Zihao Li
Jiayin Hu
Yuexian Zou
47
9
0
14 Mar 2024
VDNA-PR: Using General Dataset Representations for Robust Sequential Visual Place Recognition
Benjamin Ramtoula
Daniele De Martini
Matthew Gadd
Paul Newman
34
0
0
14 Mar 2024
CART: Caltech Aerial RGB-Thermal Dataset in the Wild
Connor T. Lee
Matthew O. Anderson
Nikhil Raganathan
Xingxing Zuo
Kevin Do
Georgia Gkioxari
Soon-Jo Chung
45
7
0
13 Mar 2024
VANP: Learning Where to See for Navigation with Self-Supervised Vision-Action Pre-Training
Mohammad Nazeri
Junzhe Wang
Amirreza Payandeh
Xuesu Xiao
SSL
ViT
52
6
0
12 Mar 2024
CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers
Shahaf Arica
Or Rubin
Sapir Gershov
S. Laufer
37
6
0
12 Mar 2024
Learning Generalizable Feature Fields for Mobile Manipulation
Ri-Zhao Qiu
Yafei Hu
Ge Yang
Yuchen Song
Yang Fu
...
Jiteng Mu
Ruihan Yang
Nikolay Atanasov
Sebastian Scherer
Xiaolong Wang
45
27
0
12 Mar 2024
SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM
Siting Zhu
Renjie Qin
Guangming Wang
Jiuming Liu
Hesheng Wang
43
28
0
12 Mar 2024
TFCounter: Polishing Gems for Training-Free Object Counting
Pan Ting
Jianfeng Lin
Wenhao Yu
Wenlong Zhang
Xiaoying Chen
Jinlu Zhang
Binqiang Huang
37
0
0
12 Mar 2024
DragAnything: Motion Control for Anything using Entity Representation
Weijia Wu
Zhuang Li
Yuchao Gu
Rui Zhao
Yefei He
David Junhao Zhang
Mike Zheng Shou
Yan Li
Yan Li
Di Zhang
VGen
93
51
0
12 Mar 2024