ResearchTrend.AI
DINOv2: Learning Robust Visual Features without Supervision


14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
    VLM
    CLIP
    SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 2,249 papers shown
4K4DGen: Panoramic 4D Generation at 4K Resolution
Renjie Li
Panwang Pan
Bangbang Yang
Dejia Xu
Shijie Zhou
Xuanyang Zhang
Zeming Li
A. Kadambi
Zhangyang Wang
Zhiwen Fan
VGen
68
17
0
19 Jun 2024
Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation
Bjorn Nieth
Thomas Altstidl
Leo Schwinn
Björn Eskofier
AAML
55
2
0
19 Jun 2024
ChangeViT: Unleashing Plain Vision Transformers for Change Detection
Duowang Zhu
Xiaohu Huang
Haiyan Huang
Zhenfeng Shao
Q. Cheng
59
8
0
18 Jun 2024
GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models
Yongtao Ge
Guangkai Xu
Zhiyue Zhao
Libo Sun
Zheng Huang
Yanlong Sun
Hao Chen
Chunhua Shen
MDE
42
3
0
18 Jun 2024
Cycle-Correspondence Loss: Learning Dense View-Invariant Visual Features from Unlabeled and Unordered RGB Images
David B. Adrian
A. Kupcsik
Markus Spies
Heiko Neumann
SSL
39
0
0
18 Jun 2024
The Wisdom of a Crowd of Brains: A Universal Brain Encoder
Roman Beliy
Navve Wasserman
Amit Zalcher
Michal Irani
45
2
0
18 Jun 2024
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features
Letian Wang
Seung Wook Kim
Jiawei Yang
Cunjun Yu
Boris Ivanovic
Steven Waslander
Yue Wang
Sanja Fidler
Marco Pavone
Peter Karkus
48
8
0
17 Jun 2024
Tracking the perspectives of interacting language models
Hayden Helm
Brandon Duderstadt
Youngser Park
Carey E. Priebe
80
6
0
17 Jun 2024
Cross-domain Open-world Discovery
Shuo Wen
Maria Brbic
OOD
73
3
0
17 Jun 2024
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Han-Hung Lee
Yiming Zhang
Angel X. Chang
3DPC
71
4
0
17 Jun 2024
Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex
Spandan Madan
Will Xiao
Mingran Cao
Hanspeter Pfister
Margaret Livingstone
Gabriel Kreiman
OOD
79
4
0
16 Jun 2024
ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts
Samar Khanna
Medhanie Irgau
David B. Lobell
Stefano Ermon
VLM
55
4
0
16 Jun 2024
SemanticMIM: Marring Masked Image Modeling with Semantics Compression for General Visual Representation
Yike Yuan
Huanzhang Dou
Fengjun Guo
Xi Li
46
2
0
15 Jun 2024
Enhancing Anomaly Detection Generalization through Knowledge Exposure: The Dual Effects of Augmentation
Mohammad Akhavan Anvari
Rojina Kashefi
Vahid Reza Khazaie
Mohammad Khalooei
Mohammad Sabokrou
68
0
0
15 Jun 2024
NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows
Z-H. Tang
Zhongzheng Ren
Xiaoming Zhao
Bowen Wen
Jonathan Tremblay
Stan Birchfield
Alexander Schwing
32
2
0
15 Jun 2024
Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis
Bowen Zhang
Ying Chen
Long Bai
Yan Zhao
Yuxiang Sun
Yixuan Yuan
Jianhua Zhang
Hongliang Ren
67
5
0
15 Jun 2024
The BabyView dataset: High-resolution egocentric videos of infants' and young children's everyday experiences
Bria Long
Violet Xiang
Stefan Stojanov
Robert Z. Sparks
Zi Yin
...
Steven Y. Feng
Chengxu Zhuang
V. Marchman
Daniel L. K. Yamins
Michael C. Frank
VGen
EgoV
58
2
0
14 Jun 2024
EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models
Julian Straub
Daniel DeTone
Tianwei Shen
Nan Yang
Chris Sweeney
Richard Newcombe
EgoV
48
6
0
14 Jun 2024
Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation
B. B. Englert
Fabrizio J. Piva
Tommie Kerssies
Daan de Geus
Gijs Dubbelman
68
10
0
14 Jun 2024
Grounding Image Matching in 3D with MASt3R
Vincent Leroy
Yohann Cabon
Jérôme Revaud
3DGS
3DV
65
131
0
14 Jun 2024
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
Wufei Ma
Guanning Zeng
Guofeng Zhang
Qihao Liu
Letian Zhang
Adam Kortylewski
Yaoyao Liu
Alan Yuille
VLM
3DV
54
7
0
13 Jun 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
DiffM
VLM
MDE
70
348
0
13 Jun 2024
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Roman Bachmann
Oğuzhan Fatih Kar
David Mizrahi
Ali Garjani
Mingfei Gao
David Griffiths
Jiaming Hu
Afshin Dehghan
Amir Zamir
MoE
VLM
MLLM
56
14
0
13 Jun 2024
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
Yiming Li
Zhiheng Li
Nuo Chen
Moonjun Gong
Zonglin Lyu
Zehong Wang
Peili Jiang
Chen Feng
52
10
0
13 Jun 2024
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Kaizhi Zheng
Nanxuan Zhao
Jiuxiang Gu
Zichao Wang
Xin Eric Wang
Tong Sun
DiffM
40
2
0
13 Jun 2024
Parameter-Efficient Active Learning for Foundational models
Athmanarayanan Lakshmi Narayanan
R. Krishnan
Amrutha Machireddy
Mahesh Subedar
VLM
45
0
0
13 Jun 2024
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
...
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro
VLM
60
402
0
13 Jun 2024
WildlifeReID-10k: Wildlife re-identification dataset with 10k individual animals
L. Adam
Vojtěch Čermák
Kostas Papafitsoros
Lukás Picek
52
2
0
13 Jun 2024
APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation
Weizhao He
Zhiyuan Liu
Wei Zhuo
Linlin Shen
Jiaqi Yang
Songhe Deng
Liang Sun
VLM
51
8
0
12 Jun 2024
UDON: Universal Dynamic Online distillatioN for generic image representations
Nikolaos-Antonios Ypsilantis
Kaifeng Chen
André Araujo
Ondřej Chum
48
3
0
12 Jun 2024
A$^{2}$-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder
Lixian Zhang
Yi Zhao
Runmin Dong
Jinxiao Zhang
Shuai Yuan
...
Weijia Li
Wei Liu
Wayne Zhang
Xue Jiang
Haohuan Fu
56
4
0
12 Jun 2024
Scaling Manipulation Learning with Visual Kinematic Chain Prediction
Xinyu Zhang
Yuhan Liu
Haonan Chang
Abdeslam Boularias
66
1
0
12 Jun 2024
A3VLM: Actionable Articulation-Aware Vision Language Model
Siyuan Huang
Haonan Chang
Yuhan Liu
Yimeng Zhu
Hao Dong
Peng Gao
Abdeslam Boularias
Hongsheng Li
52
10
0
11 Jun 2024
Zero-shot Image Editing with Reference Imitation
Xi Chen
Yutong Feng
Mengting Chen
Yiyang Wang
Shilong Zhang
Yu Liu
Yujun Shen
Hengshuang Zhao
DiffM
44
21
0
11 Jun 2024
BAKU: An Efficient Transformer for Multi-Task Policy Learning
Siddhant Haldar
Zhuoran Peng
Lerrel Pinto
OffRL
53
32
0
11 Jun 2024
Active Scout: Multi-Target Tracking Using Neural Radiance Fields in Dense Urban Environments
Christopher D. Hsu
Pratik Chaudhari
77
1
0
11 Jun 2024
Let Go of Your Labels with Unsupervised Transfer
Artyom Gadetsky
Yulun Jiang
Maria Brbić
VLM
54
6
0
11 Jun 2024
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: Progress and Challenges
Ping Liu
Qiqi Tao
Joey Tianyi Zhou
71
0
0
11 Jun 2024
Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph
S. Linok
T. Zemskova
Svetlana Ladanova
Roman Titkov
Dmitry A. Yudin
Maxim Monastyrny
Aleksei Valenkov
LM&Ro
77
0
0
11 Jun 2024
Adapters Strike Back
Jan-Martin O. Steitz
Stefan Roth
46
6
0
10 Jun 2024
HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction
Jikai Wang
Qifan Zhang
Yu-Wei Chao
Bowen Wen
Xiaohu Guo
Yu Xiang
3DH
68
2
0
10 Jun 2024
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language
Mark Hamilton
Andrew Zisserman
John R. Hershey
William T. Freeman
VLM
67
5
0
09 Jun 2024
Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models
Minho Park
S. Park
Jooyeol Yun
Jaegul Choo
VLM
48
0
0
08 Jun 2024
USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation
Xiaoqi Wang
Wenbin He
Xiwei Xuan
Clint Sebastian
Jorge Henrique Piazentin Ono
...
Sima Behpour
T. Doan
Liang Gou
Han-Wei Shen
Liu Ren
VLM
45
6
0
07 Jun 2024
Multiplane Prior Guided Few-Shot Aerial Scene Rendering
Zihan Gao
Licheng Jiao
Lingling Li
Xu Liu
Fan Liu
Puhua Chen
Yuwei Guo
62
3
0
07 Jun 2024
Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment
Venkanna Babu Guthula
Stefan Oehmcke
Remigio Chilaule
Hui Zhang
Nico Lang
A. Kariryaa
Johan Mottelson
Christian Igel
36
0
0
07 Jun 2024
Labeled Data Selection for Category Discovery
Bingchen Zhao
Nico Lang
Serge Belongie
Oisin Mac Aodha
56
3
0
07 Jun 2024
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
Xingkui Zhu
Yiran Guan
Dingkang Liang
Yuchao Chen
Yuliang Liu
Xiang Bai
MoE
53
5
0
07 Jun 2024
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
Dongkai Wang
Shiyu Xuan
Shiliang Zhang
LRM
50
6
0
07 Jun 2024
MAIRA-2: Grounded Radiology Report Generation
Shruthi Bannur
Kenza Bouzid
Daniel Coelho De Castro
Anton Schwaighofer
Sam Bond-Taylor
...
Anja Thieme
M. Lungren
Maria T. A. Wetscherek
Javier Alvarez-Valle
Stephanie L. Hyland
47
38
0
06 Jun 2024