2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,220 papers shown
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
46
0
0
04 Oct 2024
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai
Enxin Song
Y. Du
Chenlin Meng
Vashisht Madhavan
Omer Bar-Tal
Jeng-Neng Hwang
Saining Xie
Christopher D. Manning
3DV
89
26
0
04 Oct 2024
Optimization Proxies using Limited Labeled Data and Training Time -- A Semi-Supervised Bayesian Neural Network Approach
Parikshit Pareek
K. Sundar
Deepjyoti Deka
Sidhant Misra
57
0
0
04 Oct 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng
Yangchao Wu
Hyoungseob Park
Daniel Wang
Fengyu Yang
Stefano Soatto
Dong Lao
Byung-Woo Hong
Alex Wong
MDE
39
7
0
03 Oct 2024
Learning from Offline Foundation Features with Tensor Augmentations
Emir Konuk
Christos Matsoukas
Moein Sorkhei
Phitchapha Lertsiravaramet
Kevin Smith
OffRL
26
1
0
03 Oct 2024
Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation
Muzhi Zhu
Yang Liu
Zekai Luo
Chenchen Jing
Hao Chen
Guangkai Xu
Xinlong Wang
Chunhua Shen
DiffM
VLM
36
3
0
03 Oct 2024
CaLMFlow: Volterra Flow Matching using Causal Language Models
Shiyang Zhang
Daniel Levine
Ivan Vrkic
Marco Francesco Bressana
David Zhang
S. Rizvi
Yangtian Zhang
E. Zappala
David van Dijk
27
0
0
03 Oct 2024
DecTrain: Deciding When to Train a Monocular Depth DNN Online
Zih-Sing Fu
Soumya Sudhakar
S. Karaman
Vivienne Sze
46
0
0
03 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
143
15
0
03 Oct 2024
Semi-Supervised Fine-Tuning of Vision Foundation Models with Content-Style Decomposition
M. Drozdova
Vitaliy Kinakh
Yury Belousov
E. Lastufka
Slava Voloshynovskiy
47
3
0
02 Oct 2024
FeelAnyForce: Estimating Contact Force Feedback from Tactile Sensation for Vision-Based Tactile Sensors
A. Shahidzadeh
G. Caddeo
Koushik Alapati
Lorenzo Natale
Cornelia Fermuller
Yiannis Aloimonos
30
2
0
02 Oct 2024
Topological mapping for traversability-aware long-range navigation in off-road terrain
J. Tremblay
Julie Alhosh
Louis Petit
F. Lotfi
Lara Landauro
David Meger
34
0
0
02 Oct 2024
Towards a vision foundation model for comprehensive assessment of Cardiac MRI
Athira J. Jacob
Indraneel Borgohain
T. Chitiboi
Puneet Sharma
Dorin Comaniciu
Daniel Rueckert
MedIm
42
4
0
02 Oct 2024
Bayes' Power for Explaining In-Context Learning Generalizations
Samuel G. Müller
Noah Hollmann
Frank Hutter
BDL
44
1
0
02 Oct 2024
Toward a Holistic Evaluation of Robustness in CLIP Models
Weijie Tu
Weijian Deng
Tom Gedeon
VLM
46
5
0
02 Oct 2024
Forte: Finding Outliers with Representation Typicality Estimation
Debargha Ganguly
Warren Morningstar
A. Yu
Vipin Chaudhary
OODD
52
0
0
02 Oct 2024
OmniSR: Shadow Removal under Direct and Indirect Lighting
Jiamin Xu
Zelong Li
Yuxin Zheng
Chenyu Huang
Renshu Gu
Weiwei Xu
Gang Xu
3DV
53
1
0
02 Oct 2024
Towards Full-parameter and Parameter-efficient Self-learning For Endoscopic Camera Depth Estimation
Shuting Zhao
Chenkang Du
Kristin Qi
Xinrong Chen
Xinhan Di
MDE
42
0
0
01 Oct 2024
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation
Junlin Han
Jianyuan Wang
Andrea Vedaldi
Philip Torr
Filippos Kokkinos
38
4
0
01 Oct 2024
Evaluating Deep Regression Models for WSI-Based Gene-Expression Prediction
Fredrik K. Gustafsson
Mattias Rantalainen
37
0
0
01 Oct 2024
Arges: Spatio-Temporal Transformer for Ulcerative Colitis Severity Assessment in Endoscopy Videos
Krishna Chaitanya
Pablo F. Damasceno
Shreyas Fadnavis
Pooya Mobadersany
Chaitanya Parmar
...
Lindsey Surace
Louis R. Ghanem
Oana Gabriela Cula
Tommaso Mansi
K. Standish
26
0
0
01 Oct 2024
iTeach: Interactive Teaching for Robot Perception using Mixed Reality
Jishnu Jaykumar P
Cole Salvato
Vinaya Bomnale
Jikai Wang
Yu Xiang
55
0
0
01 Oct 2024
OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization
Saihui Hou
Panjian Huang
Zengbin Wang
Yang Liu
Zeyu Li
Man Zhang
Yongzhen Huang
38
0
0
30 Sep 2024
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
Lirui Wang
Xinlei Chen
Jialiang Zhao
Kaiming He
41
34
0
30 Sep 2024
World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering
Jiacong Wang
Bohong Wu
Haiyong Jiang
Xun Zhou
Xin Xiao
Haoyuan Guo
Jun Xiao
VLM
VGen
51
4
0
30 Sep 2024
Loose Social-Interaction Recognition in Real-world Therapy Scenarios
Abid Ali
Rui Dai
Ashish Marisetty
Guillaume Astruc
Monique Thonnat
J. Odobez
Susanne Thümmler
Francois Bremond
46
1
0
30 Sep 2024
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models
Qiaojun Yu
Siyuan Huang
Xibin Yuan
Zhengkai Jiang
Ce Hao
...
Junbo Wang
Liu Liu
Hongsheng Li
Peng Gao
Cewu Lu
78
3
0
30 Sep 2024
Towards Effective Utilization of Mixed-Quality Demonstrations in Robotic Manipulation via Segment-Level Selection and Optimization
Jingjing Chen
Hongjie Fang
Hao-Shu Fang
Cewu Lu
45
2
0
30 Sep 2024
Exploring Social Media Image Categorization Using Large Models with Different Adaptation Methods: A Case Study on Cultural Nature's Contributions to People
Rohaifa Khaldi
Domingo Alcaraz-Segura
Ignacio Sánchez-Herrera
Javier Martinez-Lopez
Carlos Javier Navarro
Siham Tabik
VLM
18
1
0
30 Sep 2024
Lessons Learned from Developing a Human-Centered Guide Dog Robot for Mobility Assistance
Hochul Hwang
Ken Suzuki
Nicholas A. Giudice
Joydeep Biswas
S. I. Lee
Donghyun Kim
26
0
0
29 Sep 2024
Flipped Classroom: Aligning Teacher Attention with Student in Generalized Category Discovery
Haonan Lin
Wenbin An
Jiahao Wang
Yan Chen
Feng Tian
Mengmeng Wang
Guang Dai
Qianying Wang
Jingdong Wang
49
2
0
29 Sep 2024
OptiGrasp: Optimized Grasp Pose Detection Using RGB Images for Warehouse Picking Robots
Soofiyan Atar
Yi Li
Markus Grotz
Michael Wolf
Dieter Fox
Joshua Smith
45
1
0
29 Sep 2024
KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation
Soofiyan Atar
Yuheng Zhi
Florian Richter
Michael C. Yip
MDE
44
0
0
29 Sep 2024
G3R: Gradient Guided Generalizable Reconstruction
Yun Chen
Jingkang Wang
Ze Yang
S. Manivasagam
R. Urtasun
45
9
0
28 Sep 2024
VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition
Ahmad Khaliq
Ming Xu
Stephen Hausler
Michael Milford
Sourav Garg
CoGe
39
3
0
28 Sep 2024
A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping
Houjian Yu
Mingen Li
Alireza Rezazadeh
Yang Yang
Changhyun Choi
57
1
0
28 Sep 2024
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan Li
Gyungin Shin
37
3
0
27 Sep 2024
Improving Visual Object Tracking through Visual Prompting
Shih-Fang Chen
Jun-Cheng Chen
I-Hong Jhuo
Yen-Yu Lin
VLM
41
1
0
27 Sep 2024
How Effective is Pre-training of Large Masked Autoencoders for Downstream Earth Observation Tasks?
Jose Sosa
Mohamed Aloulou
Danila Rukhovich
Rim Sleimi
Boonyarit Changaival
Anis Kacem
Djamila Aouada
45
1
0
27 Sep 2024
Evaluation of Security of ML-based Watermarking: Copy and Removal Attacks
Vitaliy Kinakh
Brian Pulfer
Yury Belousov
Pierre Fernandez
Teddy Furon
Slava Voloshynovskiy
24
2
0
26 Sep 2024
SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation
Xin Li
Siyuan Huang
Qiaojun Yu
Zhengkai Jiang
Ce Hao
Yimeng Zhu
Hongsheng Li
Peng Gao
Cewu Lu
47
0
0
26 Sep 2024
FreeEdit: Mask-free Reference-based Image Editing with Multi-modal Instruction
Runze He
Kai Ma
Linjiang Huang
Shaofei Huang
Jialin Gao
Xiaoming Wei
Jiao Dai
Jizhong Han
Si Liu
DiffM
52
8
0
26 Sep 2024
Revisit Anything: Visual Place Recognition via Image Segment Retrieval
Kartik Garg
Sai Shubodh Puligilla
Shishir Kolathaya
Madhava Krishna
Sourav Garg
46
4
0
26 Sep 2024
DARE: Diverse Visual Question Answering with Robustness Evaluation
Hannah Sterz
Jonas Pfeiffer
Ivan Vulić
OOD
VLM
31
2
0
26 Sep 2024
AnyLogo: Symbiotic Subject-Driven Diffusion System with Gemini Status
Jinghao Zhang
Wen Qian
Hao Luo
Fan Wang
Feng Zhao
DiffM
34
0
0
26 Sep 2024
Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval
Mankeerat Sidhu
Hetarth Chopra
Ansel Blume
Jeonghwan Kim
Revanth Gangi Reddy
Heng Ji
ObjD
VLM
36
0
0
26 Sep 2024
SECURE: Semantics-aware Embodied Conversation under Unawareness for Lifelong Robot Learning
Rimvydas Rubavicius
Peter David Fagan
A. Lascarides
Subramanian Ramamoorthy
LM&Ro
238
0
0
26 Sep 2024
Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography
Yuexi Du
John Onofrey
Nicha Dvornek
VLM
60
1
0
26 Sep 2024
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Kai Chen
Yunhao Gou
Runhui Huang
Zhili Liu
Daxin Tan
...
Qun Liu
Jun Yao
Lu Hou
Hang Xu
AuLLM
MLLM
VLM
82
23
0
26 Sep 2024
Canonical Representation and Force-Based Pretraining of 3D Tactile for Dexterous Visuo-Tactile Policy Learning
Tianhao Wu
Jinzhou Li
Jiyao Zhang
Mingdong Wu
Hao Dong
SSL
44
6
0
26 Sep 2024