2505.02704
Cited By
v1
v2 (latest)
VGLD: Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery
5 May 2025
Bojin Wu
Jing Chen
MDE
Papers citing
"VGLD: Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery"
28 / 28 papers shown
Title
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
Ziyang Song
Zerong Wang
Bo Li
Haoyang Zhang
Ruijie Zhu
Li Liu
Peng-Tao Jiang
Tianzhu Zhang
DiffM
51
4
0
05 Jan 2025
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Hao Chen
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
261
134
0
22 Mar 2024
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
Xiao Fu
Wei Yin
Mu Hu
Kaixuan Wang
Yuexin Ma
Ping Tan
Shaojie Shen
Dahua Lin
Xiaoxiao Long
DiffM
103
122
0
18 Mar 2024
CLIP Can Understand Depth
Dunam Kim
Seokju Lee
VLM
MDE
110
2
0
05 Feb 2024
EVP: Enhanced Visual Perception using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment
M. Lavrenyuk
Shariq Farooq Bhat
Matthias Müller
Peter Wonka
ObjD
MDE
63
9
0
13 Dec 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
239
230
0
03 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
426
4,563
0
30 Jan 2023
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
Jia Ning
Chen Li
Zheng Zhang
Zigang Geng
Qi Dai
Kun He
Han Hu
101
46
0
05 Jan 2023
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning
Xiangyang Zhu
Renrui Zhang
Bowei He
Ziyu Guo
Ziyao Zeng
Zipeng Qin
Shanghang Zhang
Peng Gao
VLM
70
145
0
21 Nov 2022
Can Language Understand Depth?
Renrui Zhang
Ziyao Zeng
Ziyu Guo
Yafeng Li
VLM
MDE
79
73
0
03 Jul 2022
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving
Markus Schön
M. Buchholz
Klaus C. J. Dietmayer
3DPC
3DGS
57
42
0
27 Jun 2022
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation
Zhenyu Li
Xuyang Wang
Xianming Liu
Junjun Jiang
MDE
82
194
0
03 Apr 2022
LocalBins: Improving Depth Estimation by Learning Local Distributions
S. Bhat
Ibraheem Alhashim
Peter Wonka
MDE
59
101
0
28 Mar 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
153
1,627
0
23 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLM
BDL
VLM
CLIP
542
4,360
0
28 Jan 2022
DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes
Jaehoon Cho
Dongbo Min
Youngjung Kim
Kwanghoon Sohn
3DV
99
43
0
22 Oct 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
299
1,042
0
09 Oct 2021
Adaptive Surface Normal Constraint for Depth Estimation
Xiaoxiao Long
Cheng Lin
Lingjie Liu
Wei Li
Christian Theobalt
Ruigang Yang
Wenping Wang
3DV
79
62
0
29 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
138
1,734
0
24 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
931
29,436
0
26 Feb 2021
AdaBins: Depth Estimation using Adaptive Bins
S. Bhat
Ibraheem Alhashim
Peter Wonka
3DV
MDE
ViT
120
858
0
28 Nov 2020
Targeted Adversarial Perturbations for Monocular Depth Prediction
A. Wong
Safa Cicek
Stefano Soatto
AAML
MDE
53
44
0
12 Jun 2020
DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data
Wei Yin
Xinlong Wang
Chunhua Shen
Yifan Liu
Zhi Tian
Songcen Xu
Changming Sun
Dou Renyin
3DH
MDE
100
70
0
03 Feb 2020
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
René Ranftl
Katrin Lasinger
David Hafner
Konrad Schindler
V. Koltun
MDE
204
1,793
0
02 Jul 2019
3D Packing for Self-Supervised Monocular Depth Estimation
Vitor Campagnolo Guizilini
Rares Andrei Ambrus
Sudeep Pillai
Allan Raventos
Adrien Gaidon
SSL
3DPC
MDE
79
648
0
06 May 2019
Deep Ordinal Regression Network for Monocular Depth Estimation
Huan Fu
Mingming Gong
Chaohui Wang
Kayhan Batmanghelich
Dacheng Tao
MDE
484
1,731
0
06 Jun 2018
Sparsity Invariant CNNs
J. Uhrig
N. Schneider
Lukas Schneider
Uwe Franke
Thomas Brox
Andreas Geiger
130
826
0
22 Aug 2017
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
David Eigen
Christian Puhrsch
Rob Fergus
MDE
3DPC
3DV
239
4,059
0
09 Jun 2014