Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation

22 March 2024
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
    3DGS

Papers citing "Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation"

50 / 191 papers shown
GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion
Gwanghyun Kim
Xueting Li
Ye Yuan
Koki Nagano
Tianye Li
Jan Kautz
Se Young Chun
Umar Iqbal
DiffM
42
0
0
29 May 2025
Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation
Sanggyun Ma
Wonjoon Choi
Jihun Park
Jaeyeul Kim
Seunghun Lee
Jiwan Seo
S. Im
37
0
0
29 May 2025
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
Runsen Xu
Weiyao Wang
Hao Tang
Xingyu Chen
Xiaodong Wang
Fu-Jen Chu
Dahua Lin
Matt Feiszli
Kevin J. Liang
LRM
77
1
0
22 May 2025
MonoMobility: Zero-Shot 3D Mobility Analysis from Monocular Videos
Hongyi Zhou
Xiaogang Wang
Yulan Guo
Kai Xu
48
0
0
17 May 2025
SurgPose: Generalisable Surgical Instrument Pose Estimation using Zero-Shot Learning and Stereo Vision
Utsav Rai
Haozheng Xu
Stamatia Giannarou
MedIm
60
0
0
16 May 2025
Depth Anything with Any Prior
Zehan Wang
Siyu Chen
Lihe Yang
Jialei Wang
Ziang Zhang
Hengshuang Zhao
Zhou Zhao
3DGS
VLM
MDE
68
0
0
15 May 2025
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis
Bingxin Ke
Kevin Qu
Tianfu Wang
Nando Metzger
Shengyu Huang
Bo Li
Anton Obukhov
Konrad Schindler
DiffM
VLM
91
1
0
14 May 2025
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation
Yifu Yuan
Haiqin Cui
Yibin Chen
Zibin Dong
Fei Ni
Longxin Kou
Jinyi Liu
Pengyi Li
Yan Zheng
Jianye Hao
86
0
0
13 May 2025
ElectricSight: 3D Hazard Monitoring for Power Lines Using Low-Cost Sensors
Xingchen Li
Liwen Wang
Yu Sheng
ZhiPeng Tang
Haojie Ren
Guoliang You
YiFan Duan
Jianmin Ji
Yanyong Zhang
61
0
0
10 May 2025
VGLD: Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery
Bojin Wu
Jing Chen
MDE
82
0
0
05 May 2025
Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction
Simon Giebenhain
Tobias Kirschstein
Martin Rünz
Lourdes Agapito
Matthias Nießner
CVBM
3DH
94
0
0
01 May 2025
The Fourth Monocular Depth Estimation Challenge
Anton Obukhov
Matteo Poggi
Fabio Tosi
Ripudaman Singh Arora
Jaime Spencer
...
Tuan-Anh Yang
Minh-Quang Nguyen
T. Tran
Albert Luginov
Muhammad Shahzad
MDE
357
1
0
24 Apr 2025
A Guide to Structureless Visual Localization
Vojtech Panek
Qunjie Zhou
Yaqing Ding
Sérgio Agostinho
Zuzana Kúkelová
Torsten Sattler
Laura Leal-Taixe
65
0
0
24 Apr 2025
Physically Consistent Humanoid Loco-Manipulation using Latent Diffusion Models
Ilyass Taouil
Haizhou Zhao
Angela Dai
Majid Khadiv
DiffM
79
0
0
23 Apr 2025
MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation
Xingxing Zuo
Nikhil Ranganathan
Connor T. Lee
Georgia Gkioxari
Soon-Jo Chung
VLM
148
1
0
21 Apr 2025
Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image
Tao Wen
Jiadong Wang
Yuxiao Chen
Shugong Xu
Chi Zhang
Xuelong Li
MDE
87
0
0
16 Apr 2025
RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements
Guangcong Zheng
Teng Li
Xianpan Zhou
Xi Li
VGen
3DV
90
1
0
11 Apr 2025
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou
Wenqi Xian
Guandao Yang
Mohamed Abdelfattah
Bharath Hariharan
Noah Snavely
Ning Yu
P. Debevec
MDE
96
0
0
09 Apr 2025
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction
Songyan Zhang
Yongtao Ge
Jinyuan Tian
Guangkai Xu
Hao Chen
Chen Lv
Chunhua Shen
3DPC
67
0
0
08 Apr 2025
NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving
Kexin Tian
Jingrui Mao
Yu Zhang
Jiwan Jiang
Yang Zhou
Zhengzhong Tu
CoGe
118
3
0
04 Apr 2025
WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments
Jianhao Zheng
Zihan Zhu
Valentin Bieri
Marc Pollefeys
Songyou Peng
Iro Armeni
3DGS
77
2
0
04 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
92
1
0
02 Apr 2025
DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image
Jijun Xiang
Xuan Zhu
Xianqi Wang
Yuanbo Wang
Hao Zhang
Fei Guo
Xin-She Yang
68
0
0
02 Apr 2025
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Tian-Xing Xu
Xiangjun Gao
Wenbo Hu
Xiaoyu Li
Song-Hai Zhang
Ying Shan
VGen
MDE
110
1
0
01 Apr 2025
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos
Felix Wimbauer
Weirong Chen
Dominik Muhle
Christian Rupprecht
Daniel Cremers
VGen
139
0
0
30 Mar 2025
MVSAnywhere: Zero-Shot Multi-View Stereo
Sergio Izquierdo
Mohamed Sayed
Michael Firman
Guillermo Garcia-Hernando
Daniyar Turmukhambetov
Javier Civera
Oisin Mac Aodha
Gabriel J. Brostow
Jamie Watson
3DV
81
4
0
28 Mar 2025
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video
David Yifan Yao
Albert Zhai
Shenlong Wang
VGen
114
1
0
27 Mar 2025
ST-VLM: Kinematic Instruction Tuning for Spatio-Temporal Reasoning in Vision-Language Models
Dohwan Ko
S. Kim
Yumin Suh
Vijay Kumar B.G
Minseo Yoon
Manmohan Chandraker
Hyunwoo J. Kim
LRM
71
0
0
25 Mar 2025
CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis
Youngkyoon Jang
Eduardo Pérez-Pellitero
88
0
0
25 Mar 2025
MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction
Wenyuan Zhang
Yixiao Yang
Han Huang
Liang Han
Kanle Shi
Yu-Shen Liu
Zhizhong Han
MDE
108
3
0
24 Mar 2025
Distilling Monocular Foundation Model for Fine-grained Depth Completion
Yingping Liang
Yutao Hu
Wenqi Shao
Ying Fu
MDE
72
1
0
21 Mar 2025
Loop Closure from Two Views: Revisiting PGO for Scalable Trajectory Estimation through Monocular Priors
Tian Yi Lim
Boyang Sun
Marc Pollefeys
Hermann Blum
78
1
0
20 Mar 2025
UniK3D: Universal Camera Monocular 3D Estimation
Luigi Piccinelli
Daniel Gehrig
Mattia Segu
Yifan Yang
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
70
1
0
20 Mar 2025
A Recipe for Generating 3D Worlds From a Single Image
Katja Schwarz
Denys Rozumnyi
Samuel Rota Buló
Lorenzo Porzi
Peter Kontschieder
VGen
114
3
0
20 Mar 2025
Vision-Language Embodiment for Monocular Depth Estimation
Jinchang Zhang
Guoyu Lu
VLM
MDE
143
1
0
18 Mar 2025
Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View
Xianzu Wu
Zhenxin Ai
Harry Yang
Ser-Nam Lim
Jun Liu
Haoran Wang
3DV
88
0
0
16 Mar 2025
Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation
Hongyu Wen
Yiming Zuo
Venkat Subramanian
Patrick Chen
Jia Deng
3DV
132
0
0
14 Mar 2025
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić
Christian Fruhwirth-Reisinger
Samuel Schulter
Horst Possegger
3DV
99
0
0
11 Mar 2025
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
Hanzhi Chen
Boyang Sun
Anran Zhang
Marc Pollefeys
Stefan Leutenegger
LM&Ro
115
0
0
10 Mar 2025
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clement Chadebec
O. Tasar
Sanjeev Sreetharan
Benjamin Aubin
115
0
0
10 Mar 2025
A Novel Solution for Drone Photogrammetry with Low-overlap Aerial Images using Monocular Depth Estimation
J. Zhong
Qi Zhou
Ming Li
Armin Gruen
Xuan Liao
MDE
82
0
0
06 Mar 2025
DuCos: Duality Constrained Depth Super-Resolution via Foundation Model
Zhiqiang Yan
Zhengxue Wang
Haoye Dong
Jun Yu Li
Jian Yang
Gim Hee Lee
102
0
0
06 Mar 2025
H3O: Hyper-Efficient 3D Occupancy Prediction with Heterogeneous Supervision
Y. Shi
H. Cai
Amin Ansari
Fatih Porikli
86
1
0
06 Mar 2025
Is Pre-training Applicable to the Decoder for Dense Prediction?
Chao Ning
Wanshui Gan
Weihao Xuan
Naoto Yokoya
223
0
0
05 Mar 2025
Back to the Future Cyclopean Stereo: a human perception approach combining deep and geometric constraints
Sherlon Almeida da Silva
Davi Geiger
Luiz Velho
Moacir Antonelli Ponti
69
0
0
28 Feb 2025
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler
Luigi Piccinelli
Daniel Gehrig
Yifan Yang
Mattia Segu
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
113
10
0
27 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
168
5
0
11 Feb 2025
Survey on Monocular Metric Depth Estimation
Jiuling Zhang
VLM
201
0
0
21 Jan 2025
FrontierNet: Learning Visual Cues to Explore
Boyang Sun
Hanzhi Chen
Stefan Leutenegger
Cesar Cadena
Marc Pollefeys
Hermann Blum
100
0
0
08 Jan 2025
DPBridge: Latent Diffusion Bridge for Dense Prediction
Haorui Ji
Taojun Lin
Hongdong Li
DiffM
221
1
0
29 Dec 2024