ResearchTrend.AI
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

19 January 2024
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
    VLM
arXiv: 2401.10891

Papers citing "Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data"

46 / 146 papers shown
Title
DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction
Jenny Seidenschwarz
Qunjie Zhou
Bardienus Duisterhof
Deva Ramanan
Laura Leal-Taixe
45
4
0
03 Sep 2024
DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model
Mona Sheikh Zeinoddin
Chiara Lena
Jiongqi Qu
Luca Carlini
Mattia Magro
...
E. Mazomenos
Daniel C. Alexander
Danail Stoyanov
Matthew J. Clarkson
Mobarakol Islam
34
1
0
30 Aug 2024
A Simple and Generalist Approach for Panoptic Segmentation
Nedyalko Prisadnikov
Wouter Van Gansbeke
Danda Pani Paudel
Luc Van Gool
VLM
43
0
0
29 Aug 2024
P3P: Pseudo-3D Pre-training for Scaling 3D Masked Autoencoders
Xuechao Chen
Ying Chen
Jialin Li
Qiang Nie
Hanqiu Deng
Qixing Huang
Yang Li
3DPC
73
0
0
19 Aug 2024
Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator
Xin Zhang
Jiawei Du
Ping Liu
Joey Tianyi Zhou
DD
47
2
0
13 Aug 2024
Modeling Electromagnetic Signal Injection Attacks on Camera-based Smart Systems: Applications and Mitigation
Youqian Zhang
Michael Cheung
Chunxi Yang
Xinwei Zhai
Zitong Shen
Xinyu Ji
Eugene Y. Fu
Sze-Yiu Chau
Xiapu Luo
AAML
43
1
0
09 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Yu Qiao
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
67
48
0
05 Aug 2024
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2
Ange Lou
Yamin Li
Ji-Eun Han
Wonjin Yang
Zhi-Qi Cheng
VLM
27
8
0
03 Aug 2024
NVC-1B: A Large Neural Video Coding Model
Xihua Sheng
Chuanbo Tang
Li Li
Dong Liu
Feng Wu
3DV
VLM
50
2
0
28 Jul 2024
Revisit Self-supervised Depth Estimation with Local Structure-from-Motion
Shengjie Zhu
Xiaoming Liu
MDE
47
1
0
27 Jul 2024
HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions
Haiyang Zhou
Xinhua Cheng
Wangbo Yu
Yonghong Tian
Li-ming Yuan
3DGS
DiffM
61
10
0
21 Jul 2024
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
Hao Ding
Tuxun Lu
Yuqian Zhang
Ruixing Liang
Hongchao Shu
...
Bo Wang
Marcos Fernández-Rodríguez
Estevao Lima
João L. Vilaça
Mathias Unberath
63
4
0
16 Jul 2024
3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects
Weiming Zhi
Haozhan Tang
Tianyi Zhang
Matthew Johnson-Roberson
36
1
0
14 Jul 2024
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Xin Li
Deshui Miao
Zhenyu He
Yuhui Wang
Huchuan Lu
Ming Yang
VOS
56
4
0
10 Jul 2024
TAPVid-3D: A Benchmark for Tracking Any Point in 3D
Skanda Koppula
Ignacio Rocco
Yi Yang
Joe Heyward
João Carreira
Andrew Zisserman
Gabriel J. Brostow
Carl Doersch
52
14
0
08 Jul 2024
Camera-LiDAR Cross-modality Gait Recognition
Wenxuan Guo
Yingping Liang
Zhiyu Pan
Ziheng Xi
Jianjiang Feng
Jie Zhou
CVBM
35
3
0
02 Jul 2024
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
Peng Dai
Feitong Tan
Qiangeng Xu
David Futschik
Ruofei Du
S. Fanello
Xiaojuan Qi
Yinda Zhang
VGen
25
4
0
29 Jun 2024
High-resolution open-vocabulary object 6D pose estimation
Jaime Corsetti
Davide Boscaini
Francesco Giuliari
Changjae Oh
Andrea Cavallaro
Fabio Poiesi
32
1
0
24 Jun 2024
Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections
Jiacong Xu
Yiqun Mei
Vishal M. Patel
3DGS
53
18
0
14 Jun 2024
D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video
Moritz Kappel
Florian Hahlbohm
Timon Scholz
Susana Castillo
Christian Theobalt
Martin Eisemann
Vladislav Golyanik
M. Magnor
3DH
47
2
0
14 Jun 2024
PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation
Zhenyu Li
Shariq Farooq Bhat
Peter Wonka
3DV
MDE
33
7
0
10 Jun 2024
Normal-guided Detail-Preserving Neural Implicit Function for High-Fidelity 3D Surface Reconstruction
Aarya Patel
Hamid Laga
Ojaswa Sharma
49
1
0
07 Jun 2024
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
Stanislaw Szymanowicz
Eldar Insafutdinov
Chuanxia Zheng
Dylan Campbell
João F. Henriques
Christian Rupprecht
Andrea Vedaldi
3DGS
31
49
0
06 Jun 2024
Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge
Nan Zhang
Xidan Zhang
Jianing Wei
Fangjun Wang
Zhiming Tan
MDE
36
0
0
06 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
Effective Data Selection for Seismic Interpretation through Disagreement
Ryan Benkert
Mohit Prabhushankar
Ghassan AlRegib
32
2
0
01 Jun 2024
Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation
Ya Lu
Jishnu Jaykumar
Yunhui Guo
Nicholas Ruozzi
Yu Xiang
VLM
ISeg
58
4
0
28 May 2024
DCPI-Depth: Explicitly Infusing Dense Correspondence Prior to Unsupervised Monocular Depth Estimation
Mengtan Zhang
Yi Feng
Qijun Chen
Rui Fan
MDE
43
5
0
27 May 2024
Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians
Erik Sandström
Keisuke Tateno
Michael Oechsle
Michael Niemeyer
Luc Van Gool
Martin R. Oswald
Federico Tombari
3DGS
36
24
0
26 May 2024
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer
Meng You
Zhiyu Zhu
Hui Liu
Junhui Hou
VGen
DiffM
31
23
0
24 May 2024
Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models
Katherine Xu
Lingzhi Zhang
Jianbo Shi
43
12
0
23 May 2024
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
Wen-Hsuan Chu
Lei Ke
Katerina Fragkiadaki
3DGS
VGen
25
29
0
03 May 2024
GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Zhi-Yi Lin
Jouh Yeong Chew
J. C. V. Gemert
Xucong Zhang
44
1
0
16 Apr 2024
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
Jaidev Shriram
Alex Trevithick
Lingjie Liu
Ravi Ramamoorthi
DiffM
3DGS
75
55
0
10 Apr 2024
Spatial Cognition from Egocentric Video: Out of Sight, Not Out of Mind
Chiara Plizzari
Shubham Goel
Toby Perrett
Jacob Chalk
Angjoo Kanazawa
Dima Damen
38
10
0
07 Apr 2024
Gen3DSR: Generalizable 3D Scene Reconstruction via Divide and Conquer from a Single View
Andreea Dogaru
M. Ozer
Bernhard Egger
3DGS
64
4
0
04 Apr 2024
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
Yufu Wang
ZiYun Wang
Lingjie Liu
Kostas Daniilidis
45
25
0
26 Mar 2024
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
54
116
0
22 Mar 2024
LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving
Sicen Guo
Zhiyuan Wu
Qijun Chen
Ioannis Pitas
Rui Fan
37
1
0
13 Mar 2024
Splat-Nav: Safe Real-Time Robot Navigation in Gaussian Splatting Maps
Timothy Chen
O. Shorinwa
Joseph Bruno
Javier Yu
Weijia Zeng
Keiko Nagami
Mac Schwager
3DGS
37
31
0
05 Mar 2024
How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey
Fabio Tosi
Youming Zhang
Ziren Gong
Erik Sandström
S. Mattoccia
Martin R. Oswald
Matteo Poggi
3DGS
63
54
0
20 Feb 2024
SceneWiz3D: Towards Text-guided 3D Scene Composition
Qihang Zhang
Chaoyang Wang
Aliaksandr Siarohin
Peiye Zhuang
Yinghao Xu
Ceyuan Yang
Dahua Lin
Bolei Zhou
Sergey Tulyakov
Hsin-Ying Lee
32
31
0
13 Dec 2023
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
Zehao Zhu
Zhiwen Fan
Yifan Jiang
Zhangyang Wang
3DGS
20
144
0
01 Dec 2023
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
Wei Yin
Chi Zhang
Hao Chen
Zhipeng Cai
Gang Yu
Kaixuan Wang
Xiaozhi Chen
Chunhua Shen
MDE
134
174
0
20 Jul 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
160
215
0
03 Mar 2023
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,198
0
01 Sep 2014