2503.07637
Is Pre-training Applicable to the Decoder for Dense Prediction?
5 March 2025
Chao Ning
Wanshui Gan
Weihao Xuan
Naoto Yokoya
Papers citing
"Is Pre-training Applicable to the Decoder for Dense Prediction?"
50 / 56 papers shown
Title
UniDepth: Universal Monocular Metric Depth Estimation
Luigi Piccinelli
Yung-Hsu Yang
Daniel Gehrig
Mattia Segu
Siyuan Li
Luc Van Gool
Fisher Yu
VLM
MDE
98
131
0
27 Mar 2024
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Hao Chen
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
114
121
0
22 Mar 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
170
744
0
19 Jan 2024
IEBins: Iterative Elastic Bins for Monocular Depth Estimation
Shuwei Shao
Z. Pei
Xingming Wu
Zhong Liu
Weihai Chen
Zhengguo Li
MDE
39
50
0
25 Sep 2023
NDDepth: Normal-Distance Assisted Monocular Depth Estimation
Shuwei Shao
Z. Pei
Weihai Chen
Xingming Wu
Zhengguo Li
MDE
32
43
0
19 Sep 2023
GEDepth: Ground Embedding for Monocular Depth Estimation
Xiaodong Yang
Zhuang Ma
Zhiyu Ji
Zhe Ren
MDE
48
24
0
18 Sep 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
242
3,205
0
14 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
230
7,047
0
05 Apr 2023
Single Image Depth Prediction Made Better: A Multivariate Gaussian Take
Ce Liu
Suryansh Kumar
Shuhang Gu
Radu Timofte
Luc Van Gool
MDE
VLM
74
15
0
31 Mar 2023
DDP: Diffusion Model for Dense Visual Prediction
Yuanfeng Ji
Zhe Chen
Enze Xie
Lanqing Hong
Xihui Liu
Zhaoqiang Liu
Tong Lu
Zhenguo Li
Ping Luo
DiffM
VLM
85
132
0
30 Mar 2023
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
SyDa
120
760
0
02 Jan 2023
CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow
Philippe Weinzaepfel
Thomas Lucas
Vincent Leroy
Yohann Cabon
Vaibhav Arora
Romain Brégier
G. Csurka
L. Antsfeld
Boris Chidlovskii
Jérôme Revaud
ViT
61
87
0
18 Nov 2022
Unifying Flow, Stereo and Depth Estimation
Haofei Xu
Jing Zhang
Jianfei Cai
Hamid Rezatofighi
Feng Yu
Dacheng Tao
Andreas Geiger
MDE
67
201
0
10 Nov 2022
Attention Attention Everywhere: Monocular Depth Prediction with Skip Attention
Ashutosh Agarwal
Chetan Arora
MDE
35
138
0
17 Oct 2022
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation
Zhenyu Li
Xuyang Wang
Xianming Liu
Junjun Jiang
MDE
47
193
0
03 Apr 2022
LocalBins: Improving Depth Estimation by Learning Local Distributions
S. Bhat
Ibraheem Alhashim
Peter Wonka
MDE
43
100
0
28 Mar 2022
DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation
Zhenyu Li
Zehui Chen
Xianming Liu
Junjun Jiang
ViT
MDE
42
185
1
27 Mar 2022
StructToken: Rethinking Semantic Segmentation with Structural Prior
Fangjian Lin
Zhanhao Liang
Miao Zheng
Junjun He
Kaibing Chen
Sheng Tian
42
49
0
23 Mar 2022
NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation
Weihao Yuan
Xiaodong Gu
Zuozhuo Dai
Siyu Zhu
Ping Tan
54
178
0
03 Mar 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
60
5,073
0
10 Jan 2022
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
173
2,315
0
02 Dec 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM
ViT
116
1,517
0
13 Jul 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
156
2,785
0
15 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
120
4,934
0
31 May 2021
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
116
1,442
0
12 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
542
5,920
0
29 Apr 2021
Vision Transformers with Patch Diversification
Chengyue Gong
Dilin Wang
Meng Li
Vikas Chandra
Qiang Liu
ViT
52
63
0
26 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
246
21,051
0
25 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
107
1,696
0
24 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
605
28,659
0
26 Feb 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
109
2,872
0
31 Dec 2020
AdaBins: Depth Estimation using Adaptive Bins
S. Bhat
Ibraheem Alhashim
Peter Wonka
3DV
MDE
ViT
79
845
0
28 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
259
40,217
0
22 Oct 2020
Root Mean Square Layer Normalization
Biao Zhang
Rico Sennrich
39
712
0
16 Oct 2019
Panoptic-DeepLab
Bowen Cheng
Maxwell D. Collins
Yukun Zhu
Ting Liu
Thomas S. Huang
Hartwig Adam
Liang-Chieh Chen
38
610
0
10 Oct 2019
From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation
Jin Han Lee
Myung-Kyu Han
D. W. Ko
I. Suh
3DV
MDE
90
679
0
24 Jul 2019
3D Packing for Self-Supervised Monocular Depth Estimation
Vitor Campagnolo Guizilini
Rares Andrei Ambrus
Sudeep Pillai
Allan Raventos
Adrien Gaidon
SSL
3DPC
MDE
54
643
0
06 May 2019
Panoptic Feature Pyramid Networks
Alexander Kirillov
Ross B. Girshick
Kaiming He
Piotr Dollár
ISeg
SSeg
84
1,278
0
08 Jan 2019
CCNet: Criss-Cross Attention for Semantic Segmentation
Zilong Huang
Xinggang Wang
Yunchao Wei
Lichao Huang
Humphrey Shi
Wenyu Liu
Chang Huang
VOS
111
2,531
0
28 Nov 2018
Unified Perceptual Parsing for Scene Understanding
Tete Xiao
Yingcheng Liu
Bolei Zhou
Yuning Jiang
Jian Sun
OCL
VOS
95
1,859
0
26 Jul 2018
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
Liang-Chieh Chen
Yukun Zhu
George Papandreou
Florian Schroff
Hartwig Adam
SSeg
106
13,005
0
07 Feb 2018
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
192
8,867
0
21 Nov 2017
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Chen Sun
Abhinav Shrivastava
Saurabh Singh
Abhinav Gupta
VLM
91
2,378
0
10 Jul 2017
Rethinking Atrous Convolution for Semantic Image Segmentation
Liang-Chieh Chen
George Papandreou
Florian Schroff
Hartwig Adam
SSeg
133
8,425
0
17 Jun 2017
Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning
Stefan Elfwing
E. Uchibe
Kenji Doya
53
1,690
0
10 Feb 2017
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
403
21,951
0
09 Dec 2016
Pyramid Scene Parsing Network
Hengshuang Zhao
Jianping Shi
Xiaojuan Qi
Xiaogang Wang
Jiaya Jia
VOS
SSeg
298
11,941
0
04 Dec 2016
InstanceCut: from Edges to Instances with MultiCut
Alexander Kirillov
Evgeny Levinkov
Bjoern Andres
Bogdan Savchynskyy
Carsten Rother
SSeg
50
250
0
24 Nov 2016
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation
Guosheng Lin
Anton Milan
Chunhua Shen
Ian Reid
AI4TS
SSeg
227
2,835
0
20 Nov 2016
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
Wenzhe Shi
Jose Caballero
Ferenc Huszár
J. Totz
Andrew P. Aitken
Rob Bishop
Daniel Rueckert
Zehan Wang
SupR
309
5,205
0
16 Sep 2016