RigNet: Repetitive Image Guided Network for Depth Completion
Depth completion deals with the problem of recovering dense depth maps from sparse ones, where color images are often used to facilitate this task. Recent approaches mainly focus on image-guided learning to predict dense results. However, blurry guidance in the image and unclear structure in the depth still impede the performance of image-guided frameworks. Inspired by the popular mechanism of looking and thinking twice, we explore a repetitive design in our image-guided network to recover depth values gradually and sufficiently. Specifically, the repetition is embodied in both the image guidance branch and the depth generation branch. In the former branch, we design a repetitive hourglass network to extract discriminative image features of complex environments, which can provide powerful contextual instruction for depth prediction. In the latter branch, we introduce a repetitive guidance module based on dynamic convolution, in which an efficient convolution factorization is proposed to simultaneously reduce its complexity and progressively model high-frequency structures. Extensive experiments show that our method achieves state-of-the-art results on the KITTI benchmark and the NYUv2 dataset.
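To make the idea of image-guided dynamic convolution more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one plausible factorization: a depthwise, spatially varying filter whose kernels are predicted from the image features, followed by a cheap global channel gate in place of a full cross-channel convolution. All names (`guided_dynamic_conv`, `channel_reweight`, `repetitive_guidance`, `w_gen`, `w_att`) and the choice to share weights across repetitions are hypothetical and chosen only for illustration.

```python
import numpy as np

def guided_dynamic_conv(depth_feat, guide_feat, w_gen, k=3):
    """Depthwise, spatially varying convolution with kernels predicted from the guide.

    depth_feat: (C, H, W) depth features to be filtered.
    guide_feat: (G, H, W) image features used as guidance.
    w_gen:      (C * k * k, G) hypothetical linear kernel generator
                (a stand-in for a learned layer).
    """
    C, H, W = depth_feat.shape
    # One k*k kernel per pixel and per channel, generated from the guidance.
    kernels = np.einsum("og,ghw->ohw", w_gen, guide_feat).reshape(C, k * k, H, W)
    pad = k // 2
    padded = np.pad(depth_feat, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros(depth_feat.shape, dtype=float)
    for i in range(k):
        for j in range(k):
            # Weight each shifted copy of the input by its per-pixel kernel tap.
            out += kernels[:, i * k + j] * padded[:, i:i + H, j:j + W]
    return out

def channel_reweight(feat, w_att):
    """Assumed cheap cross-channel step: global average pool, project, sigmoid gate."""
    pooled = feat.mean(axis=(1, 2))                  # (C,)
    gate = 1.0 / (1.0 + np.exp(-(w_att @ pooled)))   # (C,)
    return feat * gate[:, None, None]

def repetitive_guidance(depth_feat, guide_feat, w_gen, w_att, repeats=2, k=3):
    """Apply the guided step several times with a residual connection.

    Sharing w_gen and w_att across repetitions is an assumption of this sketch.
    """
    out = depth_feat
    for _ in range(repeats):
        filtered = guided_dynamic_conv(out, guide_feat, w_gen, k)
        out = out + channel_reweight(filtered, w_att)
    return out

# Small usage example with random features and random (untrained) weights.
rng = np.random.default_rng(0)
C, G, H, W, k = 4, 8, 6, 7, 3
depth = rng.standard_normal((C, H, W))
guide = rng.standard_normal((G, H, W))
w_gen = 0.1 * rng.standard_normal((C * k * k, G))
w_att = 0.1 * rng.standard_normal((C, C))
refined = repetitive_guidance(depth, guide, w_gen, w_att, repeats=2, k=k)
print(refined.shape)
```

Under this assumed factorization, the number of generated kernel weights per pixel grows with `C * k * k` rather than with the product of input and output channels, which is the kind of complexity reduction the abstract refers to.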