Learning with Geometric Priors in U-Net Variants for Polyp Segmentation

Fabian Vazquez
Jose A. Nuñez
Diego Adame
Alissen Moreno
Augustin Zhan
Huimin Li
Jinghao Yang
Haoteng Tang
Bin Fu
Pengfei Gu
Main: 4 pages
4 figures
Bibliography: 1 page
4 tables
Abstract

Accurate and robust polyp segmentation is essential for early colorectal cancer detection and for computer-aided diagnosis. While convolutional neural network-, Transformer-, and Mamba-based U-Net variants have achieved strong performance, they still struggle to capture geometric and structural cues, especially in low-contrast or cluttered colonoscopy scenes. To address this challenge, we propose a novel Geometric Prior-guided Module (GPM) that injects explicit geometric priors into U-Net-based architectures for polyp segmentation. Specifically, we fine-tune the Visual Geometry Grounded Transformer (VGGT) on a simulated ColonDepth dataset to estimate depth maps of polyp images tailored to the endoscopic domain. These depth maps are then processed by GPM to encode geometric priors into the encoder's feature maps, where they are further refined using spatial and channel attention mechanisms that emphasize both local spatial and global channel information. GPM is plug-and-play and can be seamlessly integrated into diverse U-Net variants. Extensive experiments on five public polyp segmentation datasets demonstrate consistent gains over three strong baselines. Code and the generated depth maps are available at: this https URL
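The data flow described in the abstract (a depth map modulating an encoder feature map, followed by spatial and channel attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how GPM fuses depth with features or which attention variants it uses. The function names, the multiplicative fusion rule `feat * (1 + depth)`, the squeeze-and-excitation-style channel attention, the simplified spatial attention, and the random weights standing in for learned parameters are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def downsample_depth(depth, out_h, out_w):
    """Average-pool a (H, W) depth map down to (out_h, out_w).
    Assumes H and W are integer multiples of out_h and out_w."""
    H, W = depth.shape
    fh, fw = H // out_h, W // out_w
    blocks = depth[:out_h * fh, :out_w * fw].reshape(out_h, fh, out_w, fw)
    return blocks.mean(axis=(1, 3))

def spatial_attention(feat):
    """Hypothetical spatial attention: per-pixel gate built from the
    channel-wise mean and max of a (C, h, w) feature map."""
    mask = sigmoid(feat.mean(axis=0) + feat.max(axis=0))  # (h, w)
    return feat * mask[None, :, :]

def channel_attention(feat, w1, w2):
    """Hypothetical squeeze-and-excitation-style channel attention.
    w1: (C // r, C), w2: (C, C // r) stand in for learned weights."""
    z = feat.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ z, 0.0)         # ReLU bottleneck
    scale = sigmoid(w2 @ hidden)             # per-channel gate in (0, 1)
    return feat * scale[:, None, None]

def geometric_prior_block(feat, depth, w1, w2):
    """Assumed fusion: normalized depth rescales features, then
    spatial and channel attention refine the result."""
    _, h, w = feat.shape
    d = downsample_depth(depth, h, w)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # normalize to [0, 1]
    fused = feat * (1.0 + d[None, :, :])             # inject depth prior
    fused = spatial_attention(fused)
    return channel_attention(fused, w1, w2)

# Toy inputs: an 8-channel 16x16 encoder feature map and a 64x64 depth map.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
depth = rng.random((64, 64))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
out = geometric_prior_block(feat, depth, w1, w2)
print(out.shape)
```

Because the block returns a tensor with the same shape as its input feature map, a module of this kind could in principle be dropped between encoder stages of a U-Net variant without changing the rest of the network, which is one way to read the "plug-and-play" claim.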