G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models

2 June 2025

Main: 10 pages

16 figures

Bibliography: 1 page

Appendix: 5 pages

Abstract

This paper addresses the challenging Inexact Segmentation (IS) task with a large-scale text-to-image diffusion model. Unlike traditional approaches, which rely heavily on discriminative-model-based paradigms or on dense visual representations derived from internal attention mechanisms, our method focuses on the intrinsic generative priors in Stable Diffusion (SD). Specifically, we exploit the pattern discrepancies between original images and mask-conditional generated images to drive a coarse-to-fine segmentation refinement, establishing a semantic correspondence alignment and updating the foreground probability. Comprehensive quantitative and qualitative experiments validate the effectiveness of our plug-and-play design. The results underscore the potential of generation discrepancies for modeling dense representations and encourage further exploration of generative approaches to discriminative tasks.
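
The idea of turning a generation discrepancy into a foreground-probability update can be illustrated with a toy sketch. This is not the paper's algorithm: the function name `refine_foreground`, the `step` parameter, the plain per-pixel absolute difference, and the rule of pulling uncertain pixels toward 0.5 are all assumptions made for illustration. The abstract does not specify the actual update, and the method's semantic correspondence alignment is omitted here entirely.

```python
import numpy as np


def refine_foreground(prob, original, generated, step=0.5):
    """Toy update (hypothetical): soften the coarse mask where the
    mask-conditional generation disagrees with the original image."""
    prob = np.asarray(prob, dtype=float)            # (H, W) foreground probability
    original = np.asarray(original, dtype=float)    # (H, W, C) original image
    generated = np.asarray(generated, dtype=float)  # (H, W, C) generated image

    # Per-pixel discrepancy, averaged over channels and scaled to [0, 1].
    diff = np.abs(original - generated).mean(axis=-1)
    peak = diff.max()
    if peak > 0:
        diff = diff / peak

    # Assumed rule: a large discrepancy marks the coarse label as unreliable,
    # so the probability moves toward the uncertain value 0.5.
    return prob + step * diff * (0.5 - prob)


# Usage on a 2x2 example in which only one pixel is regenerated differently.
coarse = np.array([[0.9, 0.9], [0.1, 0.1]])
image = np.zeros((2, 2, 3))
regen = np.zeros((2, 2, 3))
regen[0, 1] = 1.0
refined = refine_foreground(coarse, image, regen, step=1.0)
```

In this sketch, pixels that the generation reproduces faithfully keep their coarse probability, and only the mismatched pixel is adjusted.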


@article{zhang2025_2506.01539,
  title={G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models},
  author={Tianjiao Zhang and Fei Zhang and Jiangchao Yao and Ya Zhang and Yanfeng Wang},
  journal={arXiv preprint arXiv:2506.01539},
  year={2025}
}
