Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting

28 May 2025

Main: 8 Pages

11 Figures

Bibliography: 3 Pages

4 Tables

Appendix: 5 Pages

Abstract

Point detection has been developed to locate pedestrians in crowded scenes by training a counter through a point-to-point (P2P) supervision scheme. Despite its excellent localization and counting performance, training a point-based counter still carries a heavy annotation cost: hundreds to thousands of points are required to annotate a single sample capturing a dense crowd. In this paper, we integrate point-based methods into a semi-supervised counting framework based on pseudo-labeling, enabling a counter to be trained with only a few annotated samples supplemented by a large volume of pseudo-labeled data. In practice, however, training runs into trouble because P2P fails to propagate the confidence of pseudo-labels to background pixels. To investigate this, we devise a point-specific activation map (PSAM) to visually interpret what happens during the ill-posed training. Observations from the PSAM suggest that the feature map is excessively activated by the loss for unlabeled data, causing the decoder to misinterpret these over-activations as pedestrians. To mitigate this issue, we propose a point-to-region (P2R) scheme to replace P2P, which segments out a local region for supervision rather than detecting a single point corresponding to a pedestrian. Consequently, pixels in the local region share the same confidence as the corresponding pseudo point. Experimental results in both semi-supervised counting and unsupervised domain adaptation highlight the advantages of our method, showing that P2R can resolve the issues identified with the PSAM. The code is available at this https URL.
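The abstract gives no formulas for P2R, so the following is only a minimal sketch of the general idea it describes: each pseudo point claims a local region, and every pixel in that region inherits the point's confidence. The function name `point_to_region_targets`, the circular region with a fixed `radius`, the nearest-point assignment, and the `background_weight` parameter are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def point_to_region_targets(points, confidences, shape, radius=8.0,
                            background_weight=1.0):
    """Build a per-pixel foreground target and confidence weight map.

    Hypothetical helper, not taken from the paper. Assumptions:
    - points are (row, col) coordinates of pseudo-labeled heads;
    - a pixel belongs to a point's region if that point is its nearest
      point and lies within `radius` pixels;
    - pixels outside every region are background and receive
      `background_weight`.
    """
    h, w = shape
    target = np.zeros((h, w), dtype=np.float32)
    weight = np.full((h, w), background_weight, dtype=np.float32)

    points = np.asarray(points, dtype=np.float32).reshape(-1, 2)
    confidences = np.asarray(confidences, dtype=np.float32).reshape(-1)
    if len(points) == 0:
        return target, weight  # no pseudo points: everything is background

    # Pixel coordinate grid of shape (h, w, 2).
    rows, cols = np.mgrid[0:h, 0:w]
    pixels = np.stack([rows, cols], axis=-1).astype(np.float32)

    # Distance from every pixel to every point, shape (h, w, n_points).
    dist = np.linalg.norm(
        pixels[:, :, None, :] - points[None, None, :, :], axis=-1
    )
    nearest = dist.argmin(axis=-1)      # index of the closest point
    nearest_dist = dist.min(axis=-1)    # distance to that point

    # Pixels inside a region become foreground and share the confidence
    # of the point that owns the region.
    inside = nearest_dist <= radius
    target[inside] = 1.0
    weight[inside] = confidences[nearest[inside]]
    return target, weight
```

Under these assumptions, the two maps could feed a per-pixel weighted segmentation loss, so that a low-confidence pseudo point down-weights its whole neighbourhood instead of a single pixel. How the actual P2R loss defines regions and weights background pixels should be taken from the paper and its released code.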


@article{lin2025_2505.21943,
  title={Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting},
  author={Wei Lin and Chenyang Zhao and Antoni B. Chan},
  journal={arXiv preprint arXiv:2505.21943},
  year={2025}
}
