
Deep Probabilistic Supervision for Image Classification

30 December 2025

Anton Adelöw

Matteo Gamba

Atsuto Maki

BDL

UQCV


Main: 9 pages; Bibliography: 4 pages; Appendix: 3 pages; 12 figures; 6 tables

Abstract

Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration, generalization, and robustness. Self-distillation methods aim to mitigate this by leveraging the inter-class and sample-specific information present in the model's own predictions, but they often remain dependent on hard targets and do not explicitly model predictive uncertainty. With this in mind, we propose Deep Probabilistic Supervision (DPS), a principled learning framework that constructs sample-specific target distributions via statistical inference on the model's own predictions and remains independent of hard targets after initialization. We show that DPS consistently yields higher test accuracy (e.g., +2.0% for DenseNet-264 on ImageNet) and significantly lower Expected Calibration Error (ECE) (-40% for ResNet-50 on CIFAR-100) than existing self-distillation methods. When combined with a contrastive loss, DPS achieves state-of-the-art robustness under label noise.
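For readers unfamiliar with the calibration metric reported above, the following is a minimal sketch of Expected Calibration Error in its common equal-width binning form: predictions are grouped by confidence, and the gap between mean confidence and accuracy in each bin is averaged, weighted by bin size. The abstract does not state the binning protocol used in the paper, so the function name `expected_calibration_error`, the default of 10 bins, and the half-open bin convention are assumptions made for illustration only.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    predictions: Sequence[int],
    labels: Sequence[int],
    n_bins: int = 10,  # assumed default; the paper's setting is not given here
) -> float:
    """ECE with equal-width confidence bins (illustrative sketch).

    confidences: top-class probability per sample, each in [0, 1]
    predictions: predicted class index per sample
    labels:      ground-truth class index per sample
    """
    n = len(labels)
    if n == 0 or not (len(confidences) == len(predictions) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    correct = [0] * n_bins     # number of correct predictions per bin
    count = [0] * n_bins       # number of samples per bin

    for conf, pred, label in zip(confidences, predictions, labels):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of exactly 1.0
        # is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        correct[idx] += int(pred == label)
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = correct[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


# Usage: two confident correct predictions and two less confident ones,
# of which one is wrong.
example_ece = expected_calibration_error(
    confidences=[0.95, 0.95, 0.55, 0.55],
    predictions=[0, 1, 2, 3],
    labels=[0, 1, 2, 0],
)
```

A lower value means that reported confidences track empirical accuracy more closely. An overconfident model, such as one trained on hard targets alone, tends to show mean confidence above accuracy in the high-confidence bins, which raises the score.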
