Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty
Calibration is a conditional property that depends on the information retained by a predictor. We develop decomposition identities for arbitrary proper losses that make this dependence explicit. At any information level $\mathcal{G}$, the expected loss of a $\mathcal{G}$-measurable predictor splits into a proper-regret (reliability) term and a conditional entropy (residual uncertainty) term. For nested levels $\mathcal{G} \subseteq \mathcal{H}$, a chain decomposition quantifies the information gain from $\mathcal{G}$ to $\mathcal{H}$. Applied to classification with features $X$ and score $S(X)$, this yields a three-term identity: miscalibration, a \emph{grouping} term measuring the information loss from $X$ to $S(X)$, and irreducible uncertainty at the feature level. We leverage the framework to analyze post-hoc recalibration, aggregation of calibrated models, and stagewise/boosting constructions, with explicit forms for the Brier score and log-loss.
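For the Brier score, the two-term identity at the score level can be checked numerically: the expected loss of a score $S$ equals its proper regret against the calibration map $q(S) = \mathbb{E}[Y \mid S]$ plus the conditional Brier entropy $\mathbb{E}[q(S)(1 - q(S))]$. The sketch below is illustrative only (synthetic data, an empirical calibration map on discrete score values); all variable names are our own, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification: the true probability depends on a feature X.
n = 200_000
x = rng.uniform(size=n)
p_true = 1 / (1 + np.exp(-4 * (x - 0.5)))   # P(Y = 1 | X)
y = rng.binomial(1, p_true)

# A miscalibrated score that also discards information relative to X:
# coarsen X to one decimal, then apply a skewed transform.
s = np.round(x, 1) ** 1.5

# Empirical calibration map q(s) = E[Y | S = s] over the discrete score values.
vals, inv = np.unique(s, return_inverse=True)
q = np.bincount(inv, weights=y) / np.bincount(inv)
q_s = q[inv]

brier = np.mean((s - y) ** 2)
reliability = np.mean((s - q_s) ** 2)   # proper regret of S against its calibration map
entropy = np.mean(q_s * (1 - q_s))      # conditional Brier entropy at the score level

# The cross term vanishes group-by-group because q is the within-group mean of y,
# so the two-term identity holds exactly (up to floating point).
print(brier, reliability + entropy)
```

Coarsening `x` before scoring is what creates a nonzero grouping term relative to the feature level: conditioning on `x` itself would shrink the residual entropy further.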