
Supervised learning with probabilistic morphisms and kernel mean embeddings

Abstract

In this paper I propose a concept of a correct loss function in a generative model of supervised learning for an input space $\mathcal{X}$ and a label space $\mathcal{Y}$, both of which are measurable spaces. A correct loss function in a generative model of supervised learning must accurately measure the discrepancy between elements of a hypothesis space $\mathcal{H}$ of possible predictors and the supervisor operator, even when the supervisor operator does not belong to $\mathcal{H}$. To define correct loss functions, I propose a characterization of a regular conditional probability measure $\mu_{\mathcal{Y}|\mathcal{X}}$ for a probability measure $\mu$ on $\mathcal{X} \times \mathcal{Y}$ relative to the projection $\Pi_{\mathcal{X}}: \mathcal{X} \times \mathcal{Y} \to \mathcal{X}$ as a solution of a linear operator equation. If $\mathcal{Y}$ is a separable metrizable topological space with the Borel $\sigma$-algebra $\mathcal{B}(\mathcal{Y})$, I propose an additional characterization of a regular conditional probability measure $\mu_{\mathcal{Y}|\mathcal{X}}$ as a minimizer of the mean square error on the space of Markov kernels, referred to as probabilistic morphisms, from $\mathcal{X}$ to $\mathcal{Y}$. This characterization utilizes kernel mean embeddings. Building upon these results, and employing inner measure to quantify the generalizability of a learning algorithm, I extend a result due to Cucker and Smale on the learnability of a regression model to the setting of a conditional probability estimation problem. Additionally, I present a variant of Vapnik's regularization method for solving stochastic ill-posed problems, incorporating inner measure, and showcase its applications.
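The abstract's second characterization casts the regular conditional probability $\mu_{\mathcal{Y}|\mathcal{X}}$ as a mean-square-error minimizer over Markov kernels, mediated by kernel mean embeddings. The paper's exact construction is not reproduced here; as a rough computational analogue only, the sketch below implements the standard empirical conditional mean embedding estimator (a kernel ridge regression in an RKHS, in the style of Song, Fukumizu, and Gretton), assuming Gaussian kernels on Euclidean $\mathcal{X}$ and $\mathcal{Y}$. All function names, the bandwidth `sigma`, and the ridge parameter `lam` are illustrative choices, not notation from the paper.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between the rows of A and B.
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def cme_weights(X, x_query, lam=1e-3, sigma=1.0):
    # Weights beta(x) of the empirical conditional mean embedding
    #   mu_hat_{Y|X=x} = sum_i beta_i(x) k_Y(y_i, .),
    # where beta(x) = (K_X + n * lam * I)^{-1} k_X(X, x).
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), gaussian_kernel(X, x_query, sigma))

# Usage on synthetic data: the "supervisor" draws Y ~ sin(3X) + noise,
# and we evaluate the estimated embedding mu_hat_{Y|X=0.5} against
# feature functions k_Y(., y) for a few test points y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
Y = np.sin(3.0 * X) + 0.1 * rng.normal(size=(200, 1))
beta = cme_weights(X, np.array([[0.5]]))            # weights at x = 0.5
y_grid = np.linspace(-2.0, 2.0, 5)[:, None]
embedding_vals = (gaussian_kernel(y_grid, Y) @ beta).ravel()
print(embedding_vals)  # <mu_hat_{Y|X=0.5}, k_Y(., y)> for each y in y_grid
```

The ridge term `n * lam * I` plays the regularizing role that makes the empirical operator equation well posed, loosely echoing the abstract's theme of regularization for stochastic ill-posed problems, though the paper's own variant of Vapnik's method is formulated with inner measure rather than this kernel ridge penalty.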
