Distances between probability distributions of different dimensions

IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2020

1 November 2020

Abstract

Comparing probability distributions is an indispensable and ubiquitous task in machine learning and statistics. The most common way to compare a pair of Borel probability measures is to compute a metric between them, and by far the most widely used notions of metric are the Wasserstein metric and the total variation metric. The next most common way is to compute a divergence between them, and in this case almost every known divergence, such as those of Kullback-Leibler, Jensen-Shannon, Rényi, and many more, is a special case of the $f$-divergence. Nevertheless, these metrics and divergences may only be computed, and in fact are only defined, when the pair of probability measures are on spaces of the same dimension. How would one measure, say, a Wasserstein distance between the uniform distribution on the interval $[-1,1]$ and a Gaussian distribution on $\mathbb{R}^3$? We will show that various common notions of metrics and divergences can be extended in a completely natural manner to Borel probability measures defined on spaces of different dimensions, e.g., one on $\mathbb{R}^m$ and another on $\mathbb{R}^n$ where $m, n$ are distinct, so as to give a meaningful answer to the previous question.
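To make the remark about $f$-divergences concrete, the following is a minimal sketch for the simplest setting of two distributions on the same finite set. It is an illustration only and is not the construction developed in the paper. The function names `f_divergence`, `kl_generator`, and `tv_generator` are hypothetical, and the sketch assumes every entry of `q` is strictly positive. It uses the standard generators $f(t) = t \log t$ for Kullback-Leibler and $f(t) = |t-1|/2$ for total variation.

```python
import math


def f_divergence(p, q, f):
    """Discrete f-divergence D_f(p || q) = sum_i q_i * f(p_i / q_i).

    Assumption: p and q are probability vectors on the same finite set,
    and every q_i is strictly positive.
    """
    if len(p) != len(q):
        # Both measures must live on the same underlying space.
        raise ValueError("p and q must be defined on the same finite set")
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def kl_generator(t):
    # f(t) = t log t, with the usual convention 0 log 0 = 0.
    return t * math.log(t) if t > 0 else 0.0


def tv_generator(t):
    # f(t) = |t - 1| / 2 recovers the total variation distance.
    return abs(t - 1) / 2


p = [0.5, 0.5]
q = [0.25, 0.75]

kl = f_divergence(p, q, kl_generator)  # Kullback-Leibler divergence of p from q
tv = f_divergence(p, q, tv_generator)  # total variation distance between p and q
```

Swapping the generator is the only change needed to move between the two quantities. The length check mirrors the restriction noted in the abstract: the definition presupposes that both measures sit on one common space.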

