We present a new way of studying Mercer kernels: to a special kernel $K$ we associate a pseudo-differential operator $p(\mathbf{x}, D)$ such that an operator built from $p(\mathbf{x}, D)$ and the Fourier transform $\mathcal{F}$ acts on smooth functions in the same way as the integral operator associated with $K$. We show that kernels defined by pseudo-differential operators can uniformly approximate any continuous Mercer kernel on a compact set. The symbol $p$ encapsulates a lot of useful information about the structure of the Maximum Mean Discrepancy (MMD) distance defined by the kernel $K$. We approximate $p$ with the sum of the first $r$ terms of its Singular Value Decomposition, denoted by $p_r$. If the ordered singular values of the integral operator associated with $p$ decay rapidly, the MMD distance defined by the new symbol $p_r$ differs only slightly from the initial one. Moreover, the new MMD distance can be interpreted as an aggregated result of comparing $r$ local moments of two probability distributions. The latter result holds under the condition that the right singular vectors of the integral operator associated with $p$ are uniformly bounded. Even if this condition is not satisfied, it still holds that the Hilbert-Schmidt distance between $p$ and $p_r$ vanishes. Thus, we report an interesting phenomenon: the MMD distance measures the difference between two probability distributions with respect to a certain number of local moments, $r^\ast$, and this number depends on how quickly the singular values of $p$ decay.
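For readers less familiar with the Maximum Mean Discrepancy, the following minimal sketch estimates the squared MMD between two samples. It is an illustration only: the Gaussian kernel, the bandwidth, the one-dimensional samples, and the names `gaussian_kernel` and `mmd_squared` are assumptions made here, not the pseudo-differential construction described in the abstract. The estimator is the standard biased (V-statistic) form, mean k(x, x') + mean k(y, y') - 2 mean k(x, y).

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on the real line; an assumed stand-in Mercer kernel."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    # Average kernel value within each sample and across the two samples.
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical data: two Gaussian samples whose means differ by one unit.
rng = random.Random(0)
sample_p = [rng.gauss(0.0, 1.0) for _ in range(200)]
sample_q = [rng.gauss(1.0, 1.0) for _ in range(200)]

same = mmd_squared(sample_p, sample_p)       # a sample compared with itself
different = mmd_squared(sample_p, sample_q)  # samples from shifted distributions

print(f"MMD^2(P, P) = {same:.6f}")
print(f"MMD^2(P, Q) = {different:.6f}")
```

Comparing a sample with itself gives an estimate of zero, while the shifted sample gives a clearly positive value. How strongly a given difference between distributions registers depends on the kernel, which is the dependence the abstract analyses through the symbol of the associated operator.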