
On the symmetrical Kullback-Leibler Jeffreys centroids

Abstract

Clustering histograms has become an important ingredient of modern information processing thanks to the success of the bag-of-words modeling paradigm. Histogram clustering can be performed with the celebrated k-means centroid-based algorithm. From the viewpoint of applications, symmetric distances are usually required. We consider the Jeffreys divergence, which symmetrizes the Kullback-Leibler divergence, and investigate the computation of centroids with respect to that distance. We first prove that the Jeffreys centroid of positive histograms can be expressed analytically in closed form using the Lambert W function. We then show how to obtain a fast, guaranteed tight approximation when dealing with frequency histograms. Finally, we conclude with some remarks on k-means histogram clustering.
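As a minimal sketch of the closed-form result for positive (unnormalized) histograms: minimizing the sum of Jeffreys divergences coordinate-wise yields the stationarity condition log(c_i/g_i) + 1 = a_i/c_i, where a_i and g_i are the arithmetic and geometric means of the i-th coordinates, which solves to c_i = a_i / W(e·a_i/g_i) with W the principal Lambert W branch. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np
from scipy.special import lambertw

def jeffreys_divergence(p, q):
    # J(p, q) = sum_i (p_i - q_i) * log(p_i / q_i), for positive arrays.
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum((p - q) * np.log(p / q)))

def jeffreys_centroid_positive(hists):
    # Closed-form Jeffreys centroid of positive histograms:
    # per coordinate, c_i = a_i / W(e * a_i / g_i), where a_i is the
    # arithmetic mean, g_i the geometric mean, and W the principal
    # Lambert W branch. Since a_i >= g_i (AM-GM), the argument of W
    # is >= e, so the principal branch is real and positive.
    H = np.asarray(hists, float)
    a = H.mean(axis=0)
    g = np.exp(np.log(H).mean(axis=0))
    return a / lambertw(np.e * a / g).real
```

For frequency (normalized) histograms the centroid must additionally satisfy the simplex constraint, which is where the paper's guaranteed tight approximation comes in; the closed form above applies only to the unconstrained positive case.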
