On the symmetrical Kullback-Leibler Jeffreys centroids

Due to the success of the bag-of-words modeling paradigm, clustering histograms has become an important ingredient of modern information processing. Clustering histograms can be performed using the celebrated k-means centroid-based algorithm. From the viewpoint of applications, it is usually required to deal with symmetric distances. In this letter, we consider the Jeffreys divergence that symmetrizes the Kullback-Leibler divergence, and investigate the computation of Jeffreys centroids. We first prove that the Jeffreys centroid can be expressed analytically using the Lambert W function for positive histograms. We then show how to obtain a fast guaranteed approximation when dealing with frequency histograms. Finally, we conclude with some remarks on k-means histogram clustering.
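To make the closed-form result concrete, here is a minimal sketch (not the authors' reference code) of the coordinate-wise formula for the Jeffreys positive centroid, c_i = a_i / W(e a_i / g_i), where a_i and g_i are the arithmetic and geometric means of the i-th bin across the input histograms and W is the principal branch of the Lambert W function, together with the normalization of that centroid as an approximation for frequency histograms. The function names and the use of SciPy's lambertw are illustrative assumptions.

```python
import numpy as np
from scipy.special import lambertw


def jeffreys_positive_centroid(H):
    """Jeffreys centroid of positive histograms (closed form).

    H: (n, d) array of n positive histograms with d bins.
    Each coordinate minimizes the sum of Jeffreys divergences and
    solves c_i = a_i / W(e * a_i / g_i), with a_i the arithmetic mean
    and g_i the geometric mean of bin i, W the principal Lambert branch.
    """
    a = H.mean(axis=0)                      # arithmetic mean per bin
    g = np.exp(np.log(H).mean(axis=0))      # geometric mean per bin
    return a / lambertw(np.e * a / g).real  # closed-form minimizer


def jeffreys_frequency_centroid_approx(H):
    """Approximate Jeffreys centroid of frequency histograms.

    Normalizes the positive centroid to the probability simplex
    (assumed here as the paper's fast guaranteed approximation).
    """
    c = jeffreys_positive_centroid(H)
    return c / c.sum()


# Usage: two frequency histograms over three bins.
H = np.array([[0.2, 0.3, 0.5],
              [0.4, 0.4, 0.2]])
print(jeffreys_positive_centroid(H))
print(jeffreys_frequency_centroid_approx(H))
```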