Stronger Coreset Bounds for Kernel Density Estimators via Chaining

Abstract

We apply the discrepancy method and a chaining approach to give improved bounds on the coreset complexity of a wide class of kernel functions. Our results give randomized polynomial time algorithms to produce coresets of size $O\big(\frac{\sqrt{d}}{\varepsilon}\sqrt{\log\log \frac{1}{\varepsilon}}\big)$ for the Gaussian and Laplacian kernels in the case that the data set is uniformly bounded, an improvement that was not possible with previous techniques. We also obtain coresets of size $O\big(\frac{1}{\varepsilon}\sqrt{\log\log \frac{1}{\varepsilon}}\big)$ for the Laplacian kernel for $d$ constant. Finally, we give the best known bounds of $O\big(\frac{\sqrt{d}}{\varepsilon}\sqrt{\log(2\max\{1,\alpha\})}\big)$ on the coreset complexity of the exponential, Hellinger, and JS kernels, where $1/\alpha$ is the bandwidth parameter of the kernel.
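For readers unfamiliar with the quantity being bounded, the following is a minimal sketch of how the error of a candidate coreset can be measured. It is not the algorithm of the paper: the coreset here is a plain uniform random subsample, used only as a baseline. The Gaussian kernel form exp(-α²‖x−y‖²), the helper names (`gaussian_kernel`, `kde`, `kde_error`), the data distribution, and all sizes are assumptions made for illustration. Because the error is evaluated on a finite set of query points, the result is only a lower bound on the true worst-case error over all queries.

```python
import math
import random


def gaussian_kernel(x, y, alpha=1.0):
    """Gaussian kernel with bandwidth 1/alpha (assumed form: exp(-alpha^2 * ||x - y||^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-(alpha ** 2) * sq_dist)


def kde(points, query, alpha=1.0):
    """Kernel density estimate at `query`: the average kernel value over `points`."""
    return sum(gaussian_kernel(p, query, alpha) for p in points) / len(points)


def kde_error(data, coreset, queries, alpha=1.0):
    """Largest absolute KDE difference between `data` and `coreset` over the given queries."""
    return max(abs(kde(data, q, alpha) - kde(coreset, q, alpha)) for q in queries)


# Hypothetical setup: bounded data in [-1, 1]^d and a uniform random subsample.
rng = random.Random(0)
d = 2
data = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(2000)]
queries = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(200)]
subsample = rng.sample(data, 100)

err = kde_error(data, subsample, queries)
print(f"max KDE error over sampled queries: {err:.4f}")
```

A subset Q of the data is an ε-coreset when this error is at most ε for every query point. Uniform sampling typically needs on the order of 1/ε² points to reach that guarantee, which is the baseline that near-1/ε bounds such as those above improve on.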
