Kernel Density Estimation on Embedded Manifolds with Boundary

We consider practical density estimation from large data sets sampled on manifolds embedded in Euclidean space. Existing density estimators on manifolds typically require prior knowledge of the geometry of the manifold, and existing theory for density estimation on embedded manifolds is restricted to compact manifolds without boundary. First, motivated by recent developments in kernel-based manifold learning, we show that it is possible to estimate the density on an embedded manifold using only the Euclidean distances between the data points in the ambient space. Second, we extend the theory of local kernels to a larger class of manifolds that includes many noncompact manifolds. This theory reveals that estimating the density without prior knowledge of the geometry introduces an additional bias term that depends on the extrinsic curvature of the embedding. Finally, we develop a boundary correction method that does not require any prior knowledge of the location of the boundary. In particular, we develop statistics that provably estimate the distance to and direction of the boundary, which allows us to apply a cut-and-normalize boundary correction. By combining multiple cut-and-normalize estimators, we introduce a consistent kernel density estimator that has uniform bias on the manifold and its boundary.
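As a rough illustration of two ideas named above, the following Python sketch (not the authors' implementation) shows a Gaussian kernel density estimate computed from ambient Euclidean distances but normalized by the intrinsic dimension, and a classical cut-and-normalize correction on an interval. The function names `kde_ambient` and `cut_and_normalize_1d`, the bandwidths, and the test data (a unit circle in the plane and the interval [0, 1]) are assumptions made for illustration. Unlike the method described in the abstract, the correction here assumes the boundary location is known.

```python
import math
import numpy as np


def kde_ambient(data, queries, h, d):
    """Gaussian KDE from ambient Euclidean distances.

    data:    (N, n) array of samples in the ambient space R^n
    queries: (M, n) array of evaluation points
    h:       bandwidth
    d:       intrinsic dimension of the manifold (assumed known here)
    """
    # Pairwise squared Euclidean distances, shape (M, N).
    diff = queries[:, None, :] - data[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * h ** 2))
    # Normalize with the intrinsic dimension d, not the ambient dimension n.
    return kernel.mean(axis=1) / (2.0 * math.pi * h ** 2) ** (d / 2.0)


def cut_and_normalize_1d(data, queries, h, lo, hi):
    """Classical cut-and-normalize on [lo, hi] with a KNOWN boundary.

    Divides the raw estimate by the kernel mass that falls inside [lo, hi].
    """
    raw = kde_ambient(data[:, None], queries[:, None], h, d=1)

    def normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    mass = np.array(
        [normal_cdf((hi - q) / h) - normal_cdf((lo - q) / h) for q in queries]
    )
    return raw / mass


rng = np.random.default_rng(0)

# Uniform samples on the unit circle in R^2 (intrinsic dimension 1).
theta = rng.uniform(0.0, 2.0 * math.pi, size=20000)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
angles = np.array([0.0, 1.0, 2.5, 4.0])
circle_queries = np.column_stack([np.cos(angles), np.sin(angles)])
circle_density = kde_ambient(circle, circle_queries, h=0.1, d=1)
print("circle estimate:", circle_density, "target:", 1.0 / (2.0 * math.pi))

# Uniform samples on [0, 1]; the raw estimate drops near the endpoints.
interval = rng.uniform(0.0, 1.0, size=20000)
interval_queries = np.array([0.0, 0.5, 1.0])
raw_density = kde_ambient(interval[:, None], interval_queries[:, None], 0.05, 1)
corrected_density = cut_and_normalize_1d(interval, interval_queries, 0.05, 0.0, 1.0)
print("raw:", raw_density)
print("cut-and-normalize:", corrected_density)
```

On the circle, the estimate should land near the uniform density 1/(2π) even though only chord distances in the plane are used. On the interval, the raw estimate loses roughly half its mass at the endpoints, and dividing by the in-domain kernel mass restores it. The second step relies on knowing where the boundary is, which is the assumption the abstract's method is designed to remove.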