Quantitative asymptotics of graphical projection pursuit

Abstract

There is a result of Diaconis and Freedman which says that, in a limiting sense, for large collections of high-dimensional data most one-dimensional projections of the data are approximately Gaussian. This paper gives quantitative versions of that result. For a set of deterministic vectors $\{x_i\}_{i=1}^n$ in $\mathbb{R}^d$ with $n$ and $d$ fixed, let $\theta\in\mathbb{S}^{d-1}$ be a random point of the sphere and let $\mu_n^\theta$ denote the random measure which puts mass $\frac{1}{n}$ at each of the points $\langle x_1,\theta\rangle,\ldots,\langle x_n,\theta\rangle$. For a fixed bounded Lipschitz test function $f$, $Z$ a standard Gaussian random variable and $\sigma^2$ a suitable constant, an explicit bound is derived for the quantity $\mathbb{P}\left[\left|\int f\,d\mu_n^\theta-\mathbb{E}f(\sigma Z)\right|>\epsilon\right]$. A bound is also given for $\mathbb{P}\left[d_{BL}(\mu_n^\theta, N(0,\sigma^2))>\epsilon\right]$, where $d_{BL}$ denotes the bounded-Lipschitz distance.
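The first quantity above can be estimated numerically. The following is only a minimal sketch under stated assumptions, not the paper's method: the data set `x` is a hypothetical one drawn once and then treated as fixed, the test function is $f=\cos$ (bounded and 1-Lipschitz, with the closed form $\mathbb{E}\cos(\sigma Z)=e^{-\sigma^2/2}$), and $\sigma^2$ is assumed to be $\frac{1}{nd}\sum_i |x_i|^2$, since the abstract only calls it "a suitable constant". The names `random_direction` and `projection_gap` are illustrative.

```python
import numpy as np


def random_direction(d, rng):
    """Draw theta uniformly from the unit sphere S^{d-1} by normalizing a Gaussian vector."""
    g = rng.standard_normal(d)
    return g / np.linalg.norm(g)


def projection_gap(x, theta, sigma):
    """Return |integral of f d(mu_n^theta) - E f(sigma Z)| for f = cos.

    mu_n^theta puts mass 1/n at each <x_i, theta>, so the integral is a plain mean.
    For f = cos, E cos(sigma Z) = exp(-sigma^2 / 2) exactly.
    """
    projections = x @ theta
    return abs(np.cos(projections).mean() - np.exp(-sigma**2 / 2))


rng = np.random.default_rng(0)
n, d = 1000, 100

# Hypothetical data set: generated once, then regarded as deterministic.
x = rng.uniform(-1.0, 1.0, size=(n, d))

# Assumption: sigma^2 is the mean squared norm of the x_i divided by d.
sigma = np.sqrt((x**2).sum() / (n * d))

# Sample many directions and estimate P[gap > eps] by the empirical fraction.
eps = 0.1
gaps = np.array(
    [projection_gap(x, random_direction(d, rng), sigma) for _ in range(100)]
)
fraction_exceeding = (gaps > eps).mean()
print(f"estimated P[gap > {eps}] = {fraction_exceeding:.3f}")
```

The empirical fraction is a Monte Carlo estimate of the probability over $\theta$; it illustrates what the bound controls but says nothing about the bound's actual form or constants.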
