Near-Optimal Bounds for Binary Embeddings of Arbitrary Sets

Samet Oymak
Ben Recht
Abstract

We study embedding a subset $K$ of the unit sphere into the Hamming cube $\{-1,+1\}^m$. We characterize the tradeoff between distortion and sample complexity $m$ in terms of the Gaussian width $\omega(K)$ of the set. For subspaces and several structured sets, we show that Gaussian maps provide the optimal tradeoff $m\sim\delta^{-2}\omega^2(K)$; in particular, for distortion $\delta$ one needs $m\approx\delta^{-2}d$, where $d$ is the subspace dimension. For general sets, we provide sharp characterizations, which reduce to $m\approx\delta^{-4}\omega^2(K)$ after simplification. We provide improved results for local embedding of points that are in close proximity to each other, which is related to locality-sensitive hashing. We also discuss faster binary embedding, where one takes advantage of an initial sketching procedure based on the Fast Johnson-Lindenstrauss Transform. Finally, we list several numerical observations and discuss open problems.
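As a rough illustration of the Gaussian maps mentioned above, the following minimal sketch embeds unit vectors into $\{-1,+1\}^m$ by taking the signs of a random Gaussian projection. It relies on the standard fact that, for a Gaussian row $a$, the signs of $\langle a,x\rangle$ and $\langle a,y\rangle$ disagree with probability equal to the angle between $x$ and $y$ divided by $\pi$, so the normalized Hamming distance between the codes estimates the normalized angular distance. The function names (`binary_embed`, `normalized_hamming`, `angular_distance`) and the parameter values are illustrative assumptions, not taken from the paper, and the sketch assumes distortion is measured against normalized angular distance.

```python
import numpy as np

def binary_embed(X, m, rng):
    """Map each row of X (a unit vector in R^n) to {-1,+1}^m via sign(A x).

    A is an m x n matrix with i.i.d. standard normal entries
    (hypothetical helper, for illustration only).
    """
    n = X.shape[1]
    A = rng.standard_normal((m, n))
    B = np.sign(X @ A.T)
    B[B == 0] = 1  # exact zeros have probability zero; map them to +1 for safety
    return B

def normalized_hamming(b1, b2):
    """Fraction of coordinates in which two binary codes differ."""
    return np.mean(b1 != b2)

def angular_distance(x, y):
    """Angle between unit vectors x and y, normalized to lie in [0, 1]."""
    return np.arccos(np.clip(x @ y, -1.0, 1.0)) / np.pi

rng = np.random.default_rng(0)
n, m = 50, 20000  # ambient dimension and number of bits (arbitrary choices)

# Two random points on the unit sphere.
X = rng.standard_normal((2, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)

B = binary_embed(X, m, rng)
err = abs(normalized_hamming(B[0], B[1]) - angular_distance(X[0], X[1]))
print(f"angular distance: {angular_distance(X[0], X[1]):.4f}")
print(f"Hamming estimate: {normalized_hamming(B[0], B[1]):.4f}")
print(f"absolute error:   {err:.4f}")
```

For a single fixed pair of points the error shrinks roughly like $1/\sqrt{m}$; the bounds in the abstract concern the harder requirement that the error stay below $\delta$ simultaneously for all pairs in $K$.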
