
Near Optimal Sketching of Low-Rank Tensor Regression

Abstract

We study the least squares regression problem \begin{align*} \min_{\Theta \in \mathcal{S}_{\odot D,R}} \|A\Theta-b\|_2, \end{align*} where $\mathcal{S}_{\odot D,R}$ is the set of $\Theta$ for which $\Theta = \sum_{r=1}^{R} \theta_1^{(r)} \circ \cdots \circ \theta_D^{(r)}$ for vectors $\theta_d^{(r)} \in \mathbb{R}^{p_d}$ for all $r \in [R]$ and $d \in [D]$, and $\circ$ denotes the outer product of vectors. That is, $\Theta$ is a low-dimensional, low-rank tensor. This is motivated by the fact that the number of parameters in $\Theta$ is only $R \cdot \sum_{d=1}^{D} p_d$, which is significantly smaller than the $\prod_{d=1}^{D} p_d$ number of parameters in ordinary least squares regression. We consider the above CP decomposition model of tensors $\Theta$, as well as the Tucker decomposition. For both models we show how to apply data dimensionality reduction techniques based on {\it sparse} random projections $\Phi \in \mathbb{R}^{m \times n}$, with $m \ll n$, to reduce the problem to a much smaller problem $\min_{\Theta} \|\Phi A \Theta - \Phi b\|_2$, for which if $\Theta'$ is a near-optimum to the smaller problem, then it is also a near-optimum to the original problem. We obtain significantly smaller dimension and sparsity in $\Phi$ than is possible for ordinary least squares regression, and we also provide a number of numerical simulations supporting our theory.
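As an illustration of the sketch-and-solve idea described above (not the paper's exact algorithm or guarantees), the following minimal Python sketch assumes the CP model with $D = 2$ and $R = 1$: it applies a CountSketch-style sparse embedding $\Phi$ implicitly to $(A, b)$, then runs alternating least squares on the reduced problem $\min_{\theta_1, \theta_2} \|\Phi A\, \mathrm{vec}(\theta_1 \circ \theta_2) - \Phi b\|_2$. All sizes and the synthetic data are placeholders for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: n observations, Theta is p1 x p2 and rank 1, sketch dimension m << n.
n, p1, p2, m = 5000, 20, 30, 400

# Synthetic data: A acts on the (column-major) vectorization of Theta,
# and the ground-truth Theta is the rank-1 outer product th1_true o th2_true.
A = rng.standard_normal((n, p1 * p2))
th1_true = rng.standard_normal(p1)
th2_true = rng.standard_normal(p2)
b = A @ np.kron(th2_true, th1_true) + 0.01 * rng.standard_normal(n)

# Sparse embedding Phi in R^{m x n} with one random +-1 entry per column (CountSketch).
# It is applied implicitly: SA = Phi @ A and Sb = Phi @ b, without materializing Phi.
rows = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)
SA = np.zeros((m, p1 * p2))
Sb = np.zeros(m)
np.add.at(SA, rows, signs[:, None] * A)
np.add.at(Sb, rows, signs * b)

# Alternating least squares on the sketched problem
#   min_{th1, th2} || Phi A vec(th1 o th2) - Phi b ||_2 .
SA3 = SA.reshape(m, p2, p1)          # SA3[:, j, i] multiplies th1[i] * th2[j]
th1 = rng.standard_normal(p1)
th2 = rng.standard_normal(p2)
for _ in range(50):
    M1 = np.einsum('nji,j->ni', SA3, th2)     # design matrix for th1 with th2 fixed
    th1 = np.linalg.lstsq(M1, Sb, rcond=None)[0]
    M2 = np.einsum('nji,i->nj', SA3, th1)     # design matrix for th2 with th1 fixed
    th2 = np.linalg.lstsq(M2, Sb, rcond=None)[0]

# Evaluate the recovered rank-1 Theta on the original (unsketched) objective.
resid = np.linalg.norm(A @ np.kron(th2, th1) - b) / np.linalg.norm(b)
print(f"relative residual on the full problem: {resid:.4f}")
```

Because each column of $\Phi$ has a single nonzero, the sketch costs time proportional to the number of nonzeros of $A$, and only the small $m \times p_1 p_2$ problem is touched during the alternating updates.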
