On the computational and statistical complexity of over-parameterized matrix sensing

Abstract

We consider solving the low-rank matrix sensing problem with the Factorized Gradient Descent (FGD) method when the true rank is unknown and over-specified, which we refer to as over-parameterized matrix sensing. If the ground truth signal $\mathbf{X}^* \in \mathbb{R}^{d \times d}$ is of rank $r$, but we try to recover it using $\mathbf{F} \mathbf{F}^\top$ where $\mathbf{F} \in \mathbb{R}^{d \times k}$ and $k > r$, the existing statistical analysis falls short, due to a flat local curvature of the loss function around the global minima. By decomposing the factorized matrix $\mathbf{F}$ into separate column spaces to capture the effect of the extra ranks, we show that $\|\mathbf{F}_t \mathbf{F}_t^\top - \mathbf{X}^*\|_F^2$ converges to a statistical error of $\tilde{\mathcal{O}}(k d \sigma^2 / n)$ after $\tilde{\mathcal{O}}(\frac{\sigma_r}{\sigma}\sqrt{\frac{n}{d}})$ iterations, where $\mathbf{F}_t$ is the output of FGD after $t$ iterations, $\sigma^2$ is the variance of the observation noise, $\sigma_r$ is the $r$-th largest eigenvalue of $\mathbf{X}^*$, and $n$ is the number of samples. Our results, therefore, offer a comprehensive picture of the statistical and computational complexity of FGD for the over-parameterized matrix sensing problem.
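
To make the setup concrete, here is a minimal NumPy sketch of FGD for over-parameterized matrix sensing. It is illustrative only: the quadratic loss $\frac{1}{2n}\sum_i (y_i - \langle \mathbf{A}_i, \mathbf{F}\mathbf{F}^\top \rangle)^2$, Gaussian sensing matrices, step size, and small random initialization are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def fgd_matrix_sensing(A, y, k, eta=0.002, n_iters=2000, seed=0):
    """Factorized Gradient Descent (FGD) with an over-specified factor rank k.

    Recovers a PSD matrix X* ~= F F^T from observations y_i ~= <A_i, X*> + noise.
    The hyperparameters and the small random initialization are illustrative
    assumptions, not the scheme analyzed in the paper.

    A : (n, d, d) array of sensing matrices
    y : (n,)      array of noisy linear measurements
    """
    rng = np.random.default_rng(seed)
    n, d, _ = A.shape
    F = 0.01 * rng.standard_normal((d, k))           # small random initialization
    A_sym = A + np.transpose(A, (0, 2, 1))           # precompute A_i + A_i^T
    for _ in range(n_iters):
        X_hat = F @ F.T
        resid = np.einsum('nij,ij->n', A, X_hat) - y          # <A_i, F F^T> - y_i
        grad = np.einsum('n,nij->ij', resid, A_sym) @ F / n   # grad of (1/2n) * sum resid^2
        F -= eta * grad
    return F

# Tiny synthetic example: true rank r = 2, over-specified factor rank k = 5.
d, r, k, n, sigma = 20, 2, 5, 400, 0.01
rng = np.random.default_rng(1)
U = rng.standard_normal((d, r))
X_star = U @ U.T                                     # rank-r ground truth
A = rng.standard_normal((n, d, d))                   # Gaussian sensing matrices
y = np.einsum('nij,ij->n', A, X_star) + sigma * rng.standard_normal(n)
F_hat = fgd_matrix_sensing(A, y, k)
print("||F F^T - X*||_F^2 =", np.linalg.norm(F_hat @ F_hat.T - X_star, 'fro') ** 2)
```

Because $k > r$, the final Frobenius error printed above does not decay to zero; it settles at a noise-dependent floor, consistent with the $\tilde{\mathcal{O}}(k d \sigma^2 / n)$ statistical error discussed in the abstract.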
