
Dropping Convexity for Faster Semi-definite Optimization

Srinadh Bhojanapalli
Anastasios Kyrillidis
Sujay Sanghavi
Abstract

In this paper, we study the minimization of a convex function $f(X)$ over the space of $n\times n$ positive semidefinite matrices $X\succeq 0$, but when the problem is recast as the non-convex problem $\min_U g(U)$ where $g(U) := f(UU^\top)$, with $U$ being an $n\times r$ matrix and $r\leq n$. We study the performance of gradient descent on $g$ -- which we refer to as Factored Gradient Descent (FGD) -- under standard assumptions on the original function $f$. We provide a rule for selecting the step size, and with this choice show that the local convergence rate of FGD mirrors that of standard gradient descent on the original $f$ -- the error after $k$ steps is $O(1/k)$ for smooth $f$, and exponentially small in $k$ when $f$ is (restricted) strongly convex. Note that $g$ is not locally convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives and when one only has access to $f$ via a first-order oracle. FGD and similar procedures are widely used in practice for problems that can be posed as matrix factorization; to the best of our knowledge, ours is the first paper to provide precise convergence rate guarantees for general convex functions under standard convex assumptions.
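To make the FGD recipe concrete, below is a minimal NumPy sketch of gradient descent on $g(U) = f(UU^\top)$, using the chain rule $\nabla g(U) = (\nabla f(UU^\top) + \nabla f(UU^\top)^\top)\, U$. The function name `fgd`, the toy quadratic objective $f(X) = \tfrac{1}{2}\|X - M\|_F^2$, the small random initialization, and the conservative default step size are illustrative assumptions for this sketch; they are not the paper's exact step-size rule or initialization procedure.

```python
import numpy as np


def fgd(grad_f, U0, num_iters=500, step_size=None):
    """Gradient descent on g(U) = f(U U^T) over n x r factors U.

    grad_f(X) must return the gradient of the original convex f at the
    symmetric n x n matrix X = U U^T.
    """
    U = U0.copy()
    X0 = U @ U.T
    if step_size is None:
        # Illustrative conservative constant step; the paper derives its own
        # rule from the smoothness of f and the initial point.
        step_size = 1.0 / (16.0 * (np.linalg.norm(grad_f(X0), 2)
                                   + np.linalg.norm(X0, 2)))
    for _ in range(num_iters):
        G = grad_f(U @ U.T)                 # gradient of f at X = U U^T
        U = U - step_size * (G + G.T) @ U   # chain rule: grad g(U) = (G + G^T) U
    return U


if __name__ == "__main__":
    # Toy objective f(X) = 0.5 * ||X - M||_F^2 with M a rank-r PSD matrix.
    rng = np.random.default_rng(0)
    n, r = 50, 3
    A = rng.standard_normal((n, r))
    M = A @ A.T                             # ground-truth PSD matrix
    grad_f = lambda X: X - M                # gradient of the quadratic objective
    U0 = 0.1 * rng.standard_normal((n, r))  # small random init (not the paper's procedure)
    U_hat = fgd(grad_f, U0, num_iters=3000)
    err = np.linalg.norm(U_hat @ U_hat.T - M) / np.linalg.norm(M)
    print(f"relative Frobenius error: {err:.2e}")
```

The sketch only needs first-order access to $f$ through `grad_f`, matching the oracle model described in the abstract; each iteration costs one gradient evaluation plus $O(n^2 r)$ arithmetic for the factored update.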
