Dropping Convexity for Faster Semi-definite Optimization

In this paper, we study the minimization of a convex function f(X) over the space of n x n positive semidefinite matrices X, but when the problem is recast as the non-convex problem min_U g(U) := f(UU^T), with U being an n x r matrix and r <= n. We study the performance of gradient descent on g -- which we refer to as Factored Gradient Descent (FGD) -- under standard assumptions on the original function f. We provide a rule for selecting the step size and, with this choice, show that the local convergence rate of FGD mirrors that of standard gradient descent on the original f: the error after k steps is O(1/k) for smooth f, and exponentially small in k when f is (restricted) strongly convex. Note that g is not locally convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives and when one only has access to f via a first-order oracle. FGD and similar procedures are widely used in practice for problems that can be posed as matrix factorization; to the best of our knowledge, ours is the first paper to provide precise convergence rate guarantees for general convex functions under standard convex assumptions.
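As a rough illustration of the factored approach described above, the following is a minimal Python/NumPy sketch of gradient descent on g(U) = f(UU^T) for the toy convex objective f(X) = 0.5 * ||X - M||_F^2. The fixed step size, random initialization, and iteration count here are ad hoc choices for the example, not the step-size rule or initialization procedure analyzed in the paper.

```python
import numpy as np

# Minimal sketch of Factored Gradient Descent (FGD): minimize g(U) = f(U U^T)
# by plain gradient descent on the n x r factor U, for the toy convex objective
# f(X) = 0.5 * ||X - M||_F^2 with a fixed rank-r PSD target M.
# The step size and initialization below are ad hoc, not the paper's rule.

rng = np.random.default_rng(0)
n, r = 50, 5

U_true = rng.standard_normal((n, r))
M = U_true @ U_true.T                      # rank-r PSD target

def grad_f(X):
    # Gradient of f(X) = 0.5 * ||X - M||_F^2.
    return X - M

U = 0.1 * rng.standard_normal((n, r))      # small random start (illustrative only)
eta = 1.0 / (4.0 * np.linalg.norm(M, 2))   # heuristic step size based on ||M||_2

for k in range(500):
    X = U @ U.T
    # Chain rule: grad_U g(U) = (grad_f(X) + grad_f(X)^T) U; grad_f(X) is symmetric here.
    U -= eta * 2.0 * grad_f(X) @ U

print("||U U^T - M||_F =", np.linalg.norm(U @ U.T - M, "fro"))
```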