Dropping Convexity for Faster Semi-definite Optimization

Srinadh Bhojanapalli
Anastasios Kyrillidis
Sujay Sanghavi
Abstract

We study the minimization of a convex function $f(X)$ over the set of $n \times n$ positive semi-definite matrices, where the problem is recast as $\min_U g(U) := f(UU^\top)$, with $U \in \mathbb{R}^{n \times r}$ and $r \leq n$. We study the performance of gradient descent on $g$, which we refer to as Factored Gradient Descent (FGD), under standard assumptions on the original function $f$. We provide a rule for selecting the step size and, with this choice, show that the local convergence rate of FGD mirrors that of standard gradient descent on the original $f$: i.e., after $k$ steps, the error is $O(1/k)$ for smooth $f$, and exponentially small in $k$ when $f$ is (restricted) strongly convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives when one only has access to $f$ via a first-order oracle; for several problem instances, such proper initialization leads to global convergence guarantees. FGD and similar procedures are widely used in practice for problems that can be posed as matrix factorization. To the best of our knowledge, this is the first paper to provide precise convergence rate guarantees for general convex functions under standard convex assumptions.
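To make the factored formulation concrete, the following is a minimal sketch of gradient descent on $g(U) = f(UU^\top)$, not the paper's reference implementation. The test objective $f(X) = \tfrac{1}{2}\|X - M\|_F^2$, the small random initialization, the fixed step size `eta`, and all function and variable names are assumptions chosen for illustration; in particular, `eta` is a conservative hand-picked constant and not the step-size rule or initialization procedure the abstract refers to. The gradient follows from the chain rule, $\nabla g(U) = (\nabla f(X) + \nabla f(X)^\top)\,U$ with $X = UU^\top$.

```python
import numpy as np

def fgd(grad_f, U0, eta, num_iters):
    """Plain gradient descent on g(U) = f(U U^T) (illustrative sketch)."""
    U = U0.copy()
    for _ in range(num_iters):
        G = grad_f(U @ U.T)          # gradient of f evaluated at X = U U^T
        U = U - eta * (G + G.T) @ U  # chain rule: grad g(U) = (G + G^T) U
    return U

# Hypothetical test problem: recover a rank-r PSD matrix M.
rng = np.random.default_rng(0)
n, r = 20, 3
U_star = rng.standard_normal((n, r))
M = U_star @ U_star.T

def grad_f(X):
    # f(X) = 0.5 * ||X - M||_F^2, so grad f(X) = X - M
    return X - M

# Assumed initialization and step size (not the paper's rules).
U0 = 0.1 * rng.standard_normal((n, r))
eta = 1.0 / (8.0 * np.linalg.norm(M, 2))  # scaled by the spectral norm of M

U_hat = fgd(grad_f, U0, eta, num_iters=2000)

err0 = np.linalg.norm(U0 @ U0.T - M, "fro")
err = np.linalg.norm(U_hat @ U_hat.T - M, "fro")
print(f"initial error {err0:.3e}, final error {err:.3e}")
```

Each iteration touches only the $n \times r$ factor and never projects onto the positive semi-definite cone, since $UU^\top$ is positive semi-definite by construction; this is the computational appeal of the factored form when $r \ll n$.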
