We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix X with gradient descent on a factorization of X. We conjecture, and provide empirical and theoretical evidence, that with small enough step sizes and initialization close enough to the origin, gradient descent on a full-dimensional factorization converges to the minimum nuclear norm solution.
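The setting can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the authors' experimental code: it assumes a positive semidefinite unknown written as X = U Uᵀ with a square (full-dimensional) U, random symmetric Gaussian measurement matrices, and a least-squares objective. The function names (`make_problem`, `loss`, `factored_gd`), the problem sizes, the step size, and the initialization scale `alpha` are all hypothetical choices made for the example.

```python
import numpy as np

def make_problem(n=8, m=24, r=2, seed=0):
    """Hypothetical test problem: m < n(n+1)/2 linear measurements of a rank-r PSD matrix."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n, r))
    X_star = W @ W.T
    X_star /= np.linalg.norm(X_star)                 # scale to unit Frobenius norm
    B = rng.standard_normal((m, n, n))
    # Symmetric measurement matrices A_i, scaled by 1/sqrt(m) (an assumed normalization).
    A = (B + B.transpose(0, 2, 1)) / (2.0 * np.sqrt(m))
    y = np.einsum('kij,ij->k', A, X_star)            # y_i = <A_i, X_star>
    return A, y, X_star

def loss(A, y, X):
    """Quadratic objective 0.5 * sum_i (<A_i, X> - y_i)^2."""
    r = np.einsum('kij,ij->k', A, X) - y
    return 0.5 * float(r @ r)

def factored_gd(A, y, alpha=1e-3, step=1e-2, iters=5000):
    """Gradient descent on U for the objective loss(A, y, U @ U.T)."""
    n = A.shape[1]
    U = alpha * np.eye(n)                            # full-dimensional factor near the origin
    for _ in range(iters):
        r = np.einsum('kij,ij->k', A, U @ U.T) - y   # residuals
        G = np.einsum('k,kij->ij', r, A)             # sum_i r_i A_i
        U = U - step * 2.0 * (G @ U)                 # gradient is 2 G U when every A_i is symmetric
    return U @ U.T

A, y, X_star = make_problem()
X_gd = factored_gd(A, y)
print("loss at initialization:", loss(A, y, 1e-6 * np.eye(8)))
print("loss after descent    :", loss(A, y, X_gd))
print("nuclear norm, GD      :", np.linalg.norm(X_gd, 'nuc'))
print("nuclear norm, X_star  :", np.linalg.norm(X_star, 'nuc'))
```

Comparing the two printed nuclear norms, and rerunning with a larger `alpha` or `step`, gives a rough feel for how the conjectured behavior depends on small step sizes and near-origin initialization. A single random instance like this is only suggestive and proves nothing about the conjecture.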