Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion

24 August 2022
G. Zhang
Hong-Ming Chiu
Richard Y. Zhang
Abstract

The matrix completion problem seeks to recover a d×d ground truth matrix of low rank r ≪ d from observations of its individual elements. Real-world matrix completion is often a huge-scale optimization problem, with d so large that even the simplest full-dimension vector operations with O(d) time complexity become prohibitively expensive. Stochastic gradient descent (SGD) is one of the few algorithms capable of solving matrix completion on a huge scale, and it can also naturally handle streaming data over an evolving ground truth. Unfortunately, SGD experiences a dramatic slow-down when the underlying ground truth is ill-conditioned; it requires at least O(κ log(1/ε)) iterations to get ε-close to a ground truth matrix with condition number κ. In this paper, we propose a preconditioned version of SGD that preserves all the favorable practical qualities of SGD for huge-scale online optimization while also making it agnostic to κ. For a symmetric ground truth and the Root Mean Square Error (RMSE) loss, we prove that the preconditioned SGD converges to ε-accuracy in O(log(1/ε)) iterations, with a rapid linear convergence rate as if the ground truth were perfectly conditioned with κ = 1. In our experiments, we observe a similar acceleration for item-item collaborative filtering on the MovieLens25M dataset via a pairwise ranking loss, with 100 million training pairs and 10 million testing pairs. [See supporting code at https://github.com/Hong-Ming/ScaledSGD.]
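For intuition, here is a minimal Python sketch of the kind of entry-wise preconditioned update the abstract describes, for the symmetric case M ≈ UUᵀ under the RMSE loss. The function name, the choice to pass the inverse Gram matrix P = (UᵀU)⁻¹ explicitly, and the omission of its incremental maintenance are illustrative assumptions, not the actual API of the ScaledSGD repository.

```python
import numpy as np

def scaled_sgd_step(U, i, j, y_ij, lr, P):
    """One preconditioned SGD step on a single observed entry (i, j).

    Illustrative sketch (not the repository's API); assumes i != j.
      U    : (d, r) factor matrix, so the model is M ≈ U @ U.T
      y_ij : the observed entry M[i, j]
      lr   : step size
      P    : (r, r) preconditioner, here (U.T @ U)^{-1}; in practice it
             would be maintained incrementally so the per-step cost stays
             independent of d.
    """
    resid = U[i] @ U[j] - y_ij          # residual on this single entry
    # Gradients of 0.5 * resid**2 with respect to rows i and j of U,
    # computed before either row is modified.
    g_i = resid * U[j]
    g_j = resid * U[i]
    # Right-multiplying the row gradients by P = (U^T U)^{-1} is the
    # preconditioning step that removes the dependence on the condition
    # number kappa; plain SGD would apply g_i and g_j directly.
    U[i] = U[i] - lr * (g_i @ P)
    U[j] = U[j] - lr * (g_j @ P)
    return U
```

In this form each step touches only two rows of U and costs on the order of r² work, which is what keeps the method viable at huge scale; the acceleration claimed in the abstract is that, with this preconditioning, the number of such steps needed is O(log(1/ε)) regardless of κ.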
