Cubic regularized subspace Newton for non-convex optimization

Abstract

This paper addresses the optimization problem of minimizing non-convex continuous functions, which is relevant in the context of high-dimensional machine learning applications characterized by over-parametrization. We analyze a randomized coordinate second-order method named SSCN which can be interpreted as applying cubic regularization in random subspaces. This approach effectively reduces the computational complexity associated with utilizing second-order information, rendering it applicable in higher-dimensional scenarios. Theoretically, we establish convergence guarantees for non-convex functions, with interpolating rates for arbitrary subspace sizes and allowing inexact curvature estimation. As the subspace size increases, our complexity matches the $\mathcal{O}(\epsilon^{-3/2})$ rate of cubic regularization (CR). Additionally, we propose an adaptive sampling scheme ensuring an exact convergence rate of $\mathcal{O}(\epsilon^{-3/2}, \epsilon^{-3})$ to a second-order stationary point, even without sampling all coordinates. Experimental results demonstrate substantial speed-ups achieved by SSCN compared to conventional first-order methods.
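
As a rough illustration of what "cubic regularization in random subspaces" can look like in practice, here is a minimal NumPy sketch. It is not the authors' implementation. The function names `cubic_step` and `sscn`, the fixed regularization constant `M`, the uniform coordinate sampling, and the toy objective are all assumptions made for this example. For brevity the sketch evaluates the full gradient and Hessian and then slices them, which gives up the computational savings the abstract describes, and it ignores the degenerate "hard case" of the cubic subproblem.

```python
import numpy as np


def cubic_step(g, H, M):
    """Approximately minimize g@h + 0.5*h@H@h + (M/6)*||h||^3 over h.

    Uses an eigendecomposition and bisection on r = ||h||.
    The degenerate "hard case" is not treated specially (assumption).
    """
    lam, Q = np.linalg.eigh(H)          # eigenvalues in ascending order
    gt = Q.T @ g                        # gradient in the eigenbasis

    def step_norm(r):
        # Norm of h(r) = -(H + (M/2) r I)^{-1} g
        return np.linalg.norm(gt / (lam + 0.5 * M * r))

    # The shifted matrix must be positive definite: r > -2*lam_min/M
    r_lo = max(0.0, -2.0 * lam[0] / M)
    r_hi = max(1.0, 2.0 * r_lo)
    while step_norm(r_hi) > r_hi:       # grow until the root is bracketed
        r_hi *= 2.0
    for _ in range(200):                # bisection on ||h(r)|| = r
        r = 0.5 * (r_lo + r_hi)
        if step_norm(r) > r:
            r_lo = r
        else:
            r_hi = r
    return -Q @ (gt / (lam + 0.5 * M * r_hi))


def sscn(grad, hess, x0, tau, M, iters, seed=0):
    """Cubic-regularized Newton steps on random coordinate subsets of size tau."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        S = rng.choice(x.size, size=tau, replace=False)
        # A practical version would compute only these blocks directly.
        g_S = grad(x)[S]
        H_S = hess(x)[np.ix_(S, S)]
        x[S] += cubic_step(g_S, H_S, M)
    return x


# Toy non-convex objective (assumption): f(x) = sum(x^4/4 - x^2/2)
f = lambda x: np.sum(0.25 * x**4 - 0.5 * x**2)
grad = lambda x: x**3 - x
hess = lambda x: np.diag(3.0 * x**2 - 1.0)

x0 = np.full(10, 0.5)
x_final = sscn(grad, hess, x0, tau=3, M=12.0, iters=300)
```

In this sketch `tau` plays the role of the subspace size: setting `tau` equal to the dimension recovers a full cubic-regularized Newton step, while smaller values touch only a `tau x tau` Hessian block per iteration.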

@article{zhao2025_2406.16666,
  title={Cubic regularized subspace Newton for non-convex optimization},
  author={Jim Zhao and Aurelien Lucchi and Nikita Doikov},
  journal={arXiv preprint arXiv:2406.16666},
  year={2025}
}