Extensions to the Proximal Distance Method of Constrained Optimization

Abstract

The current paper studies the problem of minimizing a loss $f(\boldsymbol{x})$ subject to constraints of the form $\boldsymbol{D}\boldsymbol{x} \in S$, where $S$ is a closed set, convex or not, and $\boldsymbol{D}$ is a matrix that fuses parameters. Fusion constraints can capture smoothness, sparsity, or more general constraint patterns. To tackle this generic class of problems, we combine the Beltrami-Courant penalty method with the proximal distance principle. The latter is driven by minimization of penalized objectives $f(\boldsymbol{x})+\frac{\rho}{2}\text{dist}(\boldsymbol{D}\boldsymbol{x},S)^2$ involving large tuning constants $\rho$ and the squared Euclidean distance of $\boldsymbol{D}\boldsymbol{x}$ from $S$. The next iterate $\boldsymbol{x}_{n+1}$ of the corresponding proximal distance algorithm is constructed from the current iterate $\boldsymbol{x}_n$ by minimizing the majorizing surrogate function $f(\boldsymbol{x})+\frac{\rho}{2}\|\boldsymbol{D}\boldsymbol{x}-\mathcal{P}_{S}(\boldsymbol{D}\boldsymbol{x}_n)\|^2$. For fixed $\rho$, a subanalytic loss $f(\boldsymbol{x})$, and a subanalytic constraint set $S$, we prove convergence to a stationary point. Under stronger assumptions, we provide convergence rates and demonstrate linear local convergence. We also construct a steepest descent (SD) variant to avoid costly linear system solves. To benchmark our algorithms, we compare against the alternating direction method of multipliers (ADMM). Our extensive numerical tests include problems on metric projection, convex regression, convex clustering, total variation image denoising, and projection of a matrix to a good condition number. These experiments demonstrate the superior speed and acceptable accuracy of our steepest descent variant on high-dimensional problems.
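
To make the update concrete, the following is a minimal sketch of one proximal distance iteration, not the authors' implementation. It assumes a quadratic loss $f(\boldsymbol{x}) = \frac{1}{2}\|\boldsymbol{x}-\boldsymbol{y}\|^2$, a first-difference fusion matrix $\boldsymbol{D}$, and $S$ equal to the nonnegative orthant; the names y, D, rho, and project_S are illustrative choices, and for this loss the surrogate minimization reduces to solving the linear system $(\boldsymbol{I} + \rho\boldsymbol{D}^T\boldsymbol{D})\boldsymbol{x} = \boldsymbol{y} + \rho\boldsymbol{D}^T\mathcal{P}_S(\boldsymbol{D}\boldsymbol{x}_n)$.

    # Sketch of a proximal distance (MM) step under the assumptions above.
    import numpy as np

    def project_S(z):
        # Projection onto the example constraint set S (nonnegative orthant).
        return np.maximum(z, 0.0)

    def proximal_distance_step(x_n, y, D, rho):
        # Majorize dist(Dx, S)^2 by ||Dx - P_S(D x_n)||^2 and minimize the surrogate
        #   0.5*||x - y||^2 + (rho/2)*||Dx - P_S(D x_n)||^2,
        # whose stationarity condition is (I + rho*D^T D) x = y + rho*D^T P_S(D x_n).
        p = project_S(D @ x_n)
        A = np.eye(len(y)) + rho * D.T @ D
        b = y + rho * D.T @ p
        return np.linalg.solve(A, b)

    # Toy usage: fuse adjacent coordinates by first differences, require D x >= 0.
    rng = np.random.default_rng(0)
    y = rng.normal(size=5)
    D = np.diff(np.eye(5), axis=0)   # first-difference fusion matrix
    x = y.copy()
    for _ in range(50):
        x = proximal_distance_step(x, y, D, rho=100.0)
    print(x)

In this toy instance the solved linear system plays the role of the costly step that the steepest descent variant described in the abstract is designed to avoid in high dimensions.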
