A Quasi-Newton Approach to Nonsmooth Convex Optimization
We extend the well-known BFGS quasi-Newton method and its limited-memory variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We apply the resulting subLBFGS algorithm to L2-regularized risk minimization with the binary hinge loss. To extend our algorithm to the multiclass and multilabel settings, we develop a new, efficient, exact line search algorithm. We prove its worst-case time complexity bounds, and show that it can also extend a recently developed bundle method to the multiclass and multilabel settings. We also apply the direction-finding component of our algorithm to L1-regularized risk minimization with logistic loss. In all these contexts our methods perform comparably to, or better than, specialized state-of-the-art solvers on a number of publicly available datasets. Open source software implementing our algorithms is freely available for download.
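To make the problem setting concrete, the sketch below shows a standard formulation of the L2-regularized binary hinge-loss risk named in the abstract, together with one element of its subdifferential. This is an illustrative assumption about the objective, not the paper's subLBFGS implementation; the function name and the choice of subgradient at the hinge are hypothetical.

```python
import numpy as np

def hinge_objective_and_subgradient(w, X, y, lam):
    """Standard L2-regularized binary hinge-loss risk (illustrative sketch):
        J(w) = lam/2 * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i * <w, x_i>)
    with X of shape (n, d), labels y in {-1, +1}, and regularizer lam > 0.
    Returns J(w) and one element of its subdifferential at w.
    Not the authors' code; just the conventional objective it targets."""
    margins = y * (X @ w)                     # y_i * <w, x_i> per example
    losses = np.maximum(0.0, 1.0 - margins)   # per-example hinge loss
    obj = 0.5 * lam * (w @ w) + losses.mean()
    # The hinge is nondifferentiable where margin == 1; any coefficient in
    # [-1, 0] there yields a valid subgradient. Taking 0 at the hinge picks
    # one concrete element of the subdifferential.
    active = margins < 1.0
    subgrad = lam * w - (y[active] @ X[active]) / len(y)
    return obj, subgrad
```

The nondifferentiability at the hinge is exactly what forces the generalizations described above: a single gradient is replaced by a subdifferential, so the quadratic model, descent direction, and Wolfe conditions must all be restated in terms of subgradients.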