Variance Reduction for Matrix Games

Abstract

We present a randomized primal-dual algorithm that solves the problem $\min_{x} \max_{y} y^\top A x$ to additive error $\epsilon$ in time $\mathrm{nnz}(A) + \sqrt{\mathrm{nnz}(A)\,n}/\epsilon$, for matrix $A$ with larger dimension $n$ and $\mathrm{nnz}(A)$ nonzero entries. This improves the best known exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/n}$ and is faster than fully stochastic gradient methods in the accurate and/or sparse regime $\epsilon \le \sqrt{n/\mathrm{nnz}(A)}$. Our results hold for $x, y$ in the simplex (matrix games, linear programming) and for $x$ in an $\ell_2$ ball and $y$ in the simplex (perceptron / SVM, minimum enclosing ball). Our algorithm combines Nemirovski's "conceptual prox-method" and a novel reduced-variance gradient estimator based on "sampling from the difference" between the current iterate and a reference point.
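To illustrate the "sampling from the difference" idea mentioned above, here is a minimal sketch (not the paper's exact estimator) of a variance-reduced estimate of the $x$-gradient $A^\top y$ for the bilinear objective: the exact gradient at a reference point $y_0$ is precomputed, and only the correction $A^\top(y - y_0)$ is estimated by sampling a single row of $A$ with probability proportional to $|y_i - (y_0)_i|$. The function name and arguments are illustrative assumptions, not from the paper.

```python
import numpy as np

def vr_gradient_x(A, y, y0, grad_x0, rng):
    """Unbiased estimate of A^T y using one sampled row of A.

    grad_x0 = A.T @ y0 is assumed precomputed at the reference point y0.
    (Illustrative sketch only; not the authors' exact estimator.)
    """
    delta = y - y0                      # difference from the reference point
    norm = np.abs(delta).sum()
    if norm == 0.0:
        return grad_x0                  # iterate equals reference: no correction needed
    p = np.abs(delta) / norm            # sample index i with probability ~ |delta_i|
    i = rng.choice(len(delta), p=p)
    # Importance-weighted single-row correction; its expectation is A^T (y - y0),
    # and its variance shrinks as y approaches the reference point y0.
    return grad_x0 + A[i] * (delta[i] / p[i])
```

Because the correction term scales with $\|y - y_0\|_1$, the estimator's variance is small whenever the current iterate stays close to the reference point, which is what makes the anchored (reference-point) scheme cheaper than using exact gradients at every step.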
