A Minimax Approach to Supervised Learning

Neural Information Processing Systems (NeurIPS), 2016
Abstract

Given a task of predicting $Y$ from $X$, a loss function $L$, and a set of probability distributions $\Gamma$ on $(X,Y)$, what is the optimal decision rule minimizing the worst-case expected loss over $\Gamma$? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on $X$ constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models, which connects the minimax problem for each loss function to a generalized linear model. While in some cases, such as the quadratic and logarithmic loss functions, we revisit well-known linear and logistic regression models, our approach reveals novel models for other loss functions. In particular, for the 0-1 loss we derive a classification approach which we call the minimax SVM. The minimax SVM minimizes the worst-case expected 0-1 loss over the proposed $\Gamma$ by solving a tractable optimization problem. Moreover, applying the minimax approach to the Brier loss function we derive a new classification model called the minimax Brier. The maximum likelihood problem for this model uses the Huber penalty function. We perform several numerical experiments to show the power of the minimax SVM and the minimax Brier.
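The worst-case formulation posed in the opening question can be written compactly as a minimax problem; here $\psi$ denotes a generic decision rule mapping $X$ to a prediction (the symbol $\psi$ is introduced for illustration and is not fixed by the abstract):

```latex
\min_{\psi} \; \max_{P \in \Gamma} \; \mathbb{E}_{P}\!\left[ L\big(Y, \psi(X)\big) \right]
```

For the 0-1 loss, $L(y, \hat{y}) = \mathbf{1}\{y \neq \hat{y}\}$, the inner maximum is the worst-case misclassification probability over $\Gamma$, which the abstract states the minimax SVM minimizes via a tractable optimization problem.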