A Minimax Approach to Supervised Learning

Neural Information Processing Systems (NeurIPS), 2016
Abstract

Given a task of predicting $Y$ from $X$, a loss function $L$, and a set of probability distributions $\Gamma$ on $(X,Y)$, what is the optimal decision rule minimizing the worst-case expected loss over $\Gamma$? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on $X$ constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models, which connects the minimax problem for each loss function to a generalized linear model. While in some cases, such as the quadratic and logarithmic loss functions, we revisit well-known linear and logistic regression models, our approach reveals novel models for other loss functions. In particular, for the 0-1 loss we derive a classification approach which we call the minimax SVM. The minimax SVM minimizes the worst-case expected 0-1 loss over the proposed $\Gamma$ by solving a tractable optimization problem. Moreover, applying the minimax approach to the Brier loss function we derive a new classification model called the minimax Brier. The maximum likelihood problem for this model uses the Huber penalty function. We perform several numerical experiments to show the power of the minimax SVM and the minimax Brier.
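The worst-case formulation posed in the opening question can be written compactly as a minimax problem; here $\psi$ denotes a generic decision rule mapping $X$ to a prediction (the symbol $\psi$ is introduced for illustration and is not fixed by the abstract):

```latex
\min_{\psi} \; \max_{P \in \Gamma} \; \mathbb{E}_{P}\!\left[ L\big(Y, \psi(X)\big) \right]
```

For the 0-1 loss, $L(y, \hat{y}) = \mathbf{1}\{y \neq \hat{y}\}$, the inner maximum is the worst-case misclassification probability over $\Gamma$, which the abstract states the minimax SVM minimizes via a tractable optimization problem.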