Semi-supervised Classification with Anomaly Rejection

International Conference on Artificial Intelligence and Statistics (AISTATS), 2013
Clayton Scott
Abstract

In standard multiclass classification, the learner is presented with examples from several classes, and produces a classifier that will classify test data drawn from those same classes. In many situations, however, the test data also consist of examples drawn from a novel class that was not observed during training. In such cases, it is desirable that the classifier have the option of rejecting anomalous examples as not belonging to any of the training classes. We show that in a semi-supervised setting, it is possible to achieve optimal performance in the sense of consistency. Our approach hinges on a method for estimating the proportions of the training and novel classes in the test data. Unlike previous methods for semi-supervised class proportion estimation, the method we adopt is able to consistently estimate class proportions in the test data despite lacking training examples for the novel class. The method is demonstrated on several benchmark data sets.
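To make the setting concrete, here is a minimal, self-contained sketch of classification with a reject option on 1-D Gaussian toy data. It is not the paper's estimator: the class-proportion step is replaced by a crude density-threshold stand-in, and all means, variances, and thresholds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Labeled training data: two Gaussian classes (means -2 and +2, unit variance).
X0 = rng.normal(-2.0, 1.0, 200)
X1 = rng.normal(+2.0, 1.0, 200)

# Unlabeled test data: a mixture of the two training classes plus a
# novel class at mean +8 that was never observed during training.
test = np.concatenate([
    rng.normal(-2.0, 1.0, 100),
    rng.normal(+2.0, 1.0, 100),
    rng.normal(+8.0, 1.0, 50),   # novel / anomalous class
])

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Fit a Gaussian to each training class.
mus = np.array([X0.mean(), X1.mean()])
sigmas = np.array([X0.std(), X1.std()])

# Likelihood of each test point under each training class.
lik = np.stack([gauss_pdf(test, m, s) for m, s in zip(mus, sigmas)], axis=1)
max_lik = lik.max(axis=1)

# Crude stand-in for the paper's class-proportion estimator: take the
# novel-class proportion to be the fraction of test points whose best
# training-class likelihood falls below a small density threshold.
novel_prop = np.mean(max_lik < 1e-3)

# Reject that fraction of test points with the lowest training-class
# likelihood; classify the remainder among the known classes.
threshold = np.quantile(max_lik, novel_prop)
reject = max_lik <= threshold
labels = np.where(reject, -1, lik.argmax(axis=1))  # -1 marks "anomaly"
```

On this toy data, nearly all novel-class points are rejected while the training-class points are assigned to the correct known class; the paper's contribution is a proportion estimator that remains consistent in this setting despite having no labeled examples of the novel class.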
