Online Importance Weight Aware Updates

Abstract

An importance weight quantifies the relative importance of one example over another, arising in applications such as boosting, asymmetric classification costs, reductions, and active learning. The standard approach for dealing with importance weights in gradient descent is to multiply the gradient by the importance weight. This approach has obvious problems when importance weights are large. We develop an alternative approach based on an invariance property: that updating twice with importance weight h is equivalent to updating once with importance weight 2h. For many important losses this has a closed-form update which satisfies standard regret guarantees when all examples have h=1. The derived approach shares several properties with implicit online learning. We therefore adapt implicit online learning to work with large importance weights as well. Empirically, both approaches yield substantially superior prediction with similar computational performance while reducing the sensitivity of the algorithm to the exact setting of the learning rate. We apply these to online active learning, yielding an extraordinarily fast active learning algorithm that works even in the presence of adversarial noise.
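
To make the invariance property concrete, the sketch below (illustrative, not taken from the paper's text) contrasts the standard multiplied-gradient step with an importance-weight-aware update for squared loss; the closed form used here is the squared-loss case, and the function names and example values are assumptions made for the demonstration.

```python
import numpy as np

def naive_update(w, x, y, h, eta):
    # Standard approach: multiply the squared-loss gradient by the
    # importance weight h. Overshoots badly when h * eta is large.
    return w - eta * h * (w @ x - y) * x

def weight_aware_update(w, x, y, h, eta):
    # Invariance-based update for squared loss: updating twice with
    # weight h gives exactly the same result as updating once with 2h,
    # and the prediction never overshoots the label y.
    xx = x @ x
    scale = (y - w @ x) / xx * (1.0 - np.exp(-h * eta * xx))
    return w + scale * x

w = np.zeros(3)
x = np.array([1.0, 2.0, 0.5])
y = 1.0
print(naive_update(w, x, y, h=100.0, eta=0.1))         # large overshoot
print(weight_aware_update(w, x, y, h=100.0, eta=0.1))  # prediction approaches y
```

One can verify the invariance directly: applying weight_aware_update twice with weight h leaves a residual of (y - w @ x) * exp(-2 * h * eta * xx), exactly what a single update with weight 2h leaves.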
