Importance Weight Aware Gradient Updates

Abstract

An importance weight quantifies the relative importance of one example over another, coming up in applications of boosting, asymmetric classification costs, and active learning. The standard approach for dealing with importance weights in gradient descent is via multiplication of the gradient. This approach has obvious problems when importance weights are large. We develop an alternate approach based on an invariance property: that updating twice with importance weight h is equivalent to updating once with importance weight 2h. For many important losses this has a closed-form update which satisfies standard regret guarantees when all examples have h = 1. Empirically, importance weight invariant updates yield better learned hypotheses and reduce the sensitivity of the algorithm to the exact setting of the learning rate, even for datasets where all importance weights are equal to one.
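
To make the invariance concrete, the sketch below works out the squared-loss case with a linear predictor: solving the gradient flow in the importance weight gives an exponentially saturating scaling factor, so two updates with weight h compose into one update with weight 2h, and the update never overshoots the label no matter how large h is. The function name, parameters, and NumPy usage here are illustrative assumptions, not code from the paper.

```python
import numpy as np

def iw_aware_update_squared_loss(w, x, y, h, eta):
    """Importance weight aware update for squared loss 0.5*(w.x - y)^2
    with a linear predictor. 'eta' is the base learning rate and 'h'
    the importance weight; both names are illustrative."""
    p = w @ x          # current prediction on this example
    xx = x @ x         # squared norm of the feature vector
    if xx == 0.0:
        return w       # all-zero example: nothing to update
    # Saturating scale from integrating the gradient flow in h;
    # it reduces to the usual eta*h*(p - y) step as eta*h*xx -> 0.
    scale = (p - y) * (1.0 - np.exp(-eta * h * xx)) / xx
    return w - scale * x
```

A quick check of the invariance: applying the update twice with weight h leaves a residual of (p - y) * exp(-eta*h*xx) on the prediction after the first step, and composing the two steps yields the factor (1 - exp(-2*eta*h*xx)), exactly the single update with weight 2h. By contrast, the naive rule w - eta*h*(p - y)*x overshoots the label badly once eta*h*xx exceeds 1.
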
