
Are There Exceptions to Goodhart's Law? On the Moral Justification of Fairness-Aware Machine Learning

Abstract

Fairness-aware machine learning (fair-ml) techniques are algorithmic interventions designed to ensure that individuals who are affected by the predictions of a machine learning model are treated fairly. The problem is often posed as an optimization problem, where the objective is to achieve high predictive performance under a quantitative fairness constraint. However, any attempt to design a fair-ml algorithm must assume a world where Goodhart's law has an exception: when a fairness measure becomes an optimization constraint, it does not cease to be a good measure. In this paper, we argue that fairness measures are particularly sensitive to Goodhart's law. Our main contributions are as follows. First, we present a framework for moral reasoning about the justification of fairness metrics. In contrast to existing work, our framework incorporates the belief that whether a distribution of outcomes is fair depends not only on the cause of inequalities but also on what moral claims decision subjects have to receive a particular benefit or avoid a burden. We use the framework to distil the moral and empirical assumptions under which particular fairness metrics correspond to a fair distribution of outcomes. Second, we explore the extent to which employing fairness metrics as a constraint in a fair-ml algorithm is morally justifiable, using the fair-ml algorithm introduced by Hardt et al. (2016) as an example. We illustrate that enforcing a fairness metric through a fair-ml algorithm often does not result in the fair distribution of outcomes that motivated its use and can even harm the individuals the intervention was intended to protect.
