Making Intelligence: Ethics, IQ, and ML Benchmarks

Abstract

The ML community recognizes the importance of anticipating and mitigating the potential negative impacts of benchmark research. In this position paper, we argue that more attention must be paid to areas of ethical risk at the technical and scientific core of ML benchmarks. We identify overlooked structural similarities between human IQ benchmarks and ML benchmarks: both set standards for describing, evaluating, and comparing performance on tasks relevant to intelligence. Drawing on prior research on IQ benchmarks from feminist philosophy of science, we argue that values need to be considered when creating ML benchmarks and datasets, and that this choice cannot be avoided by attempting to create value-neutral benchmarks. Finally, we outline practical recommendations for benchmark research ethics and ethics review.
