Global Multiclass Classification from Heterogeneous Local Models

Abstract

Multiclass classification problems are most often solved by either training a single centralized classifier on all K classes, or by reducing the problem to multiple binary classification tasks. This paper explores the uncharted region between these two extremes: How can we solve the K-class classification problem by combining the predictions of smaller classifiers, each trained on an arbitrary number of classes R ∈ {2, 3, ..., K}? We present a mathematical framework for answering this question, and derive bounds on the number of classifiers (in terms of K and R) needed to accurately predict the true class of an unlabeled sample under both adversarial and stochastic assumptions. By exploiting a connection to the classical set cover problem in combinatorics, we produce an efficient, near-optimal scheme (with respect to the number of classifiers) for designing such configurations of classifiers, which recovers the well-known one-vs.-one strategy as a special case when R = 2. Experiments with the MNIST and CIFAR-10 datasets show that our scheme is capable of matching the performance of centralized classifiers in practice. The results suggest that our approach offers a promising direction for solving the problem of data heterogeneity which plagues current federated learning methods.
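To make the set-cover connection concrete, the sketch below (an illustration, not the paper's actual algorithm) greedily selects R-class subsets so that every pair of the K classes appears together in at least one classifier's training set, which guarantees some classifier can distinguish any two candidate classes. The function name and greedy heuristic are our own assumptions for illustration.

```python
from itertools import combinations

def greedy_classifier_cover(K, R):
    """Greedily choose R-class subsets so every unordered pair of the K
    classes is covered by at least one subset. Each subset corresponds to
    one R-class classifier; covering all pairs ensures any two classes
    are separated by some classifier in the configuration."""
    uncovered = set(combinations(range(K), 2))
    candidates = list(combinations(range(K), R))
    chosen = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered pairs.
        best = max(candidates,
                   key=lambda s: sum(p in uncovered for p in combinations(s, 2)))
        chosen.append(best)
        uncovered -= set(combinations(best, 2))
    return chosen

# With R = 2, the cover is exactly one-vs.-one: all C(K, 2) binary classifiers.
print(len(greedy_classifier_cover(4, 2)))  # 6, i.e. C(4, 2)
# With R = 3 and K = 6, far fewer classifiers than the 15 one-vs.-one models suffice.
print(len(greedy_classifier_cover(6, 3)))
```

The greedy rule is the standard logarithmic-approximation heuristic for set cover; the paper's near-optimal scheme may differ in its exact construction.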
