Do Outliers Ruin Collaboration?

Abstract

We consider the problem of learning a binary classifier from $n$ different data sources, among which at most an $\eta$ fraction are adversarial. The overhead is defined as the ratio between the sample complexity of learning in this setting and that of learning the same hypothesis class on a single data distribution. We present an algorithm that achieves an $O(\eta n + \ln n)$ overhead, which is proved to be worst-case optimal. We also discuss the potential challenges to the design of a computationally efficient learning algorithm with a small overhead.
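
For concreteness, the overhead can be written as a ratio of sample complexities. The notation below ($m_{\mathrm{collab}}$, $m_{\mathrm{single}}$, and the PAC parameters $\epsilon, \delta$) is ours, chosen to illustrate the definition stated in the abstract rather than quoted from the paper:

$$
\mathrm{overhead}(\mathcal{H}, n, \eta)
\;=\;
\frac{m_{\mathrm{collab}}(\mathcal{H}, n, \eta, \epsilon, \delta)}
     {m_{\mathrm{single}}(\mathcal{H}, \epsilon, \delta)},
$$

where $m_{\mathrm{collab}}$ is the number of samples needed to learn $\mathcal{H}$ across the $n$ sources with an $\eta$ fraction adversarial, and $m_{\mathrm{single}}$ is the standard PAC sample complexity on one distribution. Under this reading, the main result is that $\mathrm{overhead}(\mathcal{H}, n, \eta) = O(\eta n + \ln n)$, and no algorithm can do better in the worst case.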
