Do Outliers Ruin Collaboration?

Abstract
We consider the problem of learning a binary classifier from $n$ different data sources, among which at most an $\eta$ fraction are adversarial. The overhead is defined as the ratio between the sample complexity of learning in this setting and that of learning the same hypothesis class on a single data distribution. We present an algorithm that achieves an $O(\eta n + \ln n)$ overhead, which is proved to be worst-case optimal. We also discuss the potential challenges to the design of a computationally efficient learning algorithm with a small overhead.