Hardness of Learning Halfspaces with Massart Noise
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise. Specifically, given labeled examples from a distribution on $\mathbb{R}^d \times \{\pm 1\}$ such that the marginal distribution on $\mathbb{R}^d$ is arbitrary and the labels are generated by an unknown halfspace corrupted with Massart noise at rate $\eta < 1/2$, we want to compute a hypothesis with small misclassification error. Characterizing the efficient learnability of halfspaces in the Massart model has remained a longstanding open problem in learning theory. Recent work gave a polynomial-time learning algorithm for this problem with error $\eta + \epsilon$. This error upper bound can be far from the information-theoretically optimal bound of $\mathrm{OPT} + \epsilon$. More recent work showed that {\em exact learning}, i.e., achieving error $\mathrm{OPT} + \epsilon$, is hard in the Statistical Query (SQ) model. In this work, we show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by a polynomial-time SQ algorithm. In particular, our lower bound implies that no efficient SQ algorithm can approximate the optimal error within any polynomial factor.
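To make the noise model concrete, the following is a minimal sketch of sampling from a Massart-noisy halfspace: clean labels are $\mathrm{sign}(\langle w, x \rangle)$, and each label is flipped independently with a point-dependent probability $\eta(x) \le \eta$. The Gaussian marginal, the specific choice of $\eta(x)$, and all function names here are illustrative assumptions; in the model studied by the paper the marginal on $\mathbb{R}^d$ is arbitrary and $\eta(x)$ may be chosen adversarially subject only to the cap $\eta$.

```python
import numpy as np

rng = np.random.default_rng(0)

def massart_sample(n, d, w, eta=0.2):
    """Draw n labeled examples from the halfspace sign(<w, x>) with
    Massart noise at rate eta.

    Illustrative choices (not from the paper): the marginal on R^d is
    standard Gaussian, and the point-dependent flip probability is
    eta * |tanh(x_1)|, which is an arbitrary function bounded by eta.
    """
    X = rng.standard_normal((n, d))
    clean = np.sign(X @ w)
    clean[clean == 0] = 1.0  # break ties on the boundary
    # Point-dependent flip probability eta(x) in [0, eta].
    flip_prob = eta * np.abs(np.tanh(X[:, 0]))
    flips = rng.random(n) < flip_prob
    labels = np.where(flips, -clean, clean)
    return X, labels, clean

w = np.ones(5) / np.sqrt(5)
X, y, clean = massart_sample(10_000, 5, w, eta=0.2)
print((y != clean).mean())  # empirical flip rate, necessarily below eta
```

Note that because $\eta(x)$ can vary with $x$, the noise can concentrate near the decision boundary, which is precisely what makes achieving error $\mathrm{OPT} + \epsilon$ (rather than $\eta + \epsilon$) difficult.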