Harmful Overfitting in Sobolev Spaces

Kedar Karhadkar
Alexander Sietsema
Deanna Needell
Guido Montúfar
Main: 6 pages
Bibliography: 3 pages
Appendix: 17 pages
Abstract

Motivated by recent work on benign overfitting in overparameterized machine learning, we study the generalization behavior of functions in Sobolev spaces $W^{k,p}(\mathbb{R}^d)$ that perfectly fit a noisy training data set. Under assumptions of label noise and sufficient regularity of the data distribution, we show that approximately norm-minimizing interpolators, the canonical solutions selected by a smoothness bias, exhibit harmful overfitting: even as the training sample size $n \to \infty$, the generalization error remains bounded below by a positive constant with high probability. Our results hold for arbitrary $p \in [1, \infty)$, in contrast to prior results studying the Hilbert space case ($p = 2$) using kernel methods. Our proof uses a geometric argument that identifies harmful neighborhoods of the training data using Sobolev inequalities.
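For readers less familiar with the setting, the following is a minimal LaTeX sketch of the standard Sobolev norm and of the shape of the lower bound stated in the abstract; the notation $\hat{f}_n$, $\mathrm{Err}$, and $c$ is ours for illustration and is not taken from the paper.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Standard Sobolev norm on W^{k,p}(\mathbb{R}^d) for 1 <= p < infinity,
% where D^\alpha ranges over weak derivatives of order |\alpha| <= k:
\[
\|f\|_{W^{k,p}(\mathbb{R}^d)}
  = \Bigl( \sum_{|\alpha| \le k} \|D^{\alpha} f\|_{L^p(\mathbb{R}^d)}^{p} \Bigr)^{1/p}.
\]
% An interpolator \hat{f}_n of the noisy training set \{(x_i, y_i)\}_{i=1}^n
% satisfies \hat{f}_n(x_i) = y_i for all i; it is approximately
% norm-minimizing if its W^{k,p} norm is within a constant factor of the
% smallest norm among all interpolators. The abstract's claim then reads:
% with high probability over the sample,
\[
\operatorname{Err}(\hat{f}_n) \;\ge\; c \;>\; 0
  \qquad \text{for all sufficiently large } n,
\]
% where Err denotes the generalization error and c does not depend on n.
\end{document}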
