Agnostic Sample Compression Schemes for Regression

Abstract

We obtain the first positive results for bounded sample compression in the agnostic regression setting with the $\ell_p$ loss, where $p \in [1,\infty]$. We construct a generic approximate sample compression scheme for real-valued function classes whose size is exponential in the fat-shattering dimension but independent of the sample size. Notably, for linear regression, we construct an approximate compression scheme of size linear in the dimension. Moreover, for the $\ell_1$ and $\ell_\infty$ losses, we can even exhibit an efficient exact sample compression scheme of size linear in the dimension. We further show that for every other $\ell_p$ loss, $p \in (1,\infty)$, there does not exist an exact agnostic compression scheme of bounded size. This refines and generalizes a negative result of David, Moran, and Yehudayoff for the $\ell_2$ loss. We close by posing general open questions: for agnostic regression with the $\ell_1$ loss, does every function class admit an exact compression scheme of size equal to its pseudo-dimension? For the $\ell_2$ loss, does every function class admit an approximate compression scheme of size polynomial in the fat-shattering dimension? These questions generalize Warmuth's classic sample compression conjecture for realizable-case classification.
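
For orientation, here is a brief sketch of the notions the abstract refers to, following the standard formulation in the sample compression literature (the paper's precise definitions may differ in details such as normalization or side information). Given a sample $S = ((x_1, y_1), \dots, (x_m, y_m))$, the empirical $\ell_p$ loss of a predictor $f$ is
$$
L_p(f, S) = \Big(\tfrac{1}{m} \sum_{i=1}^{m} |f(x_i) - y_i|^p\Big)^{1/p}, \qquad L_\infty(f, S) = \max_{1 \le i \le m} |f(x_i) - y_i| .
$$
A sample compression scheme of size $k$ for a class $\mathcal{F}$ is a pair $(\kappa, \rho)$, where the compression map $\kappa$ selects from $S$ a subsequence of length at most $k$ and the reconstruction map $\rho$ turns any such subsequence into a predictor. The scheme is an exact agnostic scheme if, for every sample $S$,
$$
L_p\big(\rho(\kappa(S)), S\big) \;\le\; \inf_{f \in \mathcal{F}} L_p(f, S),
$$
and an $\varepsilon$-approximate scheme if the right-hand side is relaxed to $\inf_{f \in \mathcal{F}} L_p(f, S) + \varepsilon$. "Bounded size" means $k$ depends only on the class (e.g., on its dimension), not on the sample size $m$.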
