
Learning Coverage Functions

Abstract

We study the problem of approximating and learning coverage functions. A function $c: 2^{[n]} \rightarrow \mathbb{R}^{+}$ is a coverage function if there exists a universe $U$ with non-negative weights $w(u)$ for each $u \in U$, and subsets $A_1, A_2, \ldots, A_n$ of $U$, such that $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$. Alternatively, coverage functions can be described as non-negative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that, for any $\gamma, \delta > 0$, given random and uniform examples of an unknown coverage function $c$, finds a function $h$ that approximates $c$ within factor $1+\gamma$ on all but a $\delta$-fraction of the points, in time $\mathrm{poly}(n, 1/\gamma, 1/\delta)$. This is the first fully polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey (2011). Our algorithm relies on first solving the simpler problem of learning coverage functions with low $\ell_1$-error. Our algorithms are based on several new structural properties of coverage functions; in particular, we prove that any coverage function can be $\epsilon$-approximated in $\ell_1$ by a coverage function that depends on only $O(1/\epsilon^2)$ variables. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time $n^{\tilde{O}(n^{1/3})}$ (Klivans and Servedio, 2004). As an application of our result, we give a simple polynomial-time differentially private algorithm for releasing monotone disjunction queries with low average error over the uniform distribution on disjunctions.
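The defining formula $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$ can be made concrete with a small sketch. The universe, sets, and weights below are hypothetical toy values chosen for illustration, not taken from the paper:

```python
# Toy sketch of a coverage function: c(S) is the total weight of the
# universe elements covered by the union of the sets A_i with i in S.
# All sets and weights here are hypothetical illustrative values.

def coverage(S, sets, weights):
    """Return c(S) = sum of w(u) over u in the union of A_i, i in S."""
    covered = set().union(*(sets[i] for i in S)) if S else set()
    return sum(weights[u] for u in covered)

# Hypothetical universe U = {0, 1, 2, 3} with weights w(u):
weights = {0: 1.0, 1: 2.0, 2: 0.5, 3: 3.0}
# Hypothetical subsets A_1, A_2, A_3 of U:
sets = {1: {0, 1}, 2: {1, 2}, 3: {3}}

print(coverage({1, 2}, sets, weights))     # union {0,1,2} -> 3.5
print(coverage({1, 2, 3}, sets, weights))  # union {0,1,2,3} -> 6.5
```

Note that element 1 appears in both $A_1$ and $A_2$ but is counted only once, which is the source of the diminishing-returns (submodularity) property the abstract mentions.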
