
Conditional Sparse ℓ_p-norm Regression With Optimal Probability

Abstract

We consider the following conditional linear regression problem: the task is to identify both (i) a k-DNF condition c and (ii) a linear rule f such that the probability of c is (approximately) at least some given bound μ, and f minimizes the ℓ_p loss of predicting the target z in the distribution of examples conditioned on c. Thus, the task is to identify a portion of the distribution on which a linear rule can provide a good fit. Algorithms for this task are useful in cases where simple, learnable rules only accurately model portions of the distribution. The prior state-of-the-art for such algorithms could only guarantee finding a condition of probability Ω(μ/n^k) when a condition of probability μ exists, and achieved an O(n^k)-approximation to the target loss, where n is the number of Boolean attributes. Here, we give efficient algorithms for solving this task with a condition c that nearly matches the probability of the ideal condition, while also improving the approximation to the target loss. We also give an algorithm for finding a k-DNF reference class for prediction at a given query point, that obtains a sparse regression fit that has loss within O(n^k) of optimal among all sparse regression parameters and sufficiently large k-DNF reference classes containing the query point.
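To make the problem statement concrete, here is a minimal brute-force sketch of the search the abstract describes, not the paper's efficient algorithm. It simplifies in two ways: it uses the ℓ_2 loss (p = 2), and it enumerates only single-term k-DNF conditions (conjunctions of at most k literals) rather than general k-DNFs. All function and variable names are hypothetical.

```python
import itertools
import numpy as np

def conditional_regression_brute_force(X_bool, X_real, z, mu, k=1):
    """Brute-force conditional linear regression (illustrative only).

    Enumerates conjunctions of k literals over the Boolean attributes
    as candidate conditions c. For each condition whose empirical
    probability is at least mu, fits a least-squares linear rule f on
    the conditioned examples, and returns the (loss, condition, weights)
    triple with the smallest conditional mean-squared loss.
    """
    m, n = X_bool.shape
    # A literal is (attribute index, polarity).
    literals = [(i, b) for i in range(n) for b in (True, False)]
    best = None
    for term in itertools.combinations(literals, k):
        # Examples satisfying the conjunction of the term's literals.
        mask = np.ones(m, dtype=bool)
        for i, b in term:
            mask &= (X_bool[:, i] == b)
        if mask.mean() < mu:
            continue  # condition not probable enough
        # Least-squares fit (p = 2) on the conditioned examples,
        # with an intercept column appended.
        A = np.hstack([X_real[mask], np.ones((int(mask.sum()), 1))])
        w, *_ = np.linalg.lstsq(A, z[mask], rcond=None)
        loss = np.mean((A @ w - z[mask]) ** 2)
        if best is None or loss < best[0]:
            best = (loss, term, w)
    return best
```

The exhaustive search over all O(n^k) terms is what the paper's algorithms avoid; this sketch only illustrates the objective being optimized: among sufficiently probable conditions, pick the one on which a linear rule fits best.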
