Limitations of Using Identical Distributions for Training and Testing When Learning Boolean Functions
Jordi Pérez-Guijarro
- OOD

Main: 6 pages
Bibliography: 2 pages
Appendix: 8 pages
Abstract
When the distributions of the training and test data do not coincide, the problem of understanding generalization becomes considerably more complex, prompting a variety of questions. In this work, we focus on a fundamental one: Is it always optimal for the training distribution to be identical to the test distribution? Surprisingly, assuming the existence of one-way functions, we find that the answer is no. That is, matching distributions is not always the best scenario, which contrasts with the behavior of most learning methods. Nonetheless, we also show that when certain regularities are imposed on the target functions, the standard conclusion is recovered in the case of the uniform distribution.
