Limitations of Using Identical Distributions for Training and Testing When Learning Boolean Functions

30 November 2025

Jordi Pérez-Guijarro

OOD

Main:6 Pages

Bibliography:2 Pages

Appendix:8 Pages

Abstract

When the distributions of the training and test data do not coincide, the problem of understanding generalization becomes considerably more complex, prompting a variety of questions. In this work, we focus on a fundamental one: Is it always optimal for the training distribution to be identical to the test distribution? Surprisingly, assuming the existence of one-way functions, we find that the answer is no. That is, matching distributions is not always the best scenario, which contrasts with the behavior of most learning methods. Nonetheless, we also show that when certain regularities are imposed on the target functions, the standard conclusion is recovered in the case of the uniform distribution.
