Universality laws for Gaussian mixtures in generalized linear models

Abstract

Let $(x_{i}, y_{i})_{i=1,\dots,n}$ denote independent samples from a general mixture distribution $\sum_{c\in\mathcal{C}}\rho_{c}P_{c}^{x}$, and consider the hypothesis class of generalized linear models $\hat{y} = F(\Theta^{\top}x)$. In this work, we investigate the asymptotic joint statistics of the family of generalized linear estimators $(\Theta_{1}, \dots, \Theta_{M})$ obtained either from (a) minimizing an empirical risk $\hat{R}_{n}(\Theta;X,y)$ or (b) sampling from the associated Gibbs measure $\exp(-\beta n \hat{R}_{n}(\Theta;X,y))$. Our main contribution is to characterize under which conditions the asymptotic joint statistics of this family depends (in a weak sense) only on the means and covariances of the class-conditional feature distributions $P_{c}^{x}$. In particular, this allows us to prove the universality of different quantities of interest, such as the training and generalization errors, vindicating a recent line of work in high-dimensional statistics working under the Gaussian mixture hypothesis. Finally, we discuss the applications of our results to different machine learning tasks of interest, such as ensembling and uncertainty quantification.
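The universality statement admits a quick numerical illustration. The following sketch is not from the paper; it assumes numpy and scikit-learn, a two-component mixture with means $\pm\mu$ and identity covariance, and regularized logistic regression as a stand-in for the empirical risk $\hat{R}_{n}$. It trains the same estimator on non-Gaussian (Rademacher-coordinate) features and on their Gaussian equivalent with matched means and covariances; in the proportional regime where $n$ and $d$ grow together, the training and test errors of the two runs should approximately coincide.

# Minimal sketch of the universality claim (illustrative, not the authors' code).
# Compares a generalized linear estimator trained on a mixture with
# non-Gaussian class-conditional features against the Gaussian mixture
# with the same means and covariances.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 4000, 400                     # proportional regime: n comparable to d
mu = np.full(d, 1.5 / np.sqrt(d))    # class means +/- mu, identity covariance

def sample(n, gaussian):
    # Labels y in {-1, +1}; features x = y * mu + z with z either Gaussian
    # or Rademacher per coordinate (both zero mean, identity covariance).
    y = rng.choice([-1, 1], size=n)
    z = (rng.standard_normal((n, d)) if gaussian
         else rng.choice([-1.0, 1.0], size=(n, d)))
    return y[:, None] * mu + z, y

for gaussian in (False, True):
    X, y = sample(n, gaussian)
    Xt, yt = sample(n, gaussian)
    clf = LogisticRegression(C=1.0, max_iter=1000).fit(X, y)
    print("Gaussian " if gaussian else "Rademacher",
          "train err:", round(1 - clf.score(X, y), 3),
          " test err:", round(1 - clf.score(Xt, yt), 3))

Under the paper's conditions, the two printed error pairs should agree up to finite-size fluctuations, since the class-conditional distributions share their first two moments.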
