Sharp Information-Theoretic Thresholds for Shuffled Linear Regression

This paper studies the problem of shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we consider the model $y = \Pi_* X \beta_* + w$, where $X$ is an $n \times d$ standard Gaussian design matrix, $w$ is Gaussian noise with entrywise variance $\sigma^2$, $\Pi_*$ is an unknown $n \times n$ permutation matrix, and $\beta_*$ is the regression coefficient, also unknown. Previous work has shown that, in the large-$n$ limit, the minimal signal-to-noise ratio ($\mathsf{SNR}$), $\lVert \beta_* \rVert^2 / \sigma^2$, for recovering the unknown permutation exactly with high probability is between $n^2$ and $n^C$ for some absolute constant $C$, and the sharp threshold is unknown even for $d = 1$. We show that this threshold is precisely $\mathsf{SNR} = n^4$ for exact recovery throughout the sublinear regime $d = o(n)$. As a by-product of our analysis, we also determine the sharp threshold of almost exact recovery, where all but a vanishing fraction of the permutation is reconstructed, to be $\mathsf{SNR} = n^2$.
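To make the setup concrete, the following is a minimal simulation sketch, not the paper's method or code. It assumes the model reads $y = \Pi X \beta + w$ with $\mathsf{SNR} = \lVert \beta \rVert^2 / \sigma^2$, and it normalizes $\lVert \beta \rVert = 1$ for convenience. The function names `simulate` and `mle_d1` are hypothetical. For $d = 1$ only, the least-squares fit over permutations reduces, by the rearrangement inequality, to comparing two candidates: pairing the sorted responses with the sorted predictors in the same order or in the reverse order.

```python
import numpy as np

def simulate(n, d, snr, rng):
    """Draw (X, y, perm) from y = Pi X beta + w with ||beta||^2 / sigma^2 = snr.

    Assumption for illustration: ||beta|| is normalized to 1, so sigma = snr**-0.5.
    """
    X = rng.standard_normal((n, d))
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)
    sigma = 1.0 / np.sqrt(snr)
    perm = rng.permutation(n)  # response i is paired with row perm[i] of X
    y = X[perm] @ beta + sigma * rng.standard_normal(n)
    return X, y, perm

def mle_d1(x, y):
    """Least-squares permutation estimate for d = 1.

    For a fixed pairing, the best-fit residual is ||y||^2 - <x_perm, y>^2 / ||x||^2,
    so the fit is optimized by maximizing |<x_perm, y>|. That maximum is attained
    by matching sorted y to sorted x, either in the same or the reverse order.
    """
    order_y = np.argsort(y)
    order_x = np.argsort(x)
    best_score, best_perm = -np.inf, None
    for candidate in (order_x, order_x[::-1]):
        perm_hat = np.empty(len(y), dtype=int)
        perm_hat[order_y] = candidate  # k-th smallest y gets k-th entry of candidate
        score = abs(x[perm_hat] @ y)
        if score > best_score:
            best_score, best_perm = score, perm_hat
    return best_perm

# Usage: fraction of correctly matched pairs for one draw at SNR = n^4.
rng = np.random.default_rng(0)
n = 200
X, y, perm = simulate(n=n, d=1, snr=float(n) ** 4, rng=rng)
print("fraction matched:", np.mean(mle_d1(X[:, 0], y) == perm))
```

Sweeping `snr` over powers of `n` and averaging over many draws gives a rough empirical picture of where exact and almost exact recovery begin to succeed. A single draw at finite `n` should not be read as confirming an asymptotic threshold.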