Inferring Sparse Preference Lists From Partial Information

This paper pursues a new non-parametric approach to modeling distributions over the set of all permutations of objects. In particular, assuming the available data consist of the probabilities of events under an unknown distribution over permutations, we seek a distribution over permutations that has the sparsest support and is nearly consistent with the observed data. In other words, we seek a 'simple' explanation for the observed data. From a modeling perspective, our approach is natural and truly non-parametric. However, the computational task associated with finding the sparsest distribution consistent with observed data is daunting. One straightforward approach to the problem requires solving an integer program whose dimension is exponential in -- an essentially hopeless task. The present paper contributes a novel scheme for accomplishing this task. The scheme exploits a combinatorial characterization of the solutions to such problems to develop an integer program that produces an 'approximate' solution to the learning problem. This integer program has polynomial dimension in and and renders the approach quite practical. We characterize the quality of the approximations produced when the 'true' distribution arises from a Plackett-Luce model or from a natural exponential family.