On Top- Selection from -wise Partial Rankings via Borda Counting

We analyze the performance of the Borda counting algorithm in a non-parametric model. The algorithm needs to utilize probabilistic rankings of the items within -sized subsets to accurately determine which items are the overall top- items in a total of items. The Borda counting algorithm simply counts the cumulative scores for each item from these partial ranking observations. This generalizes a previous work of a similar nature by Shah et al. using probabilistic pairwise comparison data. The performance of the Borda counting algorithm critically depends on the associated score separation between the -th item and the -th item. Specifically, we show that if is greater than certain value, then the top- items selected by the algorithm is asymptotically accurate almost surely; if is below certain value, then the result will be inaccurate with a constant probability. In the special case of , i.e., pairwise comparison, the resultant bound is tighter than that given by Shah et al., leading to a reduced gap between the error probability upper and lower bounds. These results are further extended to the approximate top- selection setting. Numerical experiments demonstrate the effectiveness and accuracy of the Borda counting algorithm, compared with the spectral MLE-based algorithm, particularly when the data does not necessarily follow an assumed parametric model.
View on arXiv