Private Set Intersection: A Multi-Message Symmetric Private Information Retrieval Perspective

We study the problem of private set intersection (PSI). In this problem, there are two entities $E_i$, for $i = 1, 2$, each storing a set $\mathcal{P}_i$, whose elements are picked from a finite field $\mathbb{F}_K$, on $N_i$ replicated and non-colluding databases. It is required to determine the set intersection $\mathcal{P}_1 \cap \mathcal{P}_2$ without leaking any information about the remaining elements to the other entity, with the least amount of downloaded bits. We first show that the PSI problem can be recast as a multi-message symmetric private information retrieval (MM-SPIR) problem. Next, as a stand-alone result, we derive the information-theoretic sum capacity of MM-SPIR, $C_{MM-SPIR}$. We show that with $K$ messages, $N$ databases, and the size of the desired message set $P$, the exact capacity of MM-SPIR is $C_{MM-SPIR} = 1 - \frac{1}{N}$ when $P \leq K - 1$, provided that the entropy of the common randomness $S$ satisfies $H(S) \geq \frac{P}{N-1}$ per desired symbol. This result implies that there is no gain for MM-SPIR over successive single-message SPIR (SM-SPIR). For the MM-SPIR problem, we present a novel capacity-achieving scheme that builds on the near-optimal scheme of Banawan-Ulukus originally proposed for the multi-message PIR (MM-PIR) problem without database privacy constraints. Surprisingly, our scheme here is exactly optimal for the MM-SPIR problem for any $P$, in contrast to the scheme for the MM-PIR problem, which was proved only to be near-optimal. Our scheme is an alternative to the SM-SPIR scheme of Sun-Jafar. Based on this capacity result for MM-SPIR, and after addressing the added requirements in its conversion to the PSI problem, we show that the optimal download cost for the PSI problem is $\min\left\{\left\lceil \frac{P_1 N_2}{N_2 - 1} \right\rceil, \left\lceil \frac{P_2 N_1}{N_1 - 1} \right\rceil\right\}$, where $P_i$ is the cardinality of set $\mathcal{P}_i$.
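As a rough numerical illustration of the two closed-form results, the sketch below assumes the MM-SPIR capacity is $1 - 1/N$ and the PSI download cost is $\min\{\lceil P_1 N_2/(N_2-1) \rceil, \lceil P_2 N_1/(N_1-1) \rceil\}$, where $P_i$ is the cardinality of entity $i$'s set and $N_i$ is its number of databases. The function and parameter names are hypothetical and chosen only for this example; the code evaluates the formulas and does not implement any retrieval scheme.

```python
from fractions import Fraction
from math import ceil


def mm_spir_capacity(n_databases: int) -> Fraction:
    """Assumed MM-SPIR sum capacity 1 - 1/N (stated for P <= K - 1)."""
    if n_databases < 2:
        raise ValueError("at least two databases are required")
    return 1 - Fraction(1, n_databases)


def psi_download_cost(p1: int, n1: int, p2: int, n2: int) -> int:
    """Assumed PSI download cost: the cheaper of the two retrieval directions.

    p1, p2: set cardinalities of the two entities (illustrative names).
    n1, n2: number of replicated databases held by each entity.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each entity needs at least two databases")
    # Entity 1 queries entity 2's n2 databases for its p1 elements.
    cost_via_entity2 = ceil(Fraction(p1 * n2, n2 - 1))
    # Entity 2 queries entity 1's n1 databases for its p2 elements.
    cost_via_entity1 = ceil(Fraction(p2 * n1, n1 - 1))
    return min(cost_via_entity2, cost_via_entity1)


if __name__ == "__main__":
    print(mm_spir_capacity(4))            # capacity with N = 4 databases
    print(psi_download_cost(3, 2, 5, 4))  # P1 = 3, N1 = 2, P2 = 5, N2 = 4
```

Exact rational arithmetic via `Fraction` avoids floating-point rounding before the ceiling is taken. Under these assumptions, the direction with the smaller set and the larger number of databases on the answering side is the cheaper one.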