
Structure Learning of H-colorings

Abstract

We study the structure learning problem for $H$-colorings, an important class of Markov random fields that capture key combinatorial structures on graphs, including proper colorings and independent sets, as well as spin systems from statistical physics. The learning problem is as follows: for a fixed (and known) constraint graph $H$ with $q$ colors and an unknown graph $G=(V,E)$ with $n$ vertices, given uniformly random $H$-colorings of $G$, how many samples are required to learn the edges of the unknown graph $G$? We give a characterization of $H$ for which the problem is identifiable for every $G$, i.e., we can learn $G$ with an infinite number of samples. We also show that there are identifiable constraint graphs for which one cannot hope to learn every graph $G$ efficiently. We focus particular attention on the case of proper vertex $q$-colorings of graphs of maximum degree $d$, where intriguing connections to statistical physics phase transitions appear. We prove that in the tree uniqueness region (when $q>d$) the problem is identifiable and we can learn $G$ in ${\rm poly}(d,q)\times O(n^2\log{n})$ time. In contrast, for soft-constraint systems such as the Ising model, the best possible running time is exponential in $d$. In the tree non-uniqueness region (when $q\leq d$) we prove that the problem is not identifiable and thus $G$ cannot be learned. Moreover, when $q<d-\sqrt{d}+\Theta(1)$ we prove that even learning an equivalent graph (any graph with the same set of $H$-colorings) is computationally hard: the sample complexity is exponential in $n$ in the worst case. We further explore the connection between the efficiency/hardness of the structure learning problem and the uniqueness/non-uniqueness phase transition for general $H$-colorings and prove that under the well-known Dobrushin uniqueness condition, we can learn $G$ in ${\rm poly}(d,q)\times O(n^2\log{n})$ time.
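To make the problem setup concrete, the following Python sketch illustrates the naive pairwise test for proper $q$-colorings. It is an illustration of the identifiability question only, not the algorithm analyzed in the paper; the function name learn_edges and the toy samples are hypothetical. For proper colorings, the endpoints of an edge never receive the same color, so any pair of vertices observed with equal colors in some sample cannot be an edge; in an identifiable regime, enough samples rule out every non-edge and only the true edges remain.

from itertools import combinations

def learn_edges(samples, n):
    """Return the vertex pairs that never share a color across the samples.

    samples: list of colorings; each coloring is a list of length n giving the
    color of every vertex. Pairs observed with equal colors are ruled out as edges.
    """
    candidates = set(combinations(range(n), 2))
    for coloring in samples:
        # Keep only pairs whose endpoints have distinct colors in this sample.
        candidates = {(u, v) for (u, v) in candidates if coloring[u] != coloring[v]}
    return candidates

# Illustrative run on the path 0-1-2 with q = 3 colors (the samples are made up):
samples = [[0, 1, 0], [2, 0, 1], [1, 2, 1], [0, 2, 0]]
print(learn_edges(samples, 3))  # -> {(0, 1), (1, 2)}; the pair (0, 2) shared a color

Such a test costs $O(n^2)$ per sample; the abstract's ${\rm poly}(d,q)\times O(n^2\log{n})$ bound is the total running time established in the tree uniqueness region, which in particular requires bounding how many samples suffice.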
