Optimal partition recovery in general graphs

We consider a graph-structured change point problem in which we observe a random vector with piecewise constant but unknown mean and whose independent, sub-Gaussian coordinates correspond to the nodes of a fixed graph. We are interested in the localisation task of recovering the partition of the nodes associated with the constancy regions of the mean vector. When the partition consists of only two elements, we characterise the difficulty of the localisation problem in terms of four key parameters: the maximal noise variance, the size of the smaller element of the partition, the magnitude of the difference in the signal values across contiguous elements of the partition, and the sum of the effective resistance edge weights of the corresponding cut, a graph-theoretic quantity quantifying the size of the partition boundary. In particular, we demonstrate an information-theoretic lower bound implying that, in the low signal-to-noise ratio regime, no consistent estimator of the true partition exists. On the other hand, when the signal-to-noise ratio exceeds a threshold that involves the sum of effective resistance weighted edges and an arbitrary diverging sequence, we show that a polynomial-time, approximate penalised least squares estimator delivers a localisation error, measured by the symmetric difference between the true and estimated partition, whose rate is minimax optimal aside from one additional term. Finally, we provide discussions on the localisation error for more general partitions of unknown sizes.
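The boundary quantity mentioned in the abstract, the sum of effective resistance edge weights over a cut, can be illustrated with a minimal sketch. It assumes the standard definition of effective resistance on a connected, unweighted graph (unit resistors on every edge); the abstract itself gives no formula, and the function names `effective_resistance` and `cut_resistance_weight` are hypothetical, chosen only for this illustration.

```python
def effective_resistance(n, edges, u, v):
    """Effective resistance between nodes u and v of a connected,
    unweighted graph on nodes 0..n-1 (assumption: unit edge resistances)."""
    if u == v:
        return 0.0
    # Build the graph Laplacian L = D - A.
    lap = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        lap[a][a] += 1.0
        lap[b][b] += 1.0
        lap[a][b] -= 1.0
        lap[b][a] -= 1.0
    # Ground node v (drop its row and column) and solve L_red x = e_u.
    # The potential at u is then the effective resistance between u and v.
    keep = [i for i in range(n) if i != v]
    m = len(keep)
    aug = [[lap[i][j] for j in keep] + [1.0 if i == u else 0.0] for i in keep]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, m + 1):
                    aug[r][c] -= factor * aug[col][c]
    potentials = {keep[i]: aug[i][m] / aug[i][i] for i in range(m)}
    return potentials[u]


def cut_resistance_weight(n, edges, block):
    """Sum of effective resistances over the edges that cross the cut
    between `block` and its complement."""
    block = set(block)
    return sum(
        effective_resistance(n, edges, a, b)
        for a, b in edges
        if (a in block) != (b in block)
    )


# A path on 4 nodes and a cycle on 4 nodes, both split into {0, 1} and {2, 3}.
path_edges = [(0, 1), (1, 2), (2, 3)]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
path_weight = cut_resistance_weight(4, path_edges, {0, 1})
cycle_weight = cut_resistance_weight(4, cycle_edges, {0, 1})
```

On a tree every edge has effective resistance 1, so the path cut has weight 1. In the 4-cycle each edge has effective resistance 3/4 and the cut crosses two edges, giving 3/2. The same cut can therefore have a different boundary size depending on how well connected the graph is.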