We study the classic problem of correlation clustering in dynamic node streams. In this setting, nodes are either added or randomly deleted over time, and each node pair is connected by a positive or negative edge. The objective is to continuously find a partition which minimizes the sum of positive edges crossing clusters and negative edges within clusters. We present an algorithm that maintains an -approximation with (polylog ) amortized update time. Prior to our work, Behnezhad, Charikar, Ma, and L. Tan achieved a -approximation with expected update time in edge streams which translates in node streams to an -update time where is the maximum possible degree. Finally we complement our theoretical analysis with experiments on real world data.
View on arXiv