Local Correlation Clustering with Asymmetric Classification Errors

Abstract

In the Correlation Clustering problem, we are given a complete weighted graph $G$ with its edges labeled as "similar" and "dissimilar" by a noisy binary classifier. For a clustering $\mathcal{C}$ of graph $G$, a similar edge is in disagreement with $\mathcal{C}$, if its endpoints belong to distinct clusters; and a dissimilar edge is in disagreement with $\mathcal{C}$ if its endpoints belong to the same cluster. The disagreements vector, $\text{dis}$, is a vector indexed by the vertices of $G$ such that the $v$-th coordinate $\text{dis}_v$ equals the weight of all disagreeing edges incident on $v$. The goal is to produce a clustering that minimizes the $\ell_p$ norm of the disagreements vector for $p\geq 1$. We study the $\ell_p$ objective in Correlation Clustering under the following assumption: Every similar edge has weight in the range of $[\alpha\mathbf{w},\mathbf{w}]$ and every dissimilar edge has weight at least $\alpha\mathbf{w}$ (where $\alpha \leq 1$ and $\mathbf{w}>0$ is a scaling parameter). We give an $O\left((\frac{1}{\alpha})^{\frac{1}{2}-\frac{1}{2p}}\cdot \log\frac{1}{\alpha}\right)$ approximation algorithm for this problem. Furthermore, we show an almost matching convex programming integrality gap.
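
To make the objective concrete, the following minimal Python sketch evaluates the disagreements vector and its ℓ_p norm for a fixed clustering. It only computes the cost of a given clustering; it is not the approximation algorithm from the paper. The function names (`disagreements_vector`, `lp_norm`), the edge encoding (a dict from vertex pairs to a label and a weight), and the four-vertex instance are assumptions made for illustration. The example weights are chosen to satisfy the stated assumption with α = 0.5 and w = 1.

```python
def disagreements_vector(n, edges, clusters):
    """Return dis, where dis[v] is the total weight of disagreeing edges at v.

    n        -- number of vertices, labeled 0..n-1
    edges    -- dict mapping (u, v) to (label, weight); label is "+" for
                similar and "-" for dissimilar (hypothetical encoding)
    clusters -- clusters[v] is the cluster id assigned to vertex v
    """
    dis = [0.0] * n
    for (u, v), (label, weight) in edges.items():
        same_cluster = clusters[u] == clusters[v]
        # A similar edge disagrees when its endpoints are split;
        # a dissimilar edge disagrees when its endpoints are together.
        if (label == "+" and not same_cluster) or (label == "-" and same_cluster):
            dis[u] += weight
            dis[v] += weight
    return dis


def lp_norm(vec, p):
    """Return the l_p norm of vec for p >= 1; p = float('inf') gives the max."""
    if p == float("inf"):
        return max(abs(x) for x in vec)
    return sum(abs(x) ** p for x in vec) ** (1.0 / p)


# Illustrative complete graph on 4 vertices (alpha = 0.5, w = 1.0):
# similar weights lie in [0.5, 1.0], dissimilar weights are at least 0.5.
edges = {
    (0, 1): ("+", 1.0),
    (0, 2): ("+", 0.5),
    (0, 3): ("-", 1.0),
    (1, 2): ("+", 1.0),
    (1, 3): ("-", 0.5),
    (2, 3): ("+", 1.0),
}

# Clustering {0, 1, 2}, {3}: only the similar edge (2, 3) is cut.
dis = disagreements_vector(4, edges, [0, 0, 0, 1])
print(dis)              # [0.0, 0.0, 1.0, 1.0]
print(lp_norm(dis, 1))  # 2.0
```

In this sketch, p = 1 recovers twice the total weight of disagreeing edges, since each such edge is counted at both endpoints, while p = ∞ measures the worst disagreement at any single vertex.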
