Sketch-based Randomized Algorithms for Dynamic Graph Regression

Applied Mathematics and Computation (Appl. Math. Comput.), 2019

28 May 2019

Abstract

A well-known problem in data science and machine learning is {\em linear regression}, which is recently extended to dynamic graphs. Existing exact algorithms for updating the solution of dynamic graph regression problem require at least a linear time (in terms of $n$ : the size of the graph). However, this time complexity might be intractable in practice. In the current paper, we utilize {\em subsampled randomized Hadamard transform} and \textsf{CountSketch} to propose the first randomized algorithms. Suppose that we are given an $n\times m$ matrix embedding $M$ of the graph, where $m \ll n$ . Let $r$ be the number of samples required for a guaranteed approximation error, which is a sublinear function of $n$ . Our first algorithm reduces time complexity of pre-processing to $O(n(m + 1) + 2n(m + 1) \log_2(r + 1) + rm^2)$ . Then after an edge insertion or an edge deletion, it updates the approximate solution in $O(rm)$ time. Our second algorithm reduces time complexity of pre-processing to $O \left( nnz(M) + m^3 \epsilon^{-2} \log^7(m/\epsilon) \right)$ , where $nnz(M)$ is the number of nonzero elements of $M$ . Then after an edge insertion or an edge deletion or a node insertion or a node deletion, it updates the approximate solution in $O(qm)$ time, with $q=O\left(\frac{m^2}{\epsilon^2} \log^6(m/\epsilon) \right)$ . Finally, we show that under some assumptions, if $\ln n < \epsilon^{-1}$ our first algorithm outperforms our second algorithm and if $\ln n \geq \epsilon^{-1}$ our second algorithm outperforms our first algorithm.

View on arXiv

Comments on this paper