Change point detection in high dimensional data with U-statistics

18 July 2022

Abstract

We consider the problem of detecting distributional changes in a sequence of high dimensional data. Our approach combines two separate statistics stemming from $L_p$ norms whose behavior is similar under $H_0$ but potentially different under $H_A$, leading to a testing procedure that is flexible against a variety of alternatives. We establish the asymptotic distribution of our proposed test statistics separately in cases of weakly dependent and strongly dependent coordinates as $\min\{N,d\}\to\infty$, where $N$ denotes sample size and $d$ is the dimension, and establish consistency of testing and estimation procedures in high dimensions under one-change alternative settings. Computational studies in single and multiple change point scenarios demonstrate that our method can outperform other nonparametric approaches in the literature for certain alternatives in high dimensions. We illustrate our approach through an application to Twitter data concerning the mentions of U.S. Governors.
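As a rough illustration of the general idea of scanning for a single change point with a U-statistic built from $L_p$ distances, the following sketch may help. It is not the statistic proposed in the paper: the abstract does not specify the two statistics or how they are combined, so the energy-distance-style contrast, the weighting, and the names `lp_distance_matrix`, `scan_change_point`, and `min_seg` are all assumptions made for illustration only.

```python
import numpy as np


def lp_distance_matrix(X, p=2.0):
    """Pairwise L_p distances between rows of X (shape N x d).

    Distances are divided by d**(1/p) so they stay on a comparable
    scale as the dimension grows (an illustrative choice).
    """
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=2) ** (1.0 / p) / X.shape[1] ** (1.0 / p)


def scan_change_point(X, p=2.0, min_seg=5):
    """Return (k_hat, max_stat) for a single-change scan.

    For each candidate split k, observations 0..k-1 form the left
    segment and k..N-1 the right segment. The contrast compares the
    average between-segment distance with the average within-segment
    distances (U-statistic averages over pairs i != j).
    This is a hypothetical stand-in, not the paper's test statistic.
    """
    N = X.shape[0]
    D = lp_distance_matrix(X, p)
    best_k, best_stat = None, -np.inf
    for k in range(min_seg, N - min_seg + 1):
        m = N - k
        # The diagonal of D is zero, so dividing the block sum by
        # n * (n - 1) averages over distinct pairs only.
        within_left = D[:k, :k].sum() / (k * (k - 1))
        within_right = D[k:, k:].sum() / (m * (m - 1))
        between = D[:k, k:].mean()
        # Weight by segment proportions so splits near the edges,
        # which rest on few observations, are not favoured.
        stat = (k * m / N**2) * (2.0 * between - within_left - within_right)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 50))   # N = 60 observations, d = 50
    X[30:] += 1.0                   # mean shift after observation 30
    k_hat, stat = scan_change_point(X)
    print("estimated change point:", k_hat, "statistic:", round(stat, 4))
```

The sketch only locates a candidate change point. Turning the maximum into a test would require the limiting distributions established in the paper, or a permutation scheme, neither of which is reproduced here.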
