Change point detection in high dimensional data with U-statistics

Abstract

We consider the problem of detecting distributional changes in a sequence of high dimensional data. Our approach combines two separate statistics stemming from $L_p$ norms whose behavior is similar under $H_0$ but potentially different under $H_A$, leading to a testing procedure that is flexible against a variety of alternatives. We establish the asymptotic distribution of our proposed test statistics separately in cases of weakly dependent and strongly dependent coordinates as $\min\{N,d\}\to\infty$, where $N$ denotes sample size and $d$ is the dimension, and establish consistency of testing and estimation procedures in high dimensions under one-change alternative settings. Computational studies in single and multiple change point scenarios demonstrate that our method can outperform other nonparametric approaches in the literature for certain alternatives in high dimensions. We illustrate our approach through an application to Twitter data concerning the mentions of U.S. Governors.
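
The general idea of locating a single change with a statistic built from $L_p$-norm distances can be illustrated with a minimal sketch. This is not the test statistic proposed in the paper: it assumes a generic energy-distance-type contrast between the two segments at each candidate split, and the function names `pairwise_lp` and `scan_change_point`, the weighting $k(N-k)/N^2$, and the `min_seg` trimming parameter are all hypothetical choices made for illustration. It returns only a point estimate and provides no null distribution or critical value.

```python
import numpy as np


def pairwise_lp(X, p=2.0):
    """N x N matrix of L_p distances between the rows of X (shape N x d)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=2) ** (1.0 / p)


def scan_change_point(X, p=2.0, min_seg=5):
    """Scan candidate splits k and return (best_k, best_stat).

    Illustrative assumption: at each split, compare the mean
    between-segment distance with the mean within-segment distances
    (an energy-distance-type contrast), weighted by k(N-k)/N^2.
    Requires min_seg >= 2 so both within-segment means are defined.
    """
    N = X.shape[0]
    D = pairwise_lp(X, p)
    best_k, best_stat = None, -np.inf
    for k in range(min_seg, N - min_seg + 1):
        # Diagonal entries are zero, so divide by the off-diagonal count.
        within_left = D[:k, :k].sum() / (k * (k - 1))
        within_right = D[k:, k:].sum() / ((N - k) * (N - k - 1))
        cross = D[:k, k:].mean()
        stat = (k * (N - k) / N ** 2) * (2 * cross - within_left - within_right)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 50))   # N = 60 observations, d = 50 coordinates
    X[30:] += 1.0                   # mean shift after observation 30
    k_hat, stat = scan_change_point(X, p=2.0)
    print("estimated change point:", k_hat)
```

Here `best_k` is the number of observations assigned to the first segment. Turning such a scan into a test would require the limiting distribution of the statistic, which is what the paper derives for its own statistics.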
