Multiscale Change-Point Inference

We introduce a new estimator, SMUCE (simultaneous multiscale change-point estimator), for the change-point problem in exponential family regression. An unknown step function is estimated by minimizing the number of change-points over the acceptance region of a multiscale test at level \alpha. The probability of overestimating the true number of change-points K is controlled by the asymptotic null distribution of the multiscale test statistic. Further, we derive exponential bounds for the probability of underestimating K. Balancing these two quantities allows \alpha to be chosen such that the probability of correctly estimating K is maximized. In the normal case, all results hold non-asymptotically. Based on the aforementioned bounds, we construct asymptotically honest confidence sets for the unknown step function and its change-points. At the same time, we obtain exponential bounds for estimating the change-point locations, which, for example, yield the minimax rate O(1/n) up to a log term. Finally, SMUCE asymptotically achieves the optimal detection rate of vanishing signals. We show how dynamic programming techniques can be employed for efficient computation of estimators and confidence regions. The performance of the proposed multiscale approach is illustrated by simulations and in two cutting-edge applications from genetic engineering and photoemission spectroscopy.
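
The idea of minimizing the number of change-points subject to a multiscale constraint can be sketched for the simplest setting. The code below is an illustration only, not the authors' algorithm. It assumes Gaussian observations with known standard deviation `sigma`, takes the threshold `q` as a user-supplied number instead of a quantile of the null distribution, and assumes a scale penalty of the form sqrt(2 log(e n / m)) for intervals of length m. The function name `greedy_multiscale_segments` is hypothetical. It uses a greedy left-to-right scan in place of dynamic programming, and it reports the midpoint of the admissible band as a placeholder level for each segment, not the likelihood-based choice.

```python
import math


def greedy_multiscale_segments(y, sigma, q):
    """Split y into the fewest constant pieces passing a multiscale check.

    Assumptions (not taken from the abstract): Gaussian noise with known
    sigma, user-chosen threshold q >= 0, penalty sqrt(2 log(e n / m)).
    Returns a list of (start, end, level) with end exclusive.
    """
    if sigma <= 0 or q < 0:
        raise ValueError("sigma must be positive and q non-negative")
    n = len(y)
    if n == 0:
        return []
    segments = []
    start, end = 0, 0
    lo, hi = -math.inf, math.inf  # band of constants accepted so far
    while end < n:
        new_lo, new_hi = lo, hi
        total = 0.0
        # Add the constraints of every interval [i, end] inside the segment.
        for i in range(end, start - 1, -1):
            total += y[i]
            m = end - i + 1
            penalty = math.sqrt(2.0 * math.log(math.e * n / m))
            # |mean - theta| * sqrt(m) / sigma - penalty <= q
            half = sigma * (q + penalty) / math.sqrt(m)
            mean = total / m
            new_lo = max(new_lo, mean - half)
            new_hi = min(new_hi, mean + half)
        if new_lo <= new_hi:
            # Some constant still fits all intervals: extend the segment.
            lo, hi = new_lo, new_hi
            end += 1
        else:
            # No constant fits: close the segment just before index end.
            segments.append((start, end, 0.5 * (lo + hi)))
            start = end
            lo, hi = -math.inf, math.inf
    segments.append((start, n, 0.5 * (lo + hi)))
    return segments
```

Because any sub-interval of an admissible segment is itself admissible with the same constant, extending each segment as far as possible yields a segmentation with the minimal number of pieces under this constraint. The scan costs O(n^2) interval checks in the worst case.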