论文标题

种子二进制分割:快速和最佳变化点检测的一般方法

Seeded Binary Segmentation: A general methodology for fast and optimal change point detection

论文作者

Kovács, Solt, Li, Housen, Bühlmann, Peter, Munk, Axel

论文摘要

近年来,对于大规模变更点检测问题的有效算法的需求增加了。为此,我们提出了种子二进制分割,这是一种依赖背景间隔的确定性结构(称为种子间隔)的方法,其中搜索了单个变更点。根据种子间隔的候选者的最终选择可以以各种方式完成,适合当前的问题。因此,种子二进制分割很容易适应广泛的变化点检测问题,让这是单变量,多变量甚至高维。 我们考虑详细的平均设置上的单变量高斯变化。对于这种特定情况,我们表明二进制分割会导致与基本变化点无关的接近线性时间方法(即线性到对数因素)。此外,使用适当的选择方法,该方法被证明是最佳的渐近最小值。尽管计算更有效,但与最先进的程序相比,有限的样本估计性能仍然具有竞争力。此外,我们说明了具有逆协方差变化点检测问题的高维设置的方法,在这些方法中,我们的建议会导致巨大的计算增长,同时仍表现出良好的统计性能。

In recent years, there has been an increasing demand on efficient algorithms for large scale change point detection problems. To this end, we propose seeded binary segmentation, an approach relying on a deterministic construction of background intervals, called seeded intervals, in which single change points are searched. The final selection of change points based on the candidates from seeded intervals can be done in various ways, adapted to the problem at hand. Thus, seeded binary segmentation is easy to adapt to a wide range of change point detection problems, let that be univariate, multivariate or even high-dimensional. We consider the univariate Gaussian change in mean setup in detail. For this specific case we show that seeded binary segmentation leads to a near-linear time approach (i.e. linear up to a logarithmic factor) independent of the underlying number of change points. Furthermore, using appropriate selection methods, the methodology is shown to be asymptotically minimax optimal. While computationally more efficient, the finite sample estimation performance remains competitive compared to state of the art procedures. Moreover, we illustrate the methodology for high-dimensional settings with an inverse covariance change point detection problem where our proposal leads to massive computational gains while still exhibiting good statistical performance.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源