Paper title

Nearly minimax robust estimator of the mean vector by iterative spectral dimension reduction

Authors

Amir-Hossein Bateni, Arshak Minasyan, Arnak S. Dalalyan

Abstract

We study the problem of robust estimation of the mean vector of a sub-Gaussian distribution. We introduce an estimator based on spectral dimension reduction (SDR) and establish a finite-sample upper bound on its error that is minimax-optimal up to a logarithmic factor. Furthermore, we prove that the breakdown point of the SDR estimator is equal to $1/2$, the highest possible value of the breakdown point. In addition, the SDR estimator is equivariant under similarity transformations and has low computational complexity. More precisely, given $n$ vectors of dimension $p$, at most $\varepsilon n$ of which are adversarially corrupted, the SDR estimator has a squared error of order $\big(\frac{r_\Sigma}{n} + \varepsilon^2\log(1/\varepsilon)\big)\log p$ and a running time of order $p^3 + n p^2$. Here, $r_\Sigma \le p$ is the effective rank of the covariance matrix of the reference distribution. Another advantage of the SDR estimator is that it requires no knowledge of the contamination rate and involves no sample splitting. We also investigate extensions of the proposed algorithm and of the obtained results to the case of a (partially) unknown covariance matrix.
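To get a feel for the stated rate, the short sketch below evaluates its order for a given spectrum. It is only an illustration, not the SDR estimator itself. The abstract does not define the effective rank, so the code assumes the common definition $r_\Sigma = \mathrm{tr}(\Sigma)/\|\Sigma\|_{\mathrm{op}}$; the function names and the example spectrum are hypothetical, and all constants in the bound are dropped.

```python
import math

def effective_rank(eigenvalues):
    """Effective rank tr(Sigma) / ||Sigma||_op, computed from the eigenvalues
    of Sigma (assumed definition; the abstract does not spell it out)."""
    top = max(eigenvalues)
    if top <= 0:
        raise ValueError("covariance must have a positive largest eigenvalue")
    return sum(eigenvalues) / top

def sdr_rate_order(r_sigma, n, eps, p):
    """Order of the squared-error bound with constants dropped:
    (r_Sigma / n + eps^2 * log(1/eps)) * log(p)."""
    # The contamination term tends to 0 as eps -> 0, so treat eps = 0 as 0.
    contamination = eps ** 2 * math.log(1.0 / eps) if eps > 0 else 0.0
    return (r_sigma / n + contamination) * math.log(p)

# Hypothetical spectrum: a few strong directions and many weak ones (p = 100).
eigs = [1.0] * 5 + [0.01] * 95
r = effective_rank(eigs)
print(r, sdr_rate_order(r, n=10_000, eps=0.05, p=len(eigs)))
```

With a fast-decaying spectrum like this one, $r_\Sigma$ is far smaller than $p$, so the statistical term $r_\Sigma/n$ is correspondingly smaller than the dimension-dependent $p/n$.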
