Paper Title

Optimal estimation of high-dimensional location Gaussian mixtures

Authors

Natalie Doss, Yihong Wu, Pengkun Yang, Harrison H. Zhou

Abstract

This paper studies the optimal rate of estimation in a finite Gaussian location mixture model in high dimensions without separation conditions. We assume that the number of components $k$ is bounded and that the centers lie in a ball of bounded radius, while allowing the dimension $d$ to be as large as the sample size $n$. Extending the one-dimensional result of Heinrich and Kahn \cite{HK2015}, we show that the minimax rate of estimating the mixing distribution in Wasserstein distance is $\Theta((d/n)^{1/4} + n^{-1/(4k-2)})$, achieved by an estimator computable in time $O(nd^2+n^{5/4})$. Furthermore, we show that the mixture density can be estimated at the optimal parametric rate $\Theta(\sqrt{d/n})$ in Hellinger distance and provide a computationally efficient algorithm to achieve this rate in the special case of $k=2$. Both the theoretical and methodological development rely on a careful application of the method of moments. Central to our results is the observation that the information geometry of finite Gaussian mixtures is characterized by the moment tensors of the mixing distribution, whose low-rank structure can be exploited to obtain a sharp local entropy bound.
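To make the role of moments concrete, here is a minimal one-dimensional sketch in Python. It is not the authors' estimator. It relies only on the standard identity that, for $X = U + Z$ with $Z \sim N(0,1)$ independent of $U$, the probabilists' Hermite polynomial $He_j$ satisfies $E[He_j(X)] = E[U^j]$, so averaging $He_j$ over the sample gives an unbiased estimate of the $j$-th moment of the mixing distribution. The function names (`sample_mixture`, `denoised_moments`, `true_moments`) and the example centers and weights are hypothetical choices for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval


def sample_mixture(n, centers, weights, rng):
    """Draw n points from sum_i weights[i] * N(centers[i], 1) in one dimension."""
    centers = np.asarray(centers, dtype=float)
    idx = rng.choice(len(centers), size=n, p=weights)
    return centers[idx] + rng.standard_normal(n)


def denoised_moments(x, max_order):
    """Estimate E[U^j], j = 1..max_order, via sample means of He_j(x)."""
    estimates = []
    for j in range(1, max_order + 1):
        coef = np.zeros(j + 1)
        coef[j] = 1.0  # selects the single polynomial He_j
        estimates.append(hermeval(x, coef).mean())
    return np.array(estimates)


def true_moments(centers, weights, max_order):
    """Exact moments of the discrete mixing distribution."""
    centers = np.asarray(centers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.array([np.sum(weights * centers**j)
                     for j in range(1, max_order + 1)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers, weights = [-1.0, 0.5], [0.3, 0.7]  # assumed example values
    x = sample_mixture(200_000, centers, weights, rng)
    print("estimated:", denoised_moments(x, 3))
    print("true:     ", true_moments(centers, weights, 3))
```

With a large sample the two printed vectors should agree closely. Recovering the centers and weights from such moment estimates is a separate step that this sketch does not attempt.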
