Paper title
Certifying clusters from sum-of-norms clustering
Paper authors
Paper abstract
Sum-of-norms clustering is a clustering formulation based on convex optimization that automatically induces a hierarchy. Multiple algorithms have been proposed to solve the optimization problem: subgradient descent by Hocking et al., ADMM and ADA by Chi and Lange, a stochastic incremental algorithm by Panahi et al., and a semismooth Newton-CG augmented Lagrangian method by Sun et al. All of these algorithms yield approximate solutions, even though an exact solution is required to determine the correct cluster assignment. The purpose of this paper is to close the gap between the output of existing algorithms and the exact solution to the optimization problem. We present a clustering test that identifies and certifies the correct cluster assignment from an approximate solution yielded by any primal-dual algorithm. Our certification validates clustering for both unit and multiplicative weights. The test may not succeed if the approximation is inaccurate. However, we show that the correct cluster assignment is guaranteed to be certified by a primal-dual path-following algorithm after sufficiently many iterations, provided that the model parameter $\lambda$ avoids a finite number of bad values. Numerical experiments on Gaussian mixture and half-moon data indicate that carefully chosen multiplicative weights increase the recovery power of sum-of-norms clustering.
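To make the gap described in the abstract concrete, the following is a minimal Python sketch. It assumes the commonly used sum-of-norms objective, $\frac{1}{2}\sum_i \|x_i - a_i\|^2 + \lambda \sum_{i<j} w_{ij}\|x_i - x_j\|$, which the abstract itself does not write out. The function names `son_objective` and `naive_clusters` and the tolerance `tol` are hypothetical and chosen only for illustration. The second function is a simple distance-threshold heuristic for reading clusters off an approximate solution. It is not the certification test proposed in the paper, and it illustrates why such a test is useful: the result depends on an arbitrary tolerance.

```python
import numpy as np


def son_objective(X, A, lam, W=None):
    """Evaluate the assumed sum-of-norms objective.

    X: (n, d) candidate centroids, A: (n, d) data points,
    lam: model parameter lambda, W: optional (n, n) multiplicative weights
    (unit weights are used when W is None).
    """
    n = A.shape[0]
    if W is None:
        W = np.ones((n, n))  # unit weights
    fidelity = 0.5 * np.sum((X - A) ** 2)
    # Pairwise Euclidean distances between centroids.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    upper = np.triu_indices(n, k=1)  # count each pair i < j once
    fusion = np.sum(W[upper] * dists[upper])
    return fidelity + lam * fusion


def naive_clusters(X, tol):
    """Heuristic only (not the paper's test): merge points whose
    approximate centroids are within `tol` of each other."""
    n = X.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(X[i] - X[j]) <= tol:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [labels[r] for r in roots]
```

With an exact solution, points in the same cluster share an identical centroid, so no tolerance would be needed. With an approximate solution, centroids of the same cluster differ slightly, and a threshold that is too small or too large gives a wrong assignment.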