Paper Title

Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent

Paper Authors

Harry Dong, Tian Tong, Cong Ma, Yuejie Chi

Paper Abstract

An increasing number of data science and machine learning problems rely on computation with tensors, which better capture the multi-way relationships and interactions of data than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robust to corruptions and ill-conditioning. This paper tackles tensor robust principal component analysis (RPCA), which aims to recover a low-rank tensor from its observations contaminated by sparse corruptions, under the Tucker decomposition. To minimize the computation and memory footprints, we propose to directly recover the low-dimensional tensor factors -- starting from a tailored spectral initialization -- via scaled gradient descent (ScaledGD), coupled with an iteration-varying thresholding operation to adaptively remove the impact of corruptions. Theoretically, we establish that the proposed algorithm converges linearly to the true low-rank tensor at a constant rate that is independent of its condition number, as long as the level of corruptions is not too large. Empirically, we demonstrate that the proposed algorithm achieves better and more scalable performance than state-of-the-art matrix and tensor RPCA algorithms through synthetic experiments and real-world applications.
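To make the pipeline concrete, below is a minimal NumPy sketch of the procedure the abstract describes for a 3-way tensor: a spectral initialization of the Tucker factors from the thresholded observation, ScaledGD updates on the factors and core, and an iteration-varying hard-thresholding step to estimate the sparse corruptions. The function names, step size, initial threshold, and geometric threshold schedule here are illustrative assumptions, not the paper's exact parameters or guarantees.

```python
import numpy as np

def unfold(T, mode):
    """Mode-k unfolding: move axis `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Mode-k product T x_k M: multiply the matrix M into the k-th axis of T."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def tucker(G, Us):
    """Reconstruct X = G x_1 U1 x_2 U2 x_3 U3 from its Tucker factors."""
    X = G
    for k, U in enumerate(Us):
        X = mode_product(X, U, k)
    return X

def hard_threshold(R, zeta):
    """Keep only the entries of R whose magnitude exceeds zeta."""
    return np.where(np.abs(R) > zeta, R, 0.0)

def tensor_rpca_scaledgd(Y, ranks, eta=0.5, rho=0.9, iters=100):
    """Hedged sketch of ScaledGD for Tucker-format tensor RPCA (3-way).

    The geometric threshold schedule (decay rate `rho`), the step size,
    and the initial threshold are illustrative choices, not the paper's.
    """
    # Spectral initialization: threshold the observation, then take the
    # top-r_k left singular vectors of each mode-k unfolding.
    zeta = np.abs(Y).max()          # initial threshold keeps S empty
    S = hard_threshold(Y, zeta)
    Us = [np.linalg.svd(unfold(Y - S, k), full_matrices=False)[0][:, :r]
          for k, r in enumerate(ranks)]
    G = Y - S
    for k in range(3):
        G = mode_product(G, Us[k].T, k)

    for _ in range(iters):
        X = tucker(G, Us)
        # Iteration-varying thresholding: a decaying threshold gradually
        # reassigns large residual entries to the sparse component S.
        zeta *= rho
        S = hard_threshold(Y - X, zeta)
        E = X + S - Y               # gradient of 0.5*||X + S - Y||_F^2 in X
        new_Us = []
        for k in range(3):
            a, b = [j for j in range(3) if j != k]
            Vk = np.kron(Us[a], Us[b]) @ unfold(G, k).T    # "breve V_k"
            # Scaled factor step: preconditioning by (Vk^T Vk)^{-1} is what
            # makes the rate independent of the condition number.
            new_Us.append(Us[k] - eta * unfold(E, k) @ Vk
                          @ np.linalg.inv(Vk.T @ Vk))
        # Scaled core step: contract E with the pseudo-inverse of each U_k.
        Gg = E
        for k in range(3):
            Gg = mode_product(Gg, np.linalg.inv(Us[k].T @ Us[k]) @ Us[k].T, k)
        G, Us = G - eta * Gg, new_Us
    return tucker(G, Us), S

# Toy usage: recover a rank-(2, 2, 2) tensor from 5% sparse corruptions.
rng = np.random.default_rng(0)
Us_true = [np.linalg.qr(rng.standard_normal((20, 2)))[0] for _ in range(3)]
X_true = tucker(rng.standard_normal((2, 2, 2)), Us_true)
mask = rng.random(X_true.shape) < 0.05
Y = X_true + mask * rng.standard_normal(X_true.shape)
X_hat, S_hat = tensor_rpca_scaledgd(Y, (2, 2, 2))
print("relative error:", np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true))
```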
