Paper Title

Lower Bounds and a Near-Optimal Shrinkage Estimator for Least Squares using Random Projections

Paper Authors

Srivatsan Sridhar, Mert Pilanci, Ayfer Özgür

Paper Abstract

In this work, we consider deterministic optimization using random projections as a statistical estimation problem, where the squared distance between the estimator's prediction and the true solution is the error metric. In approximately solving a large-scale least squares problem using Gaussian sketches, we show that the sketched solution has a conditional Gaussian distribution with the true solution as its mean. First, tight worst-case lower bounds on the error, with explicit constants, are derived for any estimator using the Gaussian sketch, and classical sketching is shown to be the optimal unbiased estimator. For biased estimators, the lower bound also incorporates prior knowledge about the true solution. Second, we use the James-Stein estimator to derive an improved estimator for the least squares solution using the Gaussian sketch. An upper bound on the expected error of this estimator is derived, which is smaller than the error of the classical Gaussian sketch solution for any given data. The upper and lower bounds match when the SNR of the true solution is known to be small and the data matrix is well conditioned. Empirically, this estimator achieves smaller error on simulated and real datasets, and works for other common sketching methods as well.
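To make the setup concrete, the following is a minimal NumPy sketch of the pipeline the abstract describes: solve a least squares problem through a Gaussian sketch, then apply a James-Stein-style shrinkage to the sketched solution. The shrinkage factor below uses a plug-in variance estimate and is only illustrative; the paper derives the exact estimator and its constants.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy overdetermined least squares problem: A x ~ b, with a low-SNR truth.
n, d, m = 2000, 20, 200                   # data rows, columns, sketch size
A = rng.standard_normal((n, d))
x_true = 0.1 * rng.standard_normal(d)     # small-norm (low-SNR) solution
b = A @ x_true + rng.standard_normal(n)

# Classical Gaussian sketch: S has i.i.d. N(0, 1/m) entries; solving the
# m-row sketched problem approximates the full least squares solution.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# Plug-in estimate of the average per-coordinate variance of x_sk around
# the true least squares solution (an assumed approximation, not the
# paper's exact expression).
resid2 = np.sum((b - A @ x_sk) ** 2)
sigma2 = (resid2 / m) * np.trace(np.linalg.inv(A.T @ A)) / d

# Positive-part James-Stein shrinkage of the sketched solution toward zero.
shrink = max(0.0, 1.0 - (d - 2) * sigma2 / np.sum(x_sk ** 2))
x_js = shrink * x_sk
```

Shrinking toward zero is most beneficial in exactly the regime the abstract highlights: when the true solution's SNR is small, so the sketched solution's norm is dominated by sketching noise.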
