论文标题
通过较弱的近端甲骨
Faster Projection-Free Augmented Lagrangian Methods via Weak Proximal Oracle
论文作者
论文摘要
本文考虑了具有仿射约束的凸复合优化问题,其中包括在(简单)凸集的交点上最小化光滑的凸目标函数的形式,或使用多个(简单)函数正规化。由高维应用程序进行的激励,其中确切的投影/近端计算是无法处理的,我们提出了一个基于拉格朗日的方法\ textIt {fovidention-doffiction-nree-doffiction-forive-tovention-doffection-forive-toverion-texpimal {wpo)。在较早的工作中,WPO被证明比标准\ textit {线性最小化oracle}(lmo)强大,该(lmo)是基于条件梯度方法的基础(又称Frank-Wolfe方法)。此外,WPO在许多高维问题上是可以在计算上进行的,包括以低级矩阵和张力量的恢复,以及对允许有效LMO的多元化的优化。本文的主要结果表明,在一定的曲率假设(比强凸度弱)下,我们的基于WPO的算法对于客观残留和可行性差距而言,基于WPO的算法达到了$ O(1/t)$的依望收敛速率。据我们所知,该结果可改善此类问题现有的基于LMO的无投影方法的$ O(1/\ sqrt {t})$ rates。对低级别和稀疏协方差矩阵估计任务和最大切割半闪烁弛豫的经验实验表明,我们的方法可以超越基于LMO的最先进的基于Lagrangian的方法。
This paper considers a convex composite optimization problem with affine constraints, which includes problems that take the form of minimizing a smooth convex objective function over the intersection of (simple) convex sets, or regularized with multiple (simple) functions. Motivated by high-dimensional applications in which exact projection/proximal computations are not tractable, we propose a \textit{projection-free} augmented Lagrangian-based method, in which primal updates are carried out using a \textit{weak proximal oracle} (WPO). In an earlier work, WPO was shown to be more powerful than the standard \textit{linear minimization oracle} (LMO) that underlies conditional gradient-based methods (aka Frank-Wolfe methods). Moreover, WPO is computationally tractable for many high-dimensional problems of interest, including those motivated by recovery of low-rank matrices and tensors, and optimization over polytopes which admit efficient LMOs. The main result of this paper shows that under a certain curvature assumption (which is weaker than strong convexity), our WPO-based algorithm achieves an ergodic rate of convergence of $O(1/T)$ for both the objective residual and feasibility gap. This result, to the best of our knowledge, improves upon the $O(1/\sqrt{T})$ rate for existing LMO-based projection-free methods for this class of problems. Empirical experiments on a low-rank and sparse covariance matrix estimation task and the Max Cut semidefinite relaxation demonstrate that of our method can outperform state-of-the-art LMO-based Lagrangian-based methods.