论文标题
规模稀疏回归:分支和结合植根于一阶优化
Sparse Regression at Scale: Branch-and-Bound rooted in First-Order Optimization
论文作者
论文摘要
我们考虑最小二乘回归问题,并以$ \ ell_ {0} $的组合和平方$ \ ell_ {2} $惩罚函数(a.k.a. $ \ ell_0 \ ell_2 $正则化)。最近的工作表明,在许多高维统计环境中,由此产生的估计器至关重要。但是,这些估计器的精确计算仍然是一个主要挑战。实际上,当功能$ p \ sim 10^4 $的数量时,基于混合整数编程(MIP)的现代精确方法面临困难。在这项工作中,我们为$ \ ell_0 \ ell_2 $ regarlized回归提供了一个新的精确MIP框架,该回归可以扩展到$ p \ sim 10^7 $,与最新的精确方法相比,达到至少$ 5000 $ x的加速。与最近依靠现代商用MIP求解器的工作不同,我们通过严格利用问题结构来设计专门的非线性分支机构(BNB)框架。我们框架中的关键区分分量在于基于坐标下降(CD)的专用一阶方法有效地解决节点弛豫。我们的基于CD的方法可以通过使用温暖的开始,主动集和梯度筛选有效地利用BNB节点的信息。此外,我们设计了一种新的方法,用于从原始CD溶液中获得双重界限,该溶液可在高维度上使用该方法。关于合成和实际高维数据集的实验表明,我们的框架不仅要比最新的状态快得多,而且还可以为无法使用现有方法处理的统计上挑战性实例提供认证的最佳解决方案。我们通过工具包L0BNB开源实现。
We consider the least squares regression problem, penalized with a combination of the $\ell_{0}$ and squared $\ell_{2}$ penalty functions (a.k.a. $\ell_0 \ell_2$ regularization). Recent work shows that the resulting estimators are of key importance in many high-dimensional statistical settings. However, exact computation of these estimators remains a major challenge. Indeed, modern exact methods, based on mixed integer programming (MIP), face difficulties when the number of features $p \sim 10^4$. In this work, we present a new exact MIP framework for $\ell_0\ell_2$-regularized regression that can scale to $p \sim 10^7$, achieving speedups of at least $5000$x, compared to state-of-the-art exact methods. Unlike recent work, which relies on modern commercial MIP solvers, we design a specialized nonlinear branch-and-bound (BnB) framework, by critically exploiting the problem structure. A key distinguishing component in our framework lies in efficiently solving the node relaxations using a specialized first-order method, based on coordinate descent (CD). Our CD-based method effectively leverages information across the BnB nodes, through using warm starts, active sets, and gradient screening. In addition, we design a novel method for obtaining dual bounds from primal CD solutions, which certifiably works in high dimensions. Experiments on synthetic and real high-dimensional datasets demonstrate that our framework is not only significantly faster than the state of the art, but can also deliver certifiably optimal solutions to statistically challenging instances that cannot be handled with existing methods. We open source the implementation through our toolkit L0BnB.