Paper Title
Nonlinear Power Method for Computing Eigenvectors of Proximal Operators and Neural Networks
Paper Authors
Paper Abstract
Neural networks have revolutionized the field of data science, yielding remarkable solutions in a data-driven manner. For instance, in the field of mathematical imaging, they have surpassed traditional methods based on convex regularization. However, a fundamental theory supporting the practical applications is still in its early stages of development. We take a fresh look at neural networks and examine them via nonlinear eigenvalue analysis. The field of nonlinear spectral theory is still emerging, providing insights about nonlinear operators and systems. In this paper, we view a neural network as a complex nonlinear operator and attempt to find its nonlinear eigenvectors. We first discuss the existence of such eigenvectors and analyze the kernel of ReLU networks. We then study a nonlinear power method for generic nonlinear operators. For proximal operators associated with absolutely one-homogeneous convex regularization functionals, we can prove convergence of the method to an eigenvector of the proximal operator. This motivates us to apply the nonlinear power method to networks which are trained to act similarly to a proximal operator. In order to take the non-homogeneity of neural networks into account, we define a modified version of the power method. We perform extensive experiments for different proximal operators and on various shallow and deep neural networks designed for image denoising. Proximal eigenvectors are used for geometric analysis of graphs, such as for clustering or for computing distance functions. For simple neural nets, we observe the influence of training data on the eigenvectors. For state-of-the-art denoising networks, we show that eigenvectors can be interpreted as (un)stable modes of the network when contaminated with noise or other degradations.
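To make the scheme in the abstract concrete, below is a minimal sketch of the plain nonlinear power iteration in Python/NumPy. It is an illustration under our own assumptions, not the paper's exact algorithm: the operator interface, the soft-thresholding example, and all parameter values are illustrative, and the paper's modified, non-homogeneity-aware variant for trained networks is not reproduced here.

```python
import numpy as np

def nonlinear_power_method(T, u0, max_iter=1000, tol=1e-8):
    """Plain nonlinear power iteration: u_{k+1} = T(u_k) / ||T(u_k)||.

    A sketch of the generic scheme discussed in the abstract. `T` is any
    nonlinear operator mapping arrays to arrays, e.g. a proximal operator
    or a denoising network applied to a flattened image. At a fixed point,
    u approximately solves the nonlinear eigenproblem T(u) = lambda * u.
    """
    u = u0 / np.linalg.norm(u0)
    for _ in range(max_iter):
        Tu = T(u)
        norm_Tu = np.linalg.norm(Tu)
        if norm_Tu == 0.0:
            break  # u landed in the kernel of T (cf. the ReLU kernel discussion)
        u_next = Tu / norm_Tu
        if np.linalg.norm(u_next - u) < tol:
            u = u_next
            break
        u = u_next
    lam = float(np.dot(u, T(u)))  # Rayleigh-quotient-style eigenvalue estimate
    return u, lam

# Illustrative operator: the proximal map of the (absolutely one-homogeneous)
# l1 norm is soft thresholding; the threshold value 0.1 is an arbitrary choice.
def prox_l1(u, t=0.1):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

u0 = np.random.randn(64)
u, lam = nonlinear_power_method(prox_l1, u0)
```

Since soft thresholding is the proximal operator of an absolutely one-homogeneous convex functional, this is the setting in which the abstract states a convergence guarantee; applying the same loop to a trained denoising network would require the modified, non-homogeneity-aware iteration the paper defines.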