Paper Title


Bounding The Number of Linear Regions in Local Area for Neural Networks with ReLU Activations

Paper Authors

Rui Zhu, Bo Lin, Haixu Tang

Paper Abstract


The number of linear regions is one of the distinctive properties of neural networks that use piecewise linear activation functions such as ReLU, compared with conventional networks using other activation functions. Previous studies showed that this property reflects the expressivity of a neural network family ([14]); as a result, it can be used to characterize how the structural complexity of a neural network model affects the function it aims to compute. Nonetheless, it is challenging to compute the number of linear regions directly; therefore, many researchers have focused on estimating bounds (in particular the upper bound) on the number of linear regions of deep neural networks using ReLU. These methods, however, estimate the upper bound over the entire input space. Theoretical methods are still lacking for estimating the number of linear regions within a specific area of the input space, e.g., a sphere centered at a training data point, an adversarial example, or a backdoor trigger. In this paper, we present the first method to estimate the upper bound of the number of linear regions in any sphere in the input space of a given ReLU neural network. We implemented the method and computed the bounds for deep neural networks using piecewise linear activation functions. Our experiments showed that, while a neural network is being trained, the boundaries of the linear regions tend to move away from the training data points. In addition, we observed that spheres centered at training data points tend to contain more linear regions than spheres centered at arbitrary points in the input space. To the best of our knowledge, this is the first study bounding the number of linear regions around a specific data point. We consider our work a first step toward investigating the structural complexity of deep neural networks in a specific input area.
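To make the notion of "linear regions within a sphere" concrete: in a ReLU network, each input induces an on/off pattern over all ReLU units, and every distinct pattern corresponds to one linear region. The sketch below, which is not the paper's upper-bound method but an illustrative empirical lower bound under assumed random weights, samples points uniformly in a ball around a center and counts the distinct activation patterns it encounters.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer ReLU network with random weights (hypothetical, for illustration).
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)

def activation_pattern(x):
    """Return the on/off pattern of every ReLU unit at input x.
    Two inputs share a linear region iff they share this pattern."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0) + b2
    return tuple((h1 > 0).tolist() + (h2 > 0).tolist())

def count_regions_in_sphere(center, radius, n_samples=20000):
    """Empirical lower bound on the number of linear regions inside the
    ball B(center, radius): count distinct activation patterns among
    uniformly sampled points (the true count can only be larger)."""
    d = center.shape[0]
    # Uniform sampling in a d-ball: random direction times radius * U^(1/d).
    dirs = rng.normal(size=(n_samples, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = radius * rng.random(n_samples) ** (1.0 / d)
    pts = center + dirs * radii[:, None]
    return len({activation_pattern(p) for p in pts})

print(count_regions_in_sphere(np.zeros(2), 1.0))
```

Shrinking the radius toward zero drives the count to 1 (a single region around the center), while growing it lets the ball cross more ReLU boundaries; the paper's contribution is a theoretical upper bound on this quantity rather than a sampling estimate.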
