Paper Title
Information Theoretic Lower Bounds for Feed-Forward Fully-Connected Deep Networks
Paper Authors
Paper Abstract
In this paper, we study sample complexity lower bounds for the exact recovery of parameters, and for a positive excess risk, of a feed-forward, fully-connected neural network for binary classification, using information-theoretic tools. We prove these lower bounds via the existence of a generative network characterized by a backward data-generating process, in which the input is generated conditionally on the binary output and the network is parametrized by the weight matrices of its hidden layers. The sample complexity lower bound for exact recovery of the parameters is $\Omega(d r \log(r) + p)$, and for a positive excess risk it is $\Omega(r \log(r) + p)$, where $p$ is the dimension of the input, $r$ reflects the rank of the weight matrices, and $d$ is the number of hidden layers. To the best of our knowledge, our results are the first information-theoretic lower bounds for feed-forward fully-connected deep networks.
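To make the setup concrete, here is a minimal schematic of a backward data-generating process of the kind described in the abstract; the label distribution, the map $g$ from label to top hidden layer, and the activation $\phi$ are illustrative assumptions, not details taken from the paper:

$$
y \sim \mathrm{Unif}\{0,1\}, \qquad h_d = g(y), \qquad h_{k-1} = \phi\!\left(W_k h_k\right) \quad (k = d, \dots, 1), \qquad x = h_0 \in \mathbb{R}^p.
$$

Here $W_1, \dots, W_d$ are the hidden-layer weight matrices whose rank is reflected by $r$; a learner observes i.i.d. pairs $(x, y)$ and must either recover the $W_k$ exactly or achieve a small excess risk when predicting $y$ from $x$.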