Paper Title

A Neural Network Based on First Principles

Author

Baggenstoss, Paul M.

Abstract

In this paper, a neural network is derived from first principles, assuming only that each layer begins with a linear dimension-reducing transformation. The approach appeals to the principle of Maximum Entropy (MaxEnt) to find the posterior distribution of the input data of each layer, conditioned on the layer output variables. This posterior has a well-defined mean, the conditional mean estimator, which is calculated using a type of neural network with theoretically derived activation functions similar to the sigmoid, softplus, and ReLU. This implicitly provides a theoretical justification for their use. A theorem is proposed that finds the conditional distribution and conditional mean estimator under the MaxEnt prior, unifying results for special cases. Combining layers results in an auto-encoder with a conventional feed-forward analysis network and a type of linear Bayesian belief network in the reconstruction path.
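The layer structure described in the abstract can be illustrated with a minimal sketch: a linear dimension-reducing transformation followed by a nonlinear activation on the analysis (feed-forward) path, and a linear map back on the reconstruction path. This is not the paper's derivation; the random weight matrix `W`, the choice of softplus as the activation, and the pseudo-inverse reconstruction are all illustrative assumptions standing in for the MaxEnt-derived conditional mean estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)); the paper derives
    # activations of this general shape from the MaxEnt posterior
    # (the exact form depends on the prior), so this is a stand-in.
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

# Illustrative layer: linear dimension reduction from N inputs to M < N
# outputs, followed by a conditional-mean-style nonlinearity.
N, M = 8, 3
W = rng.standard_normal((M, N)) / np.sqrt(N)  # placeholder weights

def analysis(x):
    # Feed-forward (analysis) path: z = f(W x)
    return softplus(W @ x)

def reconstruct(z):
    # Reconstruction path sketched as a linear (pseudo-inverse) map,
    # standing in for the paper's linear Bayesian belief network.
    return np.linalg.pinv(W) @ z

x = rng.standard_normal(N)
z = analysis(x)       # reduced-dimension representation, shape (3,)
x_hat = reconstruct(z)  # reconstruction in the input space, shape (8,)
```

Stacking such layers gives the auto-encoder structure the abstract mentions: the analysis networks compose into a conventional feed-forward network, while the reconstruction maps compose into the linear reconstruction path.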
