Title
On Spectral Bias Reduction of Multi-scale Neural Networks for Regression Problems
Authors
Abstract
In this paper, we derive diffusion equation models in the spectral domain for the evolution of training errors of two-layer multi-scale deep neural networks (MscaleDNN) \cite{caixu2019,liu2020multi}, which are designed to reduce the spectral bias of fully connected deep neural networks in approximating oscillatory functions. The diffusion models are obtained from the spectral form of the error equation of the MscaleDNN, derived with a neural tangent kernel approach under gradient descent training with a sine activation function, assuming a vanishing learning rate and infinite network width and domain size. The diffusion coefficients involved are shown to have larger supports when more scales are used in the MscaleDNN; thus, the proposed diffusion equation models in the frequency domain explain the MscaleDNN's spectral bias reduction capability. Numerical results of the diffusion models for a two-layer MscaleDNN match the error evolution of actual gradient descent training with a reasonably large network width, validating the effectiveness of the diffusion models. Meanwhile, the numerical results for the MscaleDNN show error decay over a wide frequency range and confirm the advantage of using the MscaleDNN to approximate functions with a wide range of frequencies.
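To make the architecture concrete, below is a minimal NumPy sketch of the two-layer multi-scale network the abstract describes: the input is multiplied by a set of scale factors before entering sine-activated sub-networks, so the features span a wider frequency band. The scale values, parameter shapes, and initialization here are hypothetical illustrations, not the exact setup of \cite{caixu2019,liu2020multi}.

```python
import numpy as np

def mscale_dnn_forward(x, scales, W, b, a):
    """Forward pass of a two-layer multi-scale network (sketch).

    Computes f(x) = sum_i a_i . sin(W_i * (s_i * x) + b_i),
    where s_i are the scale factors and (W_i, b_i, a_i) are the
    per-scale hidden weights, biases, and output weights.
    Larger s_i let the sine features represent higher frequencies,
    which is the mechanism behind the spectral bias reduction.
    """
    out = 0.0
    for s, Wi, bi, ai in zip(scales, W, b, a):
        hidden = np.sin(Wi * (s * x) + bi)  # sine-activated hidden layer
        out = out + ai @ hidden             # linear output layer
    return out

# Hypothetical usage: 3 scales, 8 hidden units per scale.
rng = np.random.default_rng(0)
m = 8
scales = [1.0, 2.0, 4.0]
W = [rng.standard_normal(m) for _ in scales]
b = [rng.standard_normal(m) for _ in scales]
a = [rng.standard_normal(m) / m for _ in scales]
y = mscale_dnn_forward(0.5, scales, W, b, a)
```

In actual MscaleDNN training, the per-scale parameters would be fit by gradient descent on a regression loss; the sketch only shows the multi-scale forward structure.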