Paper Title
Fractional Deep Neural Network via Constrained Optimization
Paper Authors
Paper Abstract
This paper introduces a novel algorithmic framework for a deep neural network (DNN) which, in a mathematically rigorous manner, allows us to incorporate history (or memory) into the network: it ensures that all layers are connected to one another. This DNN, called Fractional-DNN, can be viewed as a time-discretization of a fractional-in-time nonlinear ordinary differential equation (ODE). The learning problem is then a minimization problem subject to that fractional ODE as a constraint. We emphasize that the analogy between existing DNNs and ODEs with a standard time derivative is by now well known; the focus of our work is the Fractional-DNN. Using the Lagrangian approach, we derive the backward propagation and the design equations. We test our network on several datasets for classification problems. Fractional-DNN offers various advantages over existing DNNs. The key benefits are a significant improvement to the vanishing-gradient issue, due to the memory effect, and better handling of nonsmooth data, due to the network's ability to approximate nonsmooth functions.
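To make the memory mechanism concrete, the sketch below shows one way such a forward pass can look, assuming an L1 discretization of a Caputo time derivative of order γ ∈ (0, 1), which is a standard scheme for fractional-in-time ODEs. The function names, layer widths, and the tanh activation are illustrative assumptions, not necessarily the paper's exact design.

```python
import numpy as np
from math import gamma as Gamma

def l1_weights(n, gam):
    """L1-scheme weights a_j = (j+1)^(1-gam) - j^(1-gam), j = 0..n-1,
    for a Caputo derivative of order gam in (0, 1)."""
    j = np.arange(n)
    return (j + 1.0) ** (1.0 - gam) - j ** (1.0 - gam)

def fractional_forward(x, Ws, bs, gam=0.5, tau=1.0, act=np.tanh):
    """Forward pass of a fractional-DNN sketch.

    Discretizes  d^gam u / dt^gam = act(W u + b)  with the L1 scheme:
        u_n = u_{n-1} - sum_{j=1}^{n-1} a_j (u_{n-j} - u_{n-j-1})
              + Gamma(2-gam) * tau**gam * act(W_n u_{n-1} + b_n),
    so layer n sees a weighted history of ALL earlier layer states
    (the memory effect). As gam -> 1 the history weights a_j (j >= 1)
    vanish and the update reduces to a ResNet-style Euler step.
    """
    c = Gamma(2.0 - gam) * tau ** gam   # scaling from the L1 discretization
    u = [x]                             # u[k] is the state after layer k
    for n, (W, b) in enumerate(zip(Ws, bs), start=1):
        a = l1_weights(n, gam)
        # History term: sum_{j=1}^{n-1} a_j * (u_{n-j} - u_{n-j-1}).
        hist = sum(a[j] * (u[n - j] - u[n - j - 1]) for j in range(1, n))
        u.append(u[-1] - hist + c * act(W @ u[-1] + b))
    return u[-1]

# Toy usage with hypothetical shapes: 4 layers of width 3.
rng = np.random.default_rng(0)
Ws = [0.1 * rng.standard_normal((3, 3)) for _ in range(4)]
bs = [np.zeros(3) for _ in range(4)]
out = fractional_forward(rng.standard_normal(3), Ws, bs, gam=0.5)
```

In the constrained-optimization formulation described in the abstract, the backward propagation would then be derived from the Lagrangian of the loss subject to this layer update as a constraint; the sketch above covers only the forward map.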