Paper Title

AdaSwarm: Augmenting Gradient-Based Optimizers in Deep Learning with Swarm Intelligence

Authors

Rohan Mohapatra, Snehanshu Saha, Carlos A. Coello Coello, Anwesh Bhattacharya, Soma S. Dhavala, Sriparna Saha

Abstract

This paper introduces AdaSwarm, a novel gradient-free optimizer with performance similar to, or even better than, the Adam optimizer adopted in neural networks. In order to support the proposed AdaSwarm, a novel Exponentially weighted Momentum Particle Swarm Optimizer (EMPSO) is proposed. The ability of AdaSwarm to tackle optimization problems is attributed to its capability to perform good gradient approximations. We show that the gradient of any function, differentiable or not, can be approximated using the parameters of EMPSO. This is a novel technique to simulate gradient descent (GD) that lies at the boundary between numerical methods and swarm intelligence. Mathematical proofs of the gradient approximation produced are also provided. AdaSwarm competes closely with several state-of-the-art (SOTA) optimizers. We also show that AdaSwarm is able to handle a variety of loss functions during backpropagation, including the maximum absolute error (MAE).
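
The abstract names an Exponentially weighted Momentum PSO (EMPSO) as the component that makes AdaSwarm's gradient approximation possible. The sketch below is a minimal, illustrative Python implementation of a particle swarm whose velocity update carries an exponentially weighted momentum term; the exact update rule, the hyperparameter values (beta, c1, c2), the toy objective, and the variable names are assumptions made here for illustration, not the paper's exact formulation, and the gradient-proxy step that AdaSwarm derives from these parameters is not reproduced.

# Minimal sketch of a PSO variant with exponentially weighted velocity
# momentum, in the spirit of the EMPSO described above. Update form and
# hyperparameters are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Toy non-smooth loss (absolute error around 3.0): the swarm needs
    # no analytic gradient to minimize it.
    return np.abs(x - 3.0)

n_particles, n_steps = 20, 200
beta, c1, c2 = 0.9, 1.5, 1.5            # assumed hyperparameters

x = rng.uniform(-10, 10, n_particles)   # particle positions
v = np.zeros(n_particles)               # velocities
m = np.zeros(n_particles)               # exponentially weighted momentum
pbest = x.copy()                        # per-particle best positions
pbest_val = objective(x)
gbest = pbest[np.argmin(pbest_val)]     # swarm-wide best position

for _ in range(n_steps):
    r1, r2 = rng.random(n_particles), rng.random(n_particles)
    # Exponentially weighted momentum replaces the usual inertia term
    # (assumed form: m <- beta*m + (1 - beta)*v).
    m = beta * m + (1.0 - beta) * v
    v = m + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = x + v
    val = objective(x)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = x[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)]

print("swarm estimate of the minimizer:", gbest)

In AdaSwarm, per the abstract, quantities of this kind (the coefficients and the best-position terms) are further used to approximate the gradient of the loss so that the swarm can stand in for GD during backpropagation; the precise approximation formula and its proof are given in the paper itself.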
