Paper Title

Federated Learning Using Variance Reduced Stochastic Gradient for Probabilistically Activated Agents

Paper Authors

Rostami, M. R., Kia, S. S.

Paper Abstract

This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution, in the setting where each agent has an arbitrary probability of selection in each iteration. In distributed machine learning, FL is a practical tool when privacy matters. When FL is deployed in an environment with irregular connections among agents (devices), reaching a trained model both economically and quickly can be demanding. The first layer of our algorithm corresponds to the propagation of model parameters across agents, carried out by the server. In the second layer, each agent performs its local update with a stochastic, variance-reduced technique called Stochastic Variance Reduced Gradient (SVRG). We leverage the concept of variance reduction from stochastic optimization in the agents' local update steps to reduce the variance caused by stochastic gradient descent (SGD). We provide a convergence bound for our algorithm which improves the rate from $O(\frac{1}{\sqrt{K}})$ to $O(\frac{1}{K})$ by using a constant step size. We demonstrate the performance of our algorithm using numerical examples.
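To make the two-layer structure concrete, below is a minimal sketch (not the authors' implementation) of one communication round under the setup described in the abstract: the server broadcasts the current model, each agent is independently activated with its own selection probability, and activated agents run SVRG-style local updates with a constant step size. The names (`Agent`, `fl_round`), the least-squares local objective, and the `1/p` reweighting in the aggregation are illustrative assumptions, not details taken from the paper.

```python
# Sketch of one round of a two-layer FL scheme with probabilistic agent
# activation and SVRG local updates. Illustrative only; all names and the
# aggregation rule are assumptions, not the paper's exact algorithm.
import numpy as np

class Agent:
    def __init__(self, X, y, activation_prob, step_size=0.1, local_steps=20):
        self.X, self.y = X, y          # local least-squares data (assumed objective)
        self.p = activation_prob       # this agent's probability of being selected
        self.eta = step_size           # constant step size
        self.T = local_steps           # number of local SVRG steps

    def grad(self, w, idx=None):
        """Gradient of 0.5*||Xw - y||^2 / n, over all local data or one sample."""
        X = self.X if idx is None else self.X[idx:idx + 1]
        y = self.y if idx is None else self.y[idx:idx + 1]
        return X.T @ (X @ w - y) / len(y)

    def svrg_update(self, w_server):
        """Local SVRG pass started from the model received from the server."""
        w_ref = w_server.copy()
        mu = self.grad(w_ref)                    # full local gradient at the reference point
        w = w_ref.copy()
        for _ in range(self.T):
            j = np.random.randint(len(self.y))   # sample one local data point
            # variance-reduced stochastic gradient direction
            v = self.grad(w, j) - self.grad(w_ref, j) + mu
            w = w - self.eta * v
        return w

def fl_round(w, agents):
    """One communication round: broadcast, probabilistic activation, aggregation."""
    updates, weights = [], []
    for a in agents:
        if np.random.rand() < a.p:               # agent activated with its own probability
            updates.append(a.svrg_update(w))
            weights.append(1.0 / a.p)            # assumed reweighting to offset unequal activation
    if not updates:
        return w                                  # no agent activated this round
    return np.average(updates, axis=0, weights=weights)
```

The variance-reduced direction $\nabla f_j(w) - \nabla f_j(\tilde{w}) + \mu$ has lower variance near the reference point than a plain SGD gradient, which is what makes the constant step size (and the resulting $O(\frac{1}{K})$ rate mentioned above) plausible in this kind of scheme.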
