Paper Title
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training
Paper Authors
Paper Abstract
We introduce FedDCT, a novel distributed learning paradigm that enables the use of large, high-performance CNNs on resource-limited edge devices. As opposed to traditional FL approaches, which require each client to train the full-size neural network independently during each training round, the proposed FedDCT allows a cluster of several clients to collaboratively train a large deep learning model by dividing it into an ensemble of several small sub-models and training them on multiple devices in parallel while maintaining privacy. In this collaborative training process, clients from the same cluster can also learn from each other, further improving their ensemble performance. In the aggregation stage, the server takes a weighted average of all the ensemble models trained by all the clusters. FedDCT reduces the memory requirements and allows low-end devices to participate in FL. We conduct extensive experiments on standardized datasets, including CIFAR-10, CIFAR-100, and two real-world medical datasets, HAM10000 and VAIPE. Experimental results show that FedDCT outperforms a set of current SOTA FL methods with interesting convergence behaviors. Furthermore, compared to other existing approaches, FedDCT achieves higher accuracy and substantially reduces the number of communication rounds (with $4$-$8\times$ lower memory requirements) needed to reach the desired accuracy on the test dataset, without incurring any extra training cost on the server side.
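To make the aggregation step concrete, below is a minimal sketch of how a server could weighted-average the ensembles of sub-models returned by each cluster, as described in the abstract. This is an illustrative reconstruction, not the paper's implementation: the function name `aggregate_ensembles`, the representation of an ensemble as a list of parameter dictionaries, and the per-cluster weights are all assumptions.

```python
# Illustrative sketch of FedDCT-style server aggregation (assumed interface):
# each cluster returns an ensemble of S sub-model parameter sets, and the
# server forms a weighted average across clusters, separately for each sub-model.
from typing import Dict, List
import torch


def aggregate_ensembles(
    cluster_ensembles: List[List[Dict[str, torch.Tensor]]],  # [cluster][sub-model] -> state dict
    cluster_weights: List[float],                             # e.g. proportional to cluster data size
) -> List[Dict[str, torch.Tensor]]:
    """Weighted-average the k-th sub-model across all clusters, for each k."""
    total = sum(cluster_weights)
    num_sub_models = len(cluster_ensembles[0])
    global_ensemble: List[Dict[str, torch.Tensor]] = []
    for k in range(num_sub_models):
        averaged: Dict[str, torch.Tensor] = {}
        for name in cluster_ensembles[0][k]:
            # Sum the normalized, weighted parameters of sub-model k over all clusters.
            averaged[name] = sum(
                (w / total) * ens[k][name].float()
                for ens, w in zip(cluster_ensembles, cluster_weights)
            )
        global_ensemble.append(averaged)
    return global_ensemble
```

The averaged ensemble can then be broadcast back to the clusters for the next communication round, mirroring the FedAvg-style aggregation the abstract describes but applied per sub-model rather than to one full-size network.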