Paper Title

Federated Latent Class Regression for Hierarchical Data

Paper Authors

Bin Yang, Thomas Carette, Masanobu Jimbo, Shinya Maruyama

Paper Abstract

Federated Learning (FL) allows a number of agents to participate in training a global machine learning model without disclosing locally stored data. Compared to traditional distributed learning, the heterogeneity (non-IID nature) of the agents' data slows down convergence in FL. Furthermore, many datasets, being too noisy or too small, are easily overfitted by complex models such as deep neural networks. Here, we consider the problem of applying FL regression to noisy, hierarchical, and tabular datasets in which user distributions differ significantly. Inspired by Latent Class Regression (LCR), we propose a novel probabilistic model, Hierarchical Latent Class Regression (HLCR), and its extension to Federated Learning, FedHLCR. FedHLCR consists of a mixture of linear regression models, allowing better accuracy than simple linear regression while maintaining its analytical properties and avoiding overfitting. Our inference algorithm, derived from Bayesian theory, provides strong convergence guarantees and good robustness to overfitting. Experimental results show that FedHLCR offers fast convergence even on non-IID datasets.
