When Homomorphic Encryption Marries Secret Sharing: Secure Large-Scale Sparse Logistic Regression and Applications in Risk Control

Paper title

When Homomorphic Encryption Marries Secret Sharing: Secure Large-Scale Sparse Logistic Regression and Applications in Risk Control

Authors

Chaochao Chen, Jun Zhou, Li Wang, Xibin Wu, Wenjing Fang, Jin Tan, Lei Wang, Alex X. Liu, Hao Wang, Cheng Hong

Abstract

Logistic Regression (LR) is the most widely used machine learning model in industry because of its efficiency, robustness, and interpretability. Because of data isolation and the demand for high model performance, many industrial applications call for building a secure and efficient LR model across multiple parties. Most existing work uses either Homomorphic Encryption (HE) or Secret Sharing (SS) to build secure LR. HE-based methods can handle high-dimensional sparse features, but they incur potential security risks. SS-based methods have provable security, but they suffer from efficiency issues under high-dimensional sparse features. In this paper, we first present CAESAR, which combines HE and SS to build a secure large-scale sparse logistic regression model and achieves both efficiency and security. We then present a distributed implementation of CAESAR to meet scalability requirements. We have deployed CAESAR in a risk control task and conducted comprehensive experiments. Our experimental results show that CAESAR improves the state-of-the-art model by around 130 times.
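
One plausible reading of the efficiency issue mentioned in the abstract is that additive secret sharing turns a sparse feature vector into dense random-looking shares, so sparsity can no longer be exploited. The sketch below illustrates that effect with plain two-party additive sharing. It is not the CAESAR protocol: the ring size `MODULUS`, the helper names, and the example row are assumptions made for illustration only.

```python
import secrets

MODULUS = 2 ** 64  # ring size; an assumption for this sketch


def share(values):
    """Split each integer into two additive shares modulo MODULUS."""
    shares_a = [secrets.randbelow(MODULUS) for _ in values]
    shares_b = [(v - a) % MODULUS for v, a in zip(values, shares_a)]
    return shares_a, shares_b


def reconstruct(shares_a, shares_b):
    """Recombine two share vectors into the original values."""
    return [(a + b) % MODULUS for a, b in zip(shares_a, shares_b)]


def nonzero_count(values):
    """Count the entries that are not zero."""
    return sum(1 for v in values if v != 0)


# Hypothetical sparse feature row: 1000 dimensions, 3 nonzero entries.
sparse_row = [0] * 1000
for index in (7, 420, 999):
    sparse_row[index] = 1

row_a, row_b = share(sparse_row)

print("nonzeros in plaintext row:", nonzero_count(sparse_row))
print("nonzeros in share A:", nonzero_count(row_a))
print("nonzeros in share B:", nonzero_count(row_b))
print("reconstruction matches:", reconstruct(row_a, row_b) == sparse_row)
```

Each share on its own is uniformly random, which is where the security of the scheme comes from, but for the same reason almost every entry of each share is nonzero, so any computation on the shares costs as much as it would on a dense vector.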
