Paper Title

Efficient ML Models for Practical Secure Inference

Authors

Vinod Ganesan, Anwesh Bhattacharya, Pratyush Kumar, Divya Gupta, Rahul Sharma, Nishanth Chandran

Abstract

ML-as-a-service continues to grow, and so does the need for very strong privacy guarantees. Secure inference has emerged as a potential solution, wherein cryptographic primitives allow inference without revealing the user's input to the model provider or the model's weights to the user. For instance, the model provider could be a diagnostics company that has trained a state-of-the-art DenseNet-121 model for interpreting chest X-rays, and the user could be a patient at a hospital. While secure inference is in principle feasible for this setting, no existing techniques make it practical at scale. The CrypTFlow2 framework provides a potential solution with its ability to automatically and correctly translate clear-text inference to secure inference for arbitrary models. However, the resulting secure inference from CrypTFlow2 is impractically expensive: almost 3 TB of communication is required to interpret a single X-ray on DenseNet-121. In this paper, we address this outstanding challenge of the inefficiency of secure inference with three contributions. First, we show that the primary bottlenecks in secure inference are large linear layers, which can be optimized through the choice of network backbone and the use of operators developed for efficient clear-text inference. This finding deviates from many recent works, which focus on optimizing non-linear activation layers when performing secure inference of smaller networks. Second, based on an analysis of a bottlenecked convolution layer, we design an X-operator that is a more efficient drop-in replacement. Third, we show that the fast Winograd convolution algorithm further improves the efficiency of secure inference. In combination, these three optimizations prove highly effective for the problem of X-ray interpretation on models trained on the CheXpert dataset.
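The Winograd speedup mentioned in the abstract comes from trading multiplications for additions: in secret-sharing-based secure inference, each multiplication of secret values incurs cryptographic communication, so fewer multiplications means less traffic. Below is a minimal sketch of the 1D Winograd F(2,3) transform, which computes two outputs of a 3-tap convolution with 4 multiplications instead of the naive 6; the abstract's setting uses the 2D analogue inside convolution layers, and the function names here are illustrative, not from the paper.

```python
def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap filter applied to a
    4-sample window, using 4 multiplications instead of the naive 6."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (precomputable once per filter; additions and
    # halving only).
    G0 = g0
    G1 = (g0 + g1 + g2) / 2
    G2 = (g0 - g1 + g2) / 2
    G3 = g2
    # The 4 element-wise multiplications -- the expensive operations
    # in a secure (secret-shared) evaluation.
    m1 = (d0 - d2) * G0
    m2 = (d1 + d2) * G1
    m3 = (d2 - d1) * G2
    m4 = (d1 - d3) * G3
    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


def naive_conv(d, g):
    """Direct 3-tap correlation for comparison: 6 multiplications."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
```

The input and output transforms use only additions and multiplications by public constants, which are essentially free under additive secret sharing; only the four products of secret values cost communication, which is the intuition for why Winograd helps in this setting.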
