Paper title
Hermes Attack: Steal DNN Models with Lossless Inference Accuracy
Authors
Abstract
Deep neural network (DNN) models have become some of the most valuable enterprise assets because of their critical role across a wide range of applications. As DNN models are increasingly deployed privately, leakage of these models is becoming more serious and widespread. All existing model-extraction attacks leak only part of the targeted DNN model, and do so with low accuracy or high overhead. In this paper, we first identify a new attack surface for leaking DNN models: unencrypted PCIe traffic. Based on this attack surface, we propose a novel model-extraction attack, the Hermes Attack, which is the first attack to steal the entire victim DNN model. The stolen DNN models have the same hyper-parameters, the same parameters, and a semantically identical architecture as the original ones. The attack is challenging because the CUDA runtime, driver, and GPU internals are closed-source, the data structures are undocumented, and some critical semantics are lost in the PCIe traffic. In addition, the traffic consists of millions of PCIe packets that contain substantial noise and arrive out of order. The Hermes Attack addresses these issues through extensive reverse engineering and reliable semantic reconstruction, together with careful packet selection and order correction. We implement a prototype of the Hermes Attack and evaluate two sequential DNN models (MNIST and VGG) and one non-sequential DNN model (ResNet) on three NVIDIA GPU platforms: the NVIDIA GeForce GT 730, the NVIDIA GeForce GTX 1080 Ti, and the NVIDIA GeForce RTX 2080 Ti. The evaluation results indicate that our scheme efficiently and completely reconstructs all of them while inference is performed on any single image. Evaluated on the CIFAR-10 test dataset, which contains 10,000 images, the stolen models have the same inference accuracy as the original ones (i.e., lossless inference accuracy).
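The "lossless inference accuracy" criterion in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the helper names (`accuracy`, `agreement`, `is_lossless`) and the toy label lists are assumptions made purely for illustration, and the predictions stand in for the outputs of an original model and a stolen model on the same labeled test set.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("the test set must not be empty")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def agreement(preds_a, preds_b):
    """Fraction of inputs on which two models predict the same class."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both prediction lists must have the same length")
    if not preds_a:
        raise ValueError("the prediction lists must not be empty")
    same = sum(1 for a, b in zip(preds_a, preds_b) if a == b)
    return same / len(preds_a)


def is_lossless(original_preds, stolen_preds, labels):
    """True when both models reach exactly the same accuracy on the labels."""
    return accuracy(original_preds, labels) == accuracy(stolen_preds, labels)


# Hypothetical toy data: five test images with class labels.
labels = [0, 1, 2, 1, 0]
original_preds = [0, 1, 2, 0, 0]   # original model: 4 of 5 correct
stolen_preds = [0, 1, 2, 0, 0]     # exact copy: identical predictions
other_preds = [0, 1, 1, 1, 0]      # different model, also 4 of 5 correct

print(accuracy(original_preds, labels))
print(agreement(original_preds, stolen_preds))
print(is_lossless(original_preds, other_preds, labels))
print(agreement(original_preds, other_preds))
```

Equal accuracy alone does not imply identical behavior: `other_preds` matches the original model's accuracy while disagreeing with it on two inputs. A model whose parameters and architecture were recovered exactly, as the abstract claims, would be expected to show full agreement as well, which is why the sketch reports both quantities.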