Paper Title
Extracting Targeted Training Data from ASR Models, and How to Mitigate It
Paper Authors
Paper Abstract
Recent work has designed methods to demonstrate that model updates in ASR training can leak potentially sensitive attributes of the utterances used in computing the updates. In this work, we design the first method to demonstrate information leakage about training data from trained ASR models. We design Noise Masking, a fill-in-the-blank style method for extracting targeted parts of training data from trained ASR models. We demonstrate the success of Noise Masking by using it in four settings to extract names from the LibriSpeech dataset used to train a state-of-the-art Conformer model. In particular, we show that we are able to extract the correct names from masked training utterances with 11.8% accuracy, while the model outputs some name from the train set 55.2% of the time. Further, we show that even in a setting that uses synthetic audio and partial transcripts from the test set, our method achieves 2.5% correct name accuracy (47.7% any name success rate). Lastly, we design Word Dropout, a data augmentation method that, when used in training along with Multistyle TRaining (MTR), provides utility comparable to the baseline while significantly mitigating extraction via Noise Masking across the four evaluated settings.
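The core idea of the fill-in-the-blank attack can be illustrated with a minimal sketch: replace the audio span corresponding to a targeted word (e.g. a name) with noise, then transcribe the masked utterance and check whether the ASR model "fills in" a word from its training data. The sketch below is an assumption-laden illustration, not the paper's implementation; the `noise_mask` helper, the use of white noise scaled to the utterance's RMS level, and the availability of word-level time alignments are all assumptions for demonstration purposes.

```python
import numpy as np


def noise_mask(audio: np.ndarray, sr: int, start_s: float, end_s: float,
               seed: int = 0) -> np.ndarray:
    """Replace the [start_s, end_s) span of a waveform with white noise.

    Illustrative only: mimics masking a targeted word (e.g. a name) in an
    utterance. The masked audio would then be fed to a trained ASR model,
    and the decoded word at the masked position compared to the original.
    """
    rng = np.random.default_rng(seed)
    out = audio.copy()
    i, j = int(start_s * sr), int(end_s * sr)
    # Scale the noise to roughly match the utterance's overall RMS energy
    # so the masked span is not trivially silent or clipped.
    rms = float(np.sqrt(np.mean(audio ** 2))) or 1.0
    out[i:j] = rng.normal(0.0, rms, size=j - i)
    return out
```

A hypothetical evaluation loop would call `noise_mask` on each utterance at the aligned span of the name, run the model on the result, and count how often the hypothesis contains the correct name (the 11.8% figure) versus any training-set name (the 55.2% figure).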