Paper Title
Model-Free Prediction of Partially Observable Spatiotemporal Chaotic Systems
A Speech Representation Anonymization Framework via Selective Noise Perturbation
Paper Authors
Paper Abstract
Privacy and security are major concerns when sending speech signals to cloud services such as automatic speech recognition (ASR) and speech emotion recognition (SER). Existing solutions for speech anonymization mainly rely on voice conversion or voice modification to convert a raw utterance into another one with similar content but different, or no, identity-related information. However, the alternative approach of sharing speech data in the form of privacy-preserving representations remains largely under-explored. In this paper, we propose a speech anonymization framework that achieves privacy by applying noise perturbation to a selected subset of the high-utility representations extracted with a pre-trained speech encoder. The subset is chosen by a Transformer-based privacy-risk saliency estimator. We validate our framework on four tasks for privacy and utility assessment: Automatic Speaker Verification (ASV), ASR, SER, and Intent Classification (IC). Experimental results show that our approach achieves competitive, or even better, utility compared to the speech anonymization baselines from the VoicePrivacy 2022 Challenge while providing the same level of privacy. Moreover, because the amount of perturbation is easily controlled, our framework offers a flexible range of privacy-utility trade-offs without re-training any component.
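The selective perturbation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `perturb_selected`, the choice to select whole feature dimensions, the Gaussian noise model, and the `ratio` and `noise_std` parameters are all assumptions made for illustration. The saliency scores are passed in directly as a stand-in for the output of the Transformer-based privacy-risk saliency estimator, which is not modeled here.

```python
import random


def perturb_selected(features, saliency, ratio, noise_std, seed=0):
    """Add Gaussian noise to the most privacy-salient feature dimensions.

    features  : list of frames, each a list of floats (encoder output, assumed shape)
    saliency  : one privacy-risk score per feature dimension (hypothetical input)
    ratio     : fraction of dimensions to perturb, between 0 and 1
    noise_std : standard deviation of the added noise
    """
    rng = random.Random(seed)
    dim = len(saliency)
    k = int(ratio * dim)  # number of dimensions to perturb

    # Rank dimensions by saliency and keep the k highest-risk ones.
    ranked = sorted(range(dim), key=lambda i: saliency[i], reverse=True)
    selected = set(ranked[:k])

    # Perturb only the selected dimensions; copy the rest unchanged.
    perturbed = [
        [x + rng.gauss(0.0, noise_std) if i in selected else x
         for i, x in enumerate(frame)]
        for frame in features
    ]
    return perturbed, selected


# Two frames of a 4-dimensional representation with made-up saliency scores.
frames = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
scores = [0.1, 0.9, 0.3, 0.7]
noisy, chosen = perturb_selected(frames, scores, ratio=0.5, noise_std=1.0)
```

In this sketch, `ratio` and `noise_std` are the two knobs that set how much is perturbed. Changing them requires no training step, which mirrors the abstract's claim that the trade-off can be adjusted without re-training any component.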