A Girl Has A Name: Detecting Authorship Obfuscation

Paper title

A Girl Has A Name: Detecting Authorship Obfuscation

Authors

Asad Mahmood, Zubair Shafiq, Padmini Srinivasan

Abstract

Authorship attribution aims to identify the author of a text based on stylometric analysis. Authorship obfuscation, on the other hand, aims to protect against authorship attribution by modifying a text's style. In this paper, we evaluate the stealthiness of state-of-the-art authorship obfuscation methods under an adversarial threat model. An obfuscator is stealthy to the extent that an adversary finds it challenging to detect whether or not a text modified by the obfuscator is obfuscated, a decision that is key to an adversary interested in authorship attribution. We show that the existing authorship obfuscation methods are not stealthy, as their obfuscated texts can be identified with an average F1 score of 0.87. The reason for the lack of stealthiness is that these obfuscators degrade text smoothness, as ascertained by neural language models, in a detectable manner. Our results highlight the need to develop stealthy authorship obfuscation methods that can better protect the identity of an author seeking anonymity.
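
The abstract names two technical ingredients: text smoothness as judged by a language model, and the F1 score used to report detection quality. The sketch below illustrates both in a deliberately simplified form and is not the authors' detector. It assumes that per-token log-probabilities have already been obtained from some language model; the function names, the three summary features, and the threshold value of -4.0 are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def smoothness_features(token_log_probs):
    """Summarize per-token log-probabilities into simple smoothness features.

    Assumption: token_log_probs is a non-empty list of log-probabilities that
    some language model assigned to each token of one text.
    """
    return {
        "mean": mean(token_log_probs),   # lower mean = less fluent on average
        "std": pstdev(token_log_probs),  # spread of token-level fluency
        "min": min(token_log_probs),     # the single most surprising token
    }


def looks_obfuscated(token_log_probs, mean_threshold=-4.0):
    """Flag a text whose average log-probability falls below a threshold.

    The threshold is a hypothetical value; a real detector would learn a
    decision rule from labeled data rather than hard-code one.
    """
    return smoothness_features(token_log_probs)["mean"] < mean_threshold


def f1_score(y_true, y_pred):
    """F1 for the positive ("obfuscated") class from boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

To evaluate such a rule, one would call `looks_obfuscated` on each text's log-probabilities and pass the resulting predictions, together with the true labels, to `f1_score`.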
