Paper Title
Robust Lottery Tickets for Pre-trained Language Models
Paper Authors
Paper Abstract
Recent work on the Lottery Ticket Hypothesis has shown that pre-trained language models (PLMs) contain smaller matching subnetworks (winning tickets) capable of reaching accuracy comparable to the original models. However, these tickets have been shown to be not robust to adversarial examples, performing even worse than their PLM counterparts. To address this problem, we propose a novel method based on learning binary weight masks to identify robust tickets hidden in the original PLMs. Since the loss is not differentiable with respect to the binary masks, we assign a hard concrete distribution to the masks and encourage their sparsity using a smoothed approximation of L0 regularization. Furthermore, we design an adversarial loss objective to guide the search for robust tickets and ensure that the tickets perform well in both accuracy and robustness. Experimental results show that the proposed method significantly improves over previous work on adversarial robustness evaluation.
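To make the mask-learning step concrete, below is a minimal PyTorch sketch of a hard concrete gate with the smoothed L0 penalty, following the relaxation of Louizos et al. (2018) that the abstract refers to. The class name `HardConcreteMask`, the hyperparameter values, and the lambda weight are illustrative assumptions, not the authors' released code.

```python
import math

import torch
import torch.nn as nn


class HardConcreteMask(nn.Module):
    """Differentiable surrogate for a binary weight mask.

    Samples gates from the hard concrete distribution so the masked
    loss becomes differentiable, and exposes the smoothed L0 penalty
    that encourages mask sparsity.
    """

    def __init__(self, shape, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
        super().__init__()
        # Location parameters of the underlying binary concrete distribution.
        self.log_alpha = nn.Parameter(torch.zeros(shape))
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self):
        if self.training:
            # Reparameterized sample: uniform noise -> binary concrete.
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (u.log() - (1 - u).log() + self.log_alpha) / self.beta
            )
        else:
            # Deterministic gate at evaluation time.
            s = torch.sigmoid(self.log_alpha)
        # Stretch to (gamma, zeta), then clip to [0, 1] ("hard" concrete).
        return (s * (self.zeta - self.gamma) + self.gamma).clamp(0.0, 1.0)

    def l0_penalty(self):
        # Expected number of non-zero gates: a differentiable L0 surrogate.
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        ).sum()


# Usage sketch: gate a weight matrix and add the sparsity penalty to the loss.
mask = HardConcreteMask(shape=(768, 768))
weight = torch.randn(768, 768)
masked_weight = mask() * weight
sparsity_loss = 1e-4 * mask.l0_penalty()  # lambda is an assumed hyperparameter
```

In training, each gate value would multiply a PLM weight (or a group of weights), and the overall objective would plausibly combine the task loss on clean and adversarially perturbed inputs with this L0 penalty; the abstract does not specify the exact form of the adversarial loss.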