Paper Title
AUBER: Automated BERT Regularization
Authors
Abstract
How can we effectively regularize BERT? Although BERT proves its effectiveness in various downstream natural language processing tasks, it often overfits when there are only a small number of training instances. A promising direction for regularizing BERT is to prune its attention heads based on a proxy score for head importance. However, heuristic-based methods are usually suboptimal since they predetermine the order in which attention heads are pruned. To overcome this limitation, we propose AUBER, an effective regularization method that leverages reinforcement learning to automatically prune attention heads from BERT. Instead of depending on heuristics or rule-based policies, AUBER learns a pruning policy that determines which attention heads should or should not be pruned for regularization. Experimental results show that AUBER outperforms existing pruning methods, achieving up to 10% better accuracy. In addition, our ablation study empirically demonstrates the effectiveness of our design choices for AUBER.
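To make the pruning operation concrete, below is a minimal sketch of attention-head pruning in BERT using the HuggingFace transformers library. It is not the authors' AUBER implementation and does not include the reinforcement-learning policy; the layer and head indices in heads_to_prune are hypothetical placeholders standing in for whatever a learned (or heuristic) policy would select.

# Minimal sketch: removing attention heads from BERT as a regularizer,
# assuming the HuggingFace `transformers` library is available.
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

# Hypothetical output of a pruning policy: for each encoder layer index,
# the list of attention heads to remove before fine-tuning on a small dataset.
heads_to_prune = {0: [2, 7], 5: [0], 11: [3, 9]}

# Physically remove the selected heads; the remaining heads keep their weights,
# so the model's capacity shrinks, which acts as a form of regularization.
model.prune_heads(heads_to_prune)

# Each pruned layer now has fewer attention heads than the original 12.
print(model.bert.encoder.layer[0].attention.self.num_attention_heads)  # 10

The fine-tuning step after pruning is unchanged; the only difference from standard BERT fine-tuning is that the pruned heads no longer exist in the model.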