Paper title
Multilingual Normalization of Temporal Expressions with Masked Language Models
Paper authors
Paper abstract
The detection and normalization of temporal expressions is an important task and preprocessing step for many applications. However, prior work on normalization is rule-based, which severely limits its applicability in real-world multilingual settings because creating new rules is costly. We propose a novel neural method for normalizing temporal expressions based on masked language modeling. Our multilingual method outperforms prior rule-based systems in many languages, particularly low-resource languages, where it improves performance by up to 33 F1 on average compared to the state of the art.