Paper title
Free the Plural: Unrestricted Split-Antecedent Anaphora Resolution
Paper authors
Paper abstract
Now that the performance of coreference resolvers on the simpler forms of anaphoric reference has greatly improved, more attention is being devoted to more complex aspects of anaphora. One limitation of virtually all coreference resolution models is their focus on single-antecedent anaphors. Plural anaphors with multiple antecedents, so-called split-antecedent anaphors (as in "John met Mary. They went to the movies"), have not been widely studied, because they are not annotated in ONTONOTES and are relatively infrequent in other corpora. In this paper, we introduce the first model for unrestricted resolution of split-antecedent anaphors. We start with a strong baseline enhanced by BERT embeddings, and show that we can substantially improve its performance by addressing the sparsity issue. To do this, we experiment with auxiliary corpora where split-antecedent anaphors were annotated by the crowd, and with transfer learning models using element-of bridging references and single-antecedent coreference as auxiliary tasks. Evaluation on the gold-annotated ARRAU corpus shows that our best model, which uses a combination of three auxiliary corpora, achieves F1 scores of 70% and 43.6% when evaluated in a lenient and a strict setting, respectively, i.e., gains of 11 and 21 percentage points over our baseline.