Paper Title

Forming Trees with Treeformers

Paper Authors

Nilay Patel, Jeffrey Flanigan

Paper Abstract

Human language is known to exhibit a nested, hierarchical structure, allowing us to form complex sentences out of smaller pieces. However, many state-of-the-art neural network models such as Transformers have no explicit hierarchical structure in their architecture -- that is, they have no inductive bias toward hierarchical structure. Additionally, Transformers are known to perform poorly on compositional generalization tasks, which require such structures. In this paper, we introduce Treeformer, a general-purpose encoder module inspired by the CKY algorithm which learns a composition operator and pooling function to construct hierarchical encodings for phrases and sentences. Our extensive experiments demonstrate the benefits of incorporating hierarchical structure into the Transformer and show significant improvements in compositional generalization as well as in downstream tasks such as machine translation, abstractive summarization, and various natural language understanding tasks.
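To make the CKY-inspired idea in the abstract concrete, below is a minimal sketch of a chart-style encoder that builds span encodings bottom-up: every span is composed from its two sub-spans at each possible split point, and the candidates are pooled into one vector. The class name `TreeformerSketch`, the MLP composition, the max-pooling choice, and all dimensions are illustrative assumptions, not the paper's exact architecture.

```python
# A CKY-style chart encoder sketch (illustrative, not the paper's implementation).
import torch
import torch.nn as nn


class TreeformerSketch(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        # Learned composition operator: maps a (left span, right span) pair
        # to an encoding of the covering span (assumed to be a small MLP).
        self.compose = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (n, dim) token encodings for one sentence.
        n, _ = tokens.shape
        # chart[i][j] holds the encoding of the span tokens[i : j + 1].
        chart = [[None] * n for _ in range(n)]
        for i in range(n):
            chart[i][i] = tokens[i]
        # Fill the chart bottom-up over increasing span lengths, as in CKY.
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length - 1
                # Compose every binary split of the span ...
                candidates = [
                    self.compose(torch.cat([chart[i][k], chart[k + 1][j]], dim=-1))
                    for k in range(i, j)
                ]
                # ... and pool over split points (element-wise max is an assumption).
                chart[i][j] = torch.stack(candidates, dim=0).max(dim=0).values
        # The top cell is the hierarchical encoding of the whole sentence.
        return chart[0][n - 1]


if __name__ == "__main__":
    encoder = TreeformerSketch(dim=64)
    sentence = torch.randn(5, 64)  # five token vectors
    print(encoder(sentence).shape)  # torch.Size([64])
```

In practice, phrase encodings from intermediate chart cells (not only the top cell) could feed downstream layers, which is how a module like this would expose hierarchical structure to a Transformer encoder.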
