A Benchmark for Systematic Generalization in Grounded Language Understanding

Paper Title

A Benchmark for Systematic Generalization in Grounded Language Understanding

Authors

Laura Ruis, Jacob Andreas, Marco Baroni, Diane Bouchacourt, Brenden M. Lake

Abstract

Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts ("greet the pink brontosaurus by the ferris wheel"). Modern neural networks, by contrast, struggle to interpret novel compositions. In this paper, we introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding. Going beyond a related benchmark that focused on syntactic aspects of generalization, gSCAN defines a language grounded in the states of a grid world, facilitating novel evaluations of acquiring linguistically motivated rules. For example, agents must understand how adjectives such as 'small' are interpreted relative to the current world state or how adverbs such as 'cautiously' combine with new verbs. We test a strong multi-modal baseline model and a state-of-the-art compositional method finding that, in most cases, they fail dramatically when generalization requires systematic compositional rules.
