A Matter of Framing: The Impact of Linguistic Formalism on Probing Results

Title

A Matter of Framing: The Impact of Linguistic Formalism on Probing Results

Authors

Ilia Kuznetsov, Iryna Gurevych

Abstract

Deep pre-trained contextualized encoders like BERT (Devlin et al., 2019) demonstrate remarkable performance on a range of downstream tasks. A recent line of research in probing investigates the linguistic knowledge implicitly learned by these models during pre-training. While most work in probing operates on the task level, linguistic tasks are rarely uniform and can be represented in a variety of formalisms. Any linguistics-based probing study thereby inevitably commits to the formalism used to annotate the underlying data. Can the choice of formalism affect probing results? To investigate, we conduct an in-depth cross-formalism layer probing study in role semantics. We find linguistically meaningful differences in the encoding of semantic role and proto-role information by BERT depending on the formalism, and demonstrate that layer probing can detect subtle differences between the implementations of the same linguistic formalism. Our results suggest that linguistic formalism is an important dimension in probing studies, along with the commonly used cross-task and cross-lingual experimental settings.
