Paper Title
CIKQA: Learning Commonsense Inference with a Unified Knowledge-in-the-loop QA Paradigm
Paper Authors
Paper Abstract
Recently, the community has achieved substantial progress on many commonsense reasoning benchmarks. However, it is still unclear what is learned from the training process: the knowledge, the inference capability, or both? We argue that due to the large scale of commonsense knowledge, it is infeasible to annotate a training set large enough for each task to cover all the commonsense needed for learning. Thus, we should separate commonsense knowledge acquisition and inference over commonsense knowledge into two distinct tasks. In this work, we focus on investigating models' commonsense inference capabilities from two perspectives: (1) whether models can tell if the knowledge they have is sufficient to solve the task; (2) whether models can develop commonsense inference capabilities that generalize across commonsense tasks. We first align commonsense tasks with relevant knowledge from commonsense knowledge bases and ask humans to annotate whether that knowledge is sufficient. Then, we convert different commonsense tasks into a unified question answering format to evaluate models' generalization capabilities. We name the benchmark Commonsense Inference with Knowledge-in-the-loop Question Answering (CIKQA).
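To make the "unified QA format with knowledge in the loop" concrete, below is a minimal sketch of what a single benchmark instance might look like. This is an illustrative assumption, not the authors' released data schema: the class name `CIKQAExample`, its field names, and the example values are hypothetical, chosen only to reflect the components the abstract describes (a task recast as a question, aligned knowledge from a commonsense knowledge base, and a human judgment of knowledge sufficiency).

```python
# Hypothetical sketch of a knowledge-in-the-loop QA instance (not the official CIKQA schema).
from dataclasses import dataclass
from typing import List


@dataclass
class KnowledgeTriple:
    head: str       # e.g., "rain"
    relation: str   # e.g., "causes"
    tail: str       # e.g., "open umbrella"


@dataclass
class CIKQAExample:
    question: str                     # the original task instance rewritten as a question
    candidates: List[str]             # answer options shared across tasks
    answer: str                       # gold answer
    knowledge: List[KnowledgeTriple]  # triples aligned from a commonsense knowledge base
    knowledge_sufficient: bool        # human annotation: is this knowledge enough to answer?
    source_task: str                  # the benchmark the instance originally came from

# Illustrative instance (values invented for demonstration only).
example = CIKQAExample(
    question="Why did the person open the umbrella?",
    candidates=["It started to rain.", "The sun set."],
    answer="It started to rain.",
    knowledge=[KnowledgeTriple("rain", "causes", "open umbrella")],
    knowledge_sufficient=True,
    source_task="COPA",
)
```

Representing every task in one schema like this is what lets a single QA model be trained and evaluated across tasks, and the `knowledge_sufficient` flag is what separates "the model lacks knowledge" from "the model cannot infer over the knowledge it has."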