Paper Title

CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems

Paper Authors

Yiran Chen, Pengfei Liu, Ming Zhong, Zi-Yi Dou, Danqing Wang, Xipeng Qiu, Xuanjing Huang

Paper Abstract

Neural network-based models augmented with unsupervised pre-trained knowledge have achieved impressive performance on text summarization. However, most existing evaluation methods are limited to an in-domain setting, where summarizers are trained and evaluated on the same dataset. We argue that this approach can narrow our understanding of the generalization ability for different summarization systems. In this paper, we perform an in-depth analysis of characteristics of different datasets and investigate the performance of different summarization models under a cross-dataset setting, in which a summarizer trained on one corpus will be evaluated on a range of out-of-domain corpora. A comprehensive study of 11 representative summarization systems on 5 datasets from different domains reveals the effect of model architectures and generation ways (i.e. abstractive and extractive) on model generalization ability. Further, experimental results shed light on the limitations of existing summarizers. Brief introduction and supplementary code can be found in https://github.com/zide05/CDEvalSumm.
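To make the cross-dataset setting concrete, the sketch below shows one way such a train-on-one, evaluate-on-all grid could be computed. It is illustrative only and not the authors' released code: the corpus names and the `load_summarizer` / `load_test_split` helpers are hypothetical placeholders, and ROUGE is computed with the `rouge-score` package.

```python
# Illustrative sketch of a cross-dataset evaluation grid (not the authors' code).
# Assumptions: `load_summarizer(name)` returns a summarizer trained on that corpus
# (a callable mapping a source document to a summary string), and
# `load_test_split(name)` returns parallel lists of documents and reference summaries.
from rouge_score import rouge_scorer  # pip install rouge-score

# Placeholder corpus names; the paper evaluates on 5 datasets from different domains.
DATASETS = ["dataset_a", "dataset_b", "dataset_c", "dataset_d", "dataset_e"]

def cross_dataset_rouge(load_summarizer, load_test_split, datasets=DATASETS):
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    results = {}  # (train_corpus, test_corpus) -> mean ROUGE-1 F1
    for train_corpus in datasets:
        summarize = load_summarizer(train_corpus)   # summarizer trained on one corpus ...
        for test_corpus in datasets:                # ... evaluated on every corpus
            docs, refs = load_test_split(test_corpus)
            f1 = [scorer.score(ref, summarize(doc))["rouge1"].fmeasure
                  for doc, ref in zip(docs, refs)]
            results[(train_corpus, test_corpus)] = sum(f1) / len(f1)
    return results
```

Off-diagonal entries of `results` (train corpus different from test corpus) correspond to the out-of-domain scores that the cross-dataset analysis aggregates.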
