Paper Title

Evaluating for Diversity in Question Generation over Text

Paper Authors

Michael Sejr Schlichtkrull, Weiwei Cheng

Paper Abstract

Generating diverse and relevant questions over text is a task with widespread applications. We argue that commonly-used evaluation metrics such as BLEU and METEOR are not suitable for this task due to the inherent diversity of reference questions, and propose a scheme for extending conventional metrics to reflect diversity. We furthermore propose a variational encoder-decoder model for this task. We show through automatic and human evaluation that our variational model improves diversity without loss of quality, and demonstrate how our evaluation scheme reflects this improvement.
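The abstract's core point is that single-best-match metrics such as BLEU or METEOR under-reward a system that covers several distinct reference questions for the same passage. The snippet below is a minimal, hypothetical sketch of one way to extend a conventional reference-based metric toward diversity: each generated question is greedily matched to a still-uncovered reference and scored with sentence-level BLEU (via NLTK). This is not the paper's actual evaluation scheme; the function name, the greedy matching, and the averaging are illustrative assumptions only.

```python
# Hypothetical diversity-aware aggregation of a conventional metric (BLEU).
# NOT the scheme proposed in the paper; for illustration only.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def diversity_aware_bleu(generated, references):
    """Greedily match each generated question to a distinct reference
    question and average the per-pair sentence BLEU scores."""
    smooth = SmoothingFunction().method1  # avoid zero scores on short sentences
    remaining = [ref.split() for ref in references]
    total = 0.0
    for hyp in generated:
        if not remaining:
            break
        hyp_tokens = hyp.split()
        # Score this hypothesis against every reference not yet covered.
        scores = [sentence_bleu([ref], hyp_tokens, smoothing_function=smooth)
                  for ref in remaining]
        best = max(range(len(scores)), key=scores.__getitem__)
        total += scores[best]
        remaining.pop(best)  # each reference can be "covered" only once
    return total / max(len(generated), 1)

# Toy usage: two generated questions covering two of three references
# score higher than two near-duplicates matching the same reference.
generated = ["when was the bridge built ?", "who designed the bridge ?"]
references = ["when was the bridge constructed ?",
              "who was the architect of the bridge ?",
              "how long is the bridge ?"]
print(diversity_aware_bleu(generated, references))
```

The design choice being illustrated: by consuming each reference after it is matched, the aggregate score rewards a set of generations that spreads across the reference set, which is the behaviour a plain multi-reference BLEU does not capture.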
