Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

Paper Title

Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

Authors

Kantharaj, Shankar; Leong, Rixie Tiffany Ko; Lin, Xiang; Masry, Ahmed; Thakkar, Megh; Hoque, Enamul; Joty, Shafiq

Abstract

Charts are commonly used for exploring data and communicating insights. Generating natural language summaries from charts can be very helpful for people in inferring key insights that would otherwise require a lot of cognitive and perceptual efforts. We present Chart-to-text, a large-scale benchmark with two datasets and a total of 44,096 charts covering a wide range of topics and chart types. We explain the dataset construction process and analyze the datasets. We also introduce a number of state-of-the-art neural models as baselines that utilize image captioning and data-to-text generation techniques to tackle two problem variations: one assumes the underlying data table of the chart is available while the other needs to extract data from chart images. Our analysis with automatic and human evaluation shows that while our best models usually generate fluent summaries and yield reasonable BLEU scores, they also suffer from hallucinations and factual errors as well as difficulties in correctly explaining complex patterns and trends in charts.
