Paper Title

A Neuro-Symbolic ASP Pipeline for Visual Question Answering

Authors

Thomas Eiter, Nelson Higuera, Johannes Oetsch, Michael Pritz

Abstract

We present a neuro-symbolic visual question answering (VQA) pipeline for CLEVR, a well-known dataset consisting of pictures of scenes with objects and questions related to them. Our pipeline covers (i) training neural networks for object classification and bounding-box prediction on the CLEVR scenes, (ii) statistical analysis of the distribution of the networks' prediction values to determine a threshold for high-confidence predictions, and (iii) a translation of CLEVR questions and of the network predictions that pass the confidence threshold into logic programs, so that the answers can be computed with an ASP solver. By exploiting choice rules, we consider deterministic and non-deterministic scene encodings. Our experiments show that, in comparison with the deterministic approach, the non-deterministic scene encoding achieves good results even if the neural networks are trained rather poorly. This is important for building robust VQA systems when network predictions are less than perfect. Furthermore, we show that restricting non-determinism to reasonable choices allows for more efficient implementations in comparison with related neuro-symbolic approaches without losing much accuracy. This work is under consideration for acceptance in TPLP.
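The difference between the deterministic and the non-deterministic scene encoding can be illustrated with a minimal Python sketch. It is not the authors' encoding: the predicate name `obj`, the function `encode_object`, the class labels, and the threshold value are all assumptions made for illustration. The sketch only shows the general idea that a single top prediction becomes an ASP fact, while several predictions passing a confidence threshold become a choice rule from which the solver picks exactly one.

```python
from typing import Dict, List


def encode_object(
    obj_id: int,
    class_scores: Dict[str, float],
    threshold: float,
    deterministic: bool,
) -> List[str]:
    """Return ASP rules (as strings) describing one detected object.

    Hypothetical helper: the predicate `obj/2` and the label names are
    illustrative assumptions, not taken from the paper.
    """
    # The single most confident class, used by the deterministic encoding
    # and as a fallback when nothing passes the threshold.
    best = max(class_scores, key=class_scores.get)
    if deterministic:
        return [f"obj({obj_id},{best})."]

    # Non-deterministic encoding: keep every class whose score passes
    # the confidence threshold (sorted for reproducible output).
    candidates = sorted(c for c, s in class_scores.items() if s >= threshold)
    if not candidates:
        candidates = [best]
    if len(candidates) == 1:
        return [f"obj({obj_id},{candidates[0]})."]

    # Choice rule: the solver must select exactly one candidate class.
    body = "; ".join(f"obj({obj_id},{c})" for c in candidates)
    return [f"1 {{ {body} }} 1."]


if __name__ == "__main__":
    scores = {"red_cube": 0.55, "red_sphere": 0.40, "blue_cube": 0.05}
    print(encode_object(0, scores, threshold=0.3, deterministic=True))
    print(encode_object(0, scores, threshold=0.3, deterministic=False))
```

With the example scores, the deterministic call yields one fact for `red_cube`, whereas the non-deterministic call yields a choice rule over `red_cube` and `red_sphere`. Raising the threshold shrinks the set of candidates, which is one way to read the abstract's remark about restricting non-determinism to reasonable choices.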
