Paper Title
Image Aesthetics Assessment Using Graph Attention Network
Authors
Abstract
Aspect ratio and spatial layout are two of the principal factors determining the aesthetic value of a photograph. However, incorporating them into traditional convolution-based frameworks for image aesthetics assessment is problematic. The aspect ratio of a photograph is distorted when it is resized or cropped to a fixed dimension to facilitate batch sampling during training. In addition, convolutional filters process information locally and are limited in their ability to model the global spatial layout of a photograph. In this work, we present a two-stage framework based on graph neural networks that addresses both problems jointly. First, we propose a feature-graph representation in which the input image is modelled as a graph, maintaining its original aspect ratio and resolution. Second, we propose a graph neural network architecture that takes this feature-graph and captures the semantic relationships between the different regions of the input image using visual attention. Our experiments show that the proposed framework advances the state of the art in aesthetic score regression on the Aesthetic Visual Analysis (AVA) benchmark.
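The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the backbone, the graph connectivity, or the attention formulation. The sketch assumes a fully connected graph with one node per spatial location of a feature map, and it substitutes plain scaled dot-product self-attention with no learned weights. The function names `feature_map_to_graph` and `attention_pool` are hypothetical.

```python
import numpy as np


def feature_map_to_graph(fmap):
    """Turn an (H, W, C) feature map into N = H * W node features.

    No resizing or cropping is applied, so the number of nodes follows
    the input's own resolution and aspect ratio. The (row, col) position
    of each node is returned alongside its features.
    """
    h, w, c = fmap.shape
    nodes = fmap.reshape(h * w, c)  # row-major flattening
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1)
    return nodes, coords


def attention_pool(nodes):
    """One round of attention over a fully connected graph, then mean pooling.

    Assumption: scaled dot-product attention without learned parameters
    stands in for whatever attention mechanism the paper uses.
    """
    d = nodes.shape[1]
    scores = nodes @ nodes.T / np.sqrt(d)          # (N, N) pairwise scores
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    updated = weights @ nodes                      # each node attends to all nodes
    return updated.mean(axis=0), weights           # fixed-size graph embedding


# Hypothetical feature maps for a wide and a tall photograph.
rng = np.random.default_rng(0)
wide = rng.standard_normal((3, 7, 16))
tall = rng.standard_normal((9, 4, 16))
for fmap in (wide, tall):
    nodes, coords = feature_map_to_graph(fmap)
    embedding, weights = attention_pool(nodes)
    print(nodes.shape, embedding.shape)
```

Both inputs yield a 16-dimensional embedding even though their graphs have different numbers of nodes (21 and 36), which is why a graph representation can in principle avoid fixed-size resizing. A regression head mapping this embedding to an aesthetic score would follow in a full model.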