Paper Title
Structure-Aware Generation Network for Recipe Generation from Images
Authors
Abstract
Sharing food has become very popular with the development of social media. For many real-world applications, people want to know the underlying recipe of a food item. In this paper, we study the open research task of automatically generating cooking instructions from only a food image and its ingredients, a task similar to image captioning. Compared with image captioning datasets, however, the target recipes are long paragraphs and carry no annotations of structure information. To address these limitations, we propose a novel framework, the Structure-aware Generation Network (SGN), for food recipe generation. Our approach brings together several novel ideas in a systematic framework: (1) exploiting an unsupervised learning approach to obtain sentence-level tree structure labels before training; (2) generating trees of target recipes from images under the supervision of the tree structure labels learned in (1); and (3) integrating the inferred tree structures into the recipe generation procedure. Our proposed model produces high-quality, coherent recipes and achieves state-of-the-art performance on the benchmark Recipe1M dataset.