Paper Title
Recipe2Vec: Multi-modal Recipe Representation Learning with Graph Neural Networks
Paper Authors
Paper Abstract
Learning effective recipe representations is essential in food studies. Unlike image-based recipe retrieval or structured text embeddings, which have been studied extensively, the combined effect of multi-modal information (i.e., recipe images, text, and relational data) has received less attention. In this paper, we formalize the problem of multi-modal recipe representation learning, which integrates visual, textual, and relational information into recipe embeddings. In particular, we first present Large-RG, a new recipe graph dataset with over half a million nodes, making it the largest recipe graph to date. We then propose Recipe2Vec, a novel graph neural network-based recipe embedding model that captures multi-modal information. Additionally, we introduce an adversarial attack strategy to ensure stable learning and improve performance. Finally, we design a joint objective function combining node classification and adversarial learning to optimize the model. Extensive experiments demonstrate that Recipe2Vec outperforms state-of-the-art baselines on two classic food study tasks, i.e., cuisine category classification and region prediction. The dataset and code are available at https://github.com/meettyj/Recipe2Vec.
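To make the joint objective described above more concrete, below is a minimal, hypothetical sketch of how a node-classification loss can be combined with an adversarial term obtained by perturbing fused multi-modal recipe features. It is not the authors' implementation (Recipe2Vec uses a graph neural network over the recipe graph); the class `RecipeEncoder` and the parameters `fuse_dim`, `num_cuisines`, `epsilon`, and `alpha` are illustrative assumptions.

```python
# Hypothetical sketch of a joint objective: node-classification loss plus an
# FGSM-style adversarial term on perturbed multi-modal recipe features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecipeEncoder(nn.Module):
    """Toy stand-in for a GNN encoder: maps fused visual/textual/relational
    recipe features to cuisine-class logits."""
    def __init__(self, fuse_dim: int = 128, hidden: int = 64, num_cuisines: int = 10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(fuse_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, num_cuisines)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.encoder(x))

def joint_loss(model: RecipeEncoder, x: torch.Tensor, y: torch.Tensor,
               epsilon: float = 0.01, alpha: float = 0.5) -> torch.Tensor:
    """Classification loss plus an adversarial loss on gradient-sign-perturbed inputs."""
    x = x.clone().requires_grad_(True)
    clean_loss = F.cross_entropy(model(x), y)
    # Gradient of the clean loss w.r.t. the input features (graph kept for backward).
    grad, = torch.autograd.grad(clean_loss, x, retain_graph=True)
    x_adv = (x + epsilon * grad.sign()).detach()      # adversarially perturbed features
    adv_loss = F.cross_entropy(model(x_adv), y)
    return clean_loss + alpha * adv_loss

# Usage on random toy data (assumed shapes for illustration only).
model = RecipeEncoder()
x = torch.randn(32, 128)                # fused multi-modal recipe features
y = torch.randint(0, 10, (32,))         # cuisine-category labels
loss = joint_loss(model, x, y)
loss.backward()
```

The weighting factor `alpha` balances the two terms; in practice the adversarial component is tuned so that it stabilizes training rather than dominating the classification objective.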