Title
MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation
Authors
Abstract
In most E-commerce platforms, whether a displayed item triggers the user's interest largely depends on its most eye-catching multimodal content. Consequently, increasing effort has focused on modeling multimodal user preference, and the prevailing paradigm is to incorporate the items' complete multimodal deep features into the recommendation module. However, existing studies ignore the mismatch problem between multimodal feature extraction (MFE) and user interest modeling (UIM); that is, MFE and UIM have different emphases. Specifically, MFE is transferred from and adapted to upstream tasks such as image classification, and it is mainly a content-oriented, non-personalized process, whereas UIM, which focuses on understanding user interactions, is essentially a user-oriented, personalized process. Therefore, directly incorporating MFE into UIM for purely user-oriented tasks tends to introduce a large amount of preference-independent multimodal noise and to contaminate the embedding representations in UIM. This paper aims to solve the mismatch problem between MFE and UIM, so as to generate high-quality embedding representations and better model multimodal user preferences. To this end, we develop a novel model, MEGCF. The UIM of the proposed model captures the semantic correlation between interactions and the features obtained from MFE, thus achieving a better match between MFE and UIM. More precisely, semantic-rich entities are first extracted from the multimodal data, since they are more relevant to user preferences than other multimodal information. These entities are then integrated into the user-item interaction graph. Afterwards, a symmetric linear Graph Convolution Network (GCN) module is constructed to perform message propagation over the graph, in order to capture both high-order semantic correlation and collaborative filtering signals.
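To make the final step concrete, the following is a minimal, illustrative sketch of linear (non-parametric) message propagation over a joint user-item-entity graph, in the spirit of the symmetric linear GCN the abstract describes. It is not the authors' implementation: the graph sizes, the toy edges, the number of layers, and the layer-wise mean readout are all assumptions chosen for illustration.

```python
import numpy as np

# Toy sizes (assumptions for illustration only).
rng = np.random.default_rng(0)
n_users, n_items, n_entities, dim = 4, 5, 3, 8
n_nodes = n_users + n_items + n_entities

# Adjacency: user-item interaction edges plus item-entity semantic edges,
# with entities extracted from the items' multimodal content.
A = np.zeros((n_nodes, n_nodes))
interactions = [(0, 0), (0, 2), (1, 1), (2, 3), (3, 4)]  # (user, item) pairs
item_entity = [(0, 0), (1, 1), (2, 1), (3, 2), (4, 0)]   # (item, entity) pairs
for u, i in interactions:
    A[u, n_users + i] = A[n_users + i, u] = 1.0
for i, e in item_entity:
    A[n_users + i, n_users + n_items + e] = 1.0
    A[n_users + n_items + e, n_users + i] = 1.0

# Symmetric normalization: D^{-1/2} A D^{-1/2}.
deg = A.sum(axis=1)
d_inv_sqrt = np.zeros_like(deg)
nz = deg > 0
d_inv_sqrt[nz] = deg[nz] ** -0.5
A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# Linear propagation: no feature transform, no nonlinearity between layers.
E = rng.standard_normal((n_nodes, dim))  # initial node embeddings
layers = [E]
for _ in range(3):                       # 3 propagation layers (assumed)
    layers.append(A_hat @ layers[-1])
final = np.mean(layers, axis=0)          # combine layers by averaging

# Preference score for (user 0, item 0): inner product of final embeddings.
score = float(final[0] @ final[n_users + 0])
print(final.shape, score)
```

Because propagation is linear, each layer mixes in one more hop of neighborhood: layer 1 reaches directly interacted items, layer 2 reaches the entities of those items and co-interacting users, and so on, which is how the module can carry both collaborative filtering signals and higher-order semantic correlation through the shared graph.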