Pre-training Graph Transformer with Multimodal Side Information for Recommendation

Paper Title

Pre-training Graph Transformer with Multimodal Side Information for Recommendation

Authors

Yong Liu, Susen Yang, Chenyi Lei, Guoxin Wang, Haihong Tang, Juyong Zhang, Aixin Sun, Chunyan Miao

Abstract

Side information of items, e.g., images and text descriptions, has been shown to contribute to accurate recommendations. Inspired by the recent success of pre-training models on natural language and images, we propose a pre-training strategy to learn item representations by considering both item side information and item relationships. We relate items by common user activities, e.g., co-purchase, and construct a homogeneous item graph. This graph provides a unified view of item relations and their associated multimodal side information. We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction. Experimental results on real datasets demonstrate that the proposed PMGT model effectively exploits the multimodal side information to achieve better accuracy in downstream tasks including item recommendation, item classification, and click-through rate prediction. We also report a case study of testing the proposed PMGT model in an online setting with 600 thousand users.
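
To make the graph-construction step in the abstract concrete, the following is a minimal sketch of how a homogeneous item graph might be built from co-purchase data. It is not the authors' implementation: the function name `build_item_graph`, the `min_count` threshold, the basket input format, and the use of co-purchase counts as edge weights are all assumptions made for illustration. MCNSampling and the two pre-training objectives are not sketched, since the abstract does not specify them.

```python
from collections import defaultdict
from itertools import combinations


def build_item_graph(baskets, min_count=1):
    """Build an undirected item graph from co-purchase baskets.

    Hypothetical helper: returns {item: {neighbor: weight}}, where weight is
    the number of baskets in which the two items appear together. Pairs seen
    fewer than `min_count` times are dropped (an assumed filtering rule).
    """
    counts = defaultdict(int)
    for basket in baskets:
        # Every unordered pair of distinct items in a basket is one co-purchase.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1

    graph = defaultdict(dict)
    for (a, b), weight in counts.items():
        if weight >= min_count:
            # Store the edge in both directions so the graph is undirected.
            graph[a][b] = weight
            graph[b][a] = weight
    return dict(graph)


# Toy data, invented purely for illustration.
baskets = [["shirt", "jeans"], ["shirt", "jeans", "belt"], ["belt", "shoes"]]
graph = build_item_graph(baskets, min_count=1)
```

Every node in this graph is an item, which is what makes it homogeneous. In a setting like the one the abstract describes, each node would additionally carry its image and text features as side information.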
