New Recommendation Algorithm for Implicit Data Motivated by the Multivariate Normal Distribution

Paper title

New Recommendation Algorithm for Implicit Data Motivated by the Multivariate Normal Distribution

Authors

Markus Viljanen, Tapio Pahikkala

Abstract

The goal of recommender systems is to help users find useful items in a large catalog by producing a list of item recommendations for every user. Data sets based on implicit data collection have a number of special characteristics. The user-item interaction matrix is often complete, i.e. every user and item pair has either an interaction value or a zero for no interaction, and the goal is to rank the items for every user. This study presents a simple new algorithm for implicit data that matches or outperforms baselines in accuracy. The algorithm can be motivated intuitively by the Multivariate Normal Distribution (MVN), where we have a closed-form expression for the ranking of non-interactions given a user's interactions. The main difference from the kNN and SVD baselines is that predictions are carried out using only the known interactions. Modified baselines that use this trick are more accurate, and the trick also results in simpler models with fewer hyperparameters for implicit data. Our results suggest that this idea should be used in Top-N recommendation with small seed sizes, and that the MVN is a simple way to do so.
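The closed-form ranking mentioned in the abstract can be illustrated with the textbook conditional mean of a multivariate normal, E[x_B | x_A] = mu_B + Sigma_BA Sigma_AA^{-1} (x_A - mu_A), where A is the set of items the user has interacted with and B is the rest. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `mvn_rank`, the `ridge` regularization term, and the use of the sample mean and covariance of a binary interaction matrix are all choices made here for the example.

```python
import numpy as np


def mvn_rank(X, seed_items, ridge=1.0):
    """Rank a user's unseen items by the MVN conditional mean.

    X          : (n_users, n_items) binary interaction matrix (assumed input).
    seed_items : indices of items the target user has interacted with.
    ridge      : hypothetical regularizer added to the diagonal of Sigma_AA
                 so the linear system stays well conditioned.
    """
    X = np.asarray(X, dtype=float)
    n_items = X.shape[1]

    # Item means and item-item covariance estimated from all users.
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)

    seen = np.asarray(seed_items, dtype=int)
    unseen = np.setdiff1d(np.arange(n_items), seen)

    # Partition the covariance into seen (A) and unseen (B) blocks.
    sigma_aa = sigma[np.ix_(seen, seen)] + ridge * np.eye(len(seen))
    sigma_ba = sigma[np.ix_(unseen, seen)]

    # Only the known interactions (value 1) enter the prediction.
    resid = 1.0 - mu[seen]
    scores = mu[unseen] + sigma_ba @ np.linalg.solve(sigma_aa, resid)

    # Highest conditional mean first.
    order = np.argsort(-scores)
    return unseen[order], scores[order]


if __name__ == "__main__":
    # Toy data: items 0 and 1 co-occur, items 2 and 3 co-occur.
    toy = np.array([
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ])
    ranked, scores = mvn_rank(toy, seed_items=[0])
    print(ranked, scores)
```

In the toy data, a user whose only known interaction is item 0 gets item 1 ranked first, because item 1 is the only unseen item that covaries positively with item 0. Note that the prediction conditions only on the seed items, which mirrors the "known interactions only" idea described in the abstract.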

Paper title