Paper Title
Revisiting Alternative Experimental Settings for Evaluating Top-N Item Recommendation Algorithms
Paper Authors
Paper Abstract
Top-N item recommendation is a widely studied task based on implicit feedback. Although much progress has been made with neural methods, there is increasing concern about the appropriate evaluation of recommendation algorithms. In this paper, we revisit alternative experimental settings for evaluating top-N recommendation algorithms, considering three important factors, namely dataset splitting, sampled metrics, and domain selection. We select eight representative recommendation algorithms (covering both traditional and neural methods) and conduct extensive experiments on a very large dataset. By carefully revisiting the different options, we make several important findings on the three factors, which directly provide useful suggestions on how to appropriately set up experiments for top-N item recommendation.
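One of the three factors, sampled metrics, can be illustrated with a minimal sketch. Under sampled evaluation, the target item is ranked against a small random set of negative items rather than against the full catalog, which tends to inflate top-N metrics such as hit rate. The function names and the 100-negative setting below are illustrative assumptions, not details taken from the paper:

```python
import random

def hit_rate_at_k(rank, k=10):
    """A hit occurs when the target item ranks within the top k."""
    return 1.0 if rank <= k else 0.0

def full_rank(scores, target):
    """Rank (1-based) of the target item against ALL other items."""
    t = scores[target]
    return 1 + sum(1 for i, s in enumerate(scores) if i != target and s > t)

def sampled_rank(scores, target, n_negatives=100, rng=None):
    """Rank of the target against a random sample of negatives,
    as in 'sampled metrics' evaluation (illustrative sketch)."""
    rng = rng or random.Random(0)
    candidates = [i for i in range(len(scores)) if i != target]
    negatives = rng.sample(candidates, n_negatives)
    t = scores[target]
    return 1 + sum(1 for i in negatives if scores[i] > t)

# Toy catalog: 10,000 items with random scores; item 0 is the target.
rng = random.Random(42)
scores = [rng.random() for _ in range(10_000)]
target = 0

r_full = full_rank(scores, target)
r_samp = sampled_rank(scores, target)
print(r_full, hit_rate_at_k(r_full), r_samp, hit_rate_at_k(r_samp))
```

Because the sampled negatives are a subset of all items, the sampled rank can never exceed the full rank, so sampled hit rates are systematically at least as high as full-ranking ones; this is the distortion that motivates revisiting sampled metrics.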