The complementarity of a diverse range of deep learning features extracted from video content for video recommendation

Paper title

The complementarity of a diverse range of deep learning features extracted from video content for video recommendation

Authors

Almeida, Adolfo; de Villiers, Johan Pieter; De Freitas, Allan; Velayudan, Mergandran

Abstract

Following the popularisation of media streaming, a number of video streaming services are continuously buying new video content to mine the potential profit from them. As such, the newly added content has to be handled well to be recommended to suitable users. In this paper, we address the new item cold-start problem by exploring the potential of various deep learning features to provide video recommendations. The deep learning features investigated include features that capture the visual-appearance, audio and motion information from video content. We also explore different fusion methods to evaluate how well these feature modalities can be combined to fully exploit the complementary information captured by them. Experiments on a real-world video dataset for movie recommendations show that deep learning features outperform hand-crafted features. In particular, recommendations generated with deep learning audio features and action-centric deep learning features are superior to MFCC and state-of-the-art iDT features. In addition, the combination of various deep learning features with hand-crafted features and textual metadata yields significant improvement in recommendations compared to combining only the former.
