Paper Title

Efficient Semantic Summary Graphs for Querying Large Knowledge Graphs

Authors

Emetis Niazmand, Gezim Sejdiu, Damien Graux, Maria-Esther Vidal

Abstract

Knowledge Graphs (KGs) integrate heterogeneous data, but one challenge is developing efficient tools that let end users extract useful insights from these sources of knowledge. In this context, reducing the size of a Resource Description Framework (RDF) graph while preserving all of its information can speed up query engines by limiting data shuffle, especially in a distributed setting. This paper presents two algorithms for RDF graph summarization: Grouping Based Summarization (GBS) and Query Based Summarization (QBS). The latter is an optimized, lossless variant of the former. We empirically study how effectively the proposed lossless RDF graph summarization retrieves complete data when queries in SPARQL, the RDF query language, are rewritten with fewer triple patterns using semantic similarity. We conduct our experimental study on instances of four datasets of different sizes. Compared with the state-of-the-art query engine Sparklify executed over the original RDF graphs as a baseline, QBS reduces query execution time by up to 80%, and the size of the summarized RDF graph is reduced by up to 99%.
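The abstract does not spell out how GBS or QBS work, so the following is only a minimal sketch of the general idea behind lossless, grouping-style RDF summarization, not the authors' algorithm. It assumes triples are plain `(subject, predicate, object)` tuples, and the function names `summarize` and `expand`, the `group:N` node labels, and the `ex:` example data are all hypothetical. Subjects that share exactly the same set of predicate-object pairs are collapsed into one summary node, and a membership map is kept so the original graph can be rebuilt.

```python
from collections import defaultdict


def summarize(triples):
    """Collapse subjects with identical (predicate, object) sets into one node.

    Returns the summary triples and a map from each summary node to the
    original subjects it stands for. This is an illustrative grouping rule,
    not the GBS/QBS procedure from the paper.
    """
    props = defaultdict(set)
    for s, p, o in triples:
        props[s].add((p, o))

    groups = defaultdict(list)
    for s, po in props.items():
        groups[frozenset(po)].append(s)

    summary, members = set(), {}
    # Sort groups by their member names so node labels are deterministic.
    ordered = sorted(groups.items(), key=lambda kv: sorted(kv[1]))
    for i, (po, subjects) in enumerate(ordered):
        node = f"group:{i}"  # hypothetical label for the summary node
        members[node] = sorted(subjects)
        summary.update((node, p, o) for p, o in po)
    return summary, members


def expand(summary, members):
    """Rebuild the original triples from the summary and the membership map."""
    return {(s, p, o) for node, p, o in summary for s in members[node]}


# Made-up example data, for illustration only.
triples = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:worksAt", "ex:Org"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:worksAt", "ex:Org"),
    ("ex:carol", "rdf:type", "ex:Person"),
}

summary, members = summarize(triples)
restored = expand(summary, members)
print(len(triples), len(summary), restored == triples)
```

In this toy graph the five original triples shrink to three summary triples, and `expand` recovers the original set exactly, which is the sense in which such a summary is lossless. A query with fewer triple patterns could then be evaluated against the summary and its answers mapped back through `members`; how the paper actually performs that rewriting is not described in the abstract.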
