Paper title

Compressing integer lists with Contextual Arithmetic Trits

Authors

Barsamian, Yann; Chailloux, André

Abstract

Inverted indexes make it possible to query large databases without searching the whole database at each query. An important line of research is the construction of the most efficient inverted indexes, in terms of both compression ratio and time efficiency. In this article, we show how to use trit encoding, combined with contextual methods, for computing inverted indexes. We perform an extensive study of different variants of these methods and show that our method consistently outperforms the Binary Interpolative Method, one of the gold standards in this topic, with respect to compression size. We apply our methods to a variety of datasets and make available the source code that produced the results, together with all our datasets.
