论文标题
计算具有有限倍数的袋装的预期多重性
Computing expected multiplicities for bag-TIDBs with bounded multiplicities
论文作者
论文摘要
在这项工作中,我们研究了与概率数据库计算元组的预期多样性的问题,该数据库具有Bag语义(每个元组都与多样性相关联),并且完全且大致。我们考虑在每个元组和元素的最大多样性上具有绑定的$ c $是独立的概率事件(我们是指诸如c-tidbs之类的数据库。我们对计算预期倍增性的细粒度复杂性特别感兴趣数据库。独立于块的数据库(BIDB)。
In this work, we study the problem of computing a tuple's expected multiplicity over probabilistic databases with bag semantics (where each tuple is associated with a multiplicity) exactly and approximately. We consider bag-TIDBs where we have a bound $c$ on the maximum multiplicity of each tuple and tuples are independent probabilistic events (we refer to such databases as c-TIDBs. We are specifically interested in the fine-grained complexity of computing expected multiplicities and how it compares to the complexity of deterministic query evaluation algorithms -- if these complexities are comparable, it opens the door to practical deployment of probabilistic databases. Unfortunately, our results imply that computing expected multiplicities for c-TIDBs based on the results produced by such query evaluation algorithms introduces super-linear overhead (under parameterized complexity hardness assumptions/conjectures). We proceed to study approximation of expected result tuple multiplicities for positive relational algebra queries ($RA^+$) over c-TIDBs and for a non-trivial subclass of block-independent databases (BIDBs). We develop a sampling algorithm that computes a 1$\pmε$ approximation of the expected multiplicity of an output tuple in time linear in the runtime of the corresponding deterministic query for any $RA^+$ query.