A Complete Characterization of Projectivity for Statistical Relational Models

Paper title

A Complete Characterization of Projectivity for Statistical Relational Models

Authors

Manfred Jaeger, Oliver Schulte

Abstract

A generative probabilistic model for relational data consists of a family of probability distributions for relational structures over domains of different sizes. Most existing statistical relational learning (SRL) frameworks yield models that are not projective, where projectivity means that the marginal of the distribution for size-$n$ structures on induced sub-structures of size $k<n$ equals the given distribution for size-$k$ structures. Projectivity is very beneficial in that it directly enables lifted inference and statistically consistent learning from sub-sampled relational structures. Earlier work identified some simple fragments of SRL languages that represent projective models. However, no complete characterization of, and representation framework for, projective models has been given. In this paper we fill this gap: exploiting representation theorems for infinite exchangeable arrays, we introduce a class of directed graphical latent variable models that precisely correspond to the class of projective relational models. As a by-product we also obtain a characterization of when a given distribution over size-$k$ structures is the statistical frequency distribution of size-$k$ sub-structures in much larger size-$n$ structures. These results shed new light on the old open problem of how to apply Halpern et al.'s "random worlds approach" for probabilistic inference to general relational signatures.
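
The projectivity condition stated in the abstract can be checked by brute force on very small domains. The following is a minimal sketch, not the authors' construction: it assumes a toy family of undirected graphs with independent edges, and the names `edge_model` and `marginal` are hypothetical helpers introduced only for this illustration. Because the toy model is exchangeable, marginalizing onto the first $k$ nodes stands in for any induced sub-structure of size $k$.

```python
from fractions import Fraction
from itertools import product


def edge_model(n, p):
    """Toy model (an assumption for illustration): distribution over
    undirected graphs on nodes 0..n-1 where each edge is present
    independently with probability p."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    dist = {}
    for bits in product([0, 1], repeat=len(pairs)):
        prob = Fraction(1)
        for b in bits:
            prob *= p if b else 1 - p
        graph = frozenset(pair for pair, b in zip(pairs, bits) if b)
        dist[graph] = prob
    return dist


def marginal(dist, k):
    """Marginalize a graph distribution onto the induced sub-structure
    on nodes 0..k-1 (edges are stored as (i, j) with i < j)."""
    out = {}
    for graph, prob in dist.items():
        sub = frozenset((i, j) for (i, j) in graph if j < k)
        out[sub] = out.get(sub, Fraction(0)) + prob
    return out


# Fixed edge probability: the size-4 marginal on 2 nodes matches the size-2 model.
p = Fraction(1, 3)
fixed_is_projective = marginal(edge_model(4, p), 2) == edge_model(2, p)

# Size-dependent edge probability 1/n: the marginal no longer matches.
scaled_is_projective = (
    marginal(edge_model(4, Fraction(1, 4)), 2) == edge_model(2, Fraction(1, 2))
)
```

Exact `Fraction` arithmetic keeps the equality test free of floating-point error. The second check shows how a family whose parameters depend on the domain size can fail the condition.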

Paper title