Title
Linear Time Sinkhorn Divergences using Positive Features
Authors
Abstract
Although Sinkhorn divergences are now routinely used in data science to compare probability distributions, the computational effort required to compute them remains expensive, growing in general quadratically in the size $n$ of the support of these distributions. Indeed, solving optimal transport (OT) with an entropic regularization requires computing an $n\times n$ kernel matrix (the neg-exponential of an $n\times n$ pairwise ground cost matrix) that is repeatedly applied to a vector. We propose to use instead ground costs of the form $c(x,y)=-\log\dotp{φ(x)}{φ(y)}$ where $φ$ is a map from the ground space onto the positive orthant $\RR^r_+$, with $r\ll n$. This choice yields, equivalently, a kernel $k(x,y)=\dotp{φ(x)}{φ(y)}$, and ensures that the cost of Sinkhorn iterations scales as $O(nr)$. We show that usual cost functions can be approximated using this form. Additionally, we take advantage of the fact that our approach yields approximations that remain fully differentiable with respect to input distributions, as opposed to previously proposed adaptive low-rank approximations of the kernel matrix, to train a faster variant of OT-GAN \cite{salimans2018improving}.
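To illustrate the computational gain described above, here is a minimal NumPy sketch (not the authors' code) of Sinkhorn iterations with a factored kernel $K = \Phi_x \Phi_y^\top$, where the nonnegative feature matrices `Phi_x` and `Phi_y` play the role of $φ$ evaluated on the two point clouds; names and shapes are illustrative assumptions:

```python
import numpy as np

def sinkhorn_positive_features(Phi_x, Phi_y, a, b, n_iter=200):
    """Sinkhorn iterations with the kernel factored as K = Phi_x @ Phi_y.T.

    Phi_x : (n, r) nonnegative features of the first support,
    Phi_y : (m, r) nonnegative features of the second support,
    a, b  : marginal weight vectors summing to 1.

    Each iteration applies K (and K.T) to a vector through the
    r-dimensional factors, costing O((n + m) r) rather than the
    O(n m) of a dense kernel matrix.
    """
    u = np.ones(Phi_x.shape[0])
    v = np.ones(Phi_y.shape[0])
    for _ in range(n_iter):
        # K @ v computed as Phi_x @ (Phi_y.T @ v): never forms K explicitly.
        u = a / (Phi_x @ (Phi_y.T @ v))
        v = b / (Phi_y @ (Phi_x.T @ u))
    return u, v
```

The implicit transport plan is $P = \mathrm{diag}(u)\, \Phi_x \Phi_y^\top \mathrm{diag}(v)$; at convergence its row and column sums match the marginals `a` and `b`, which gives a quick correctness check.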