Paper Title
Differentially Private (Gradient) Expectation Maximization Algorithm with Statistical Guarantees
Paper Authors
Paper Abstract
(Gradient) Expectation Maximization (EM) is a widely used algorithm for maximum likelihood estimation in mixture models and incomplete-data problems. A major challenge facing this popular technique is how to effectively preserve the privacy of sensitive data. Previous research on this problem has already led to the discovery of some Differentially Private (DP) algorithms for (Gradient) EM. However, unlike in the non-private case, existing techniques do not yet provide finite-sample statistical guarantees. To address this issue, we propose in this paper the first DP version of the (Gradient) EM algorithm with statistical guarantees. Moreover, we apply our general framework to three canonical models: the Gaussian Mixture Model (GMM), the Mixture of Regressions Model (MRM), and Linear Regression with Missing Covariates (RMC). Specifically, for GMM in the DP model, our estimation error is near-optimal in some cases. For the other two models, we provide the first finite-sample statistical guarantees. Our theory is supported by thorough numerical experiments.
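To make the idea concrete, the following is a minimal, illustrative sketch of a DP gradient EM iteration for the simplest of the three models above: a one-dimensional, symmetric two-component GMM with means +theta and -theta and unit variance. This is not the paper's algorithm; it is a generic hedged sketch combining a standard gradient EM step with per-sample gradient clipping and the Gaussian mechanism, and all function and parameter names (`dp_gradient_em_gmm`, `clip`, `sigma`, etc.) are our own for illustration.

```python
import numpy as np

def dp_gradient_em_gmm(X, theta0, T=50, eta=0.5, clip=4.0, sigma=0.5, seed=0):
    """Illustrative DP gradient EM for a symmetric two-component 1-D GMM
    0.5*N(theta, 1) + 0.5*N(-theta, 1).

    Each iteration: E-step computes responsibilities; the M-step is a single
    gradient ascent step on the Q-function, with per-sample gradients clipped
    to bound sensitivity and Gaussian noise added (Gaussian mechanism).
    """
    rng = np.random.default_rng(seed)
    theta = theta0
    n = len(X)
    for _ in range(T):
        # E-step: responsibility of the +theta component for each point,
        # w_i = N(x_i; theta, 1) / (N(x_i; theta, 1) + N(x_i; -theta, 1))
        w = 1.0 / (1.0 + np.exp(-2.0 * theta * X))
        # Per-sample gradient of the Q-function at the current theta:
        # g_i = (2*w_i - 1)*x_i - theta
        g = (2.0 * w - 1.0) * X - theta
        # Clip each per-sample gradient so the average has bounded sensitivity
        g = np.clip(g, -clip, clip)
        # Gaussian mechanism: noise scaled to the clipped average's sensitivity
        noisy_grad = g.mean() + rng.normal(0.0, sigma * clip / n)
        # Gradient M-step
        theta = theta + eta * noisy_grad
    return theta
```

With a small noise multiplier and a few thousand samples, the noisy iterates behave much like non-private gradient EM and recover the component mean up to sampling error; the clipping threshold and noise scale govern the privacy/utility trade-off that the paper's finite-sample guarantees quantify.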