Paper Title
Modelling and Quantifying Membership Information Leakage in Machine Learning
Paper Authors
Paper Abstract
Machine learning models have been shown to be vulnerable to membership inference attacks, i.e., inferring whether individuals' data have been used for training models. The lack of understanding about the factors contributing to the success of these attacks motivates the need for modelling membership information leakage using information theory and for investigating properties of machine learning models and training algorithms that can reduce membership information leakage. We use conditional mutual information leakage to measure the amount of information leakage from the trained machine learning model about the presence of an individual in the training dataset. We devise an upper bound for this measure of information leakage using the Kullback--Leibler divergence, which is more amenable to numerical computation. We prove a direct relationship between the Kullback--Leibler membership information leakage and the probability of success for a hypothesis-testing adversary examining whether a particular data record belongs to the training dataset of a machine learning model. We show that the mutual information leakage is a decreasing function of the training dataset size and the regularization weight. We also prove that, if the sensitivity of the machine learning model (defined in terms of the derivatives of the fitness with respect to the model parameters) is high, more membership information is potentially leaked. This illustrates that complex models, such as deep neural networks, are more susceptible to membership inference attacks than simpler models with fewer degrees of freedom. We show that the amount of membership information leakage is reduced by $\mathcal{O}(\log^{1/2}(\delta^{-1})\epsilon^{-1})$ when using Gaussian $(\epsilon,\delta)$-differentially-private additive noise.
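As a brief, hedged illustration of where a scaling of this form can arise (not the paper's own derivation), recall the standard Gaussian mechanism from differential privacy: for $\epsilon < 1$, it calibrates the standard deviation $\sigma$ of the additive noise to the $\ell_2$-sensitivity $\Delta_2$ of the released quantity. The symbols $\sigma$ and $\Delta_2$ are standard notation introduced here for illustration and do not appear in the abstract above:
\[
  % standard Gaussian-mechanism calibration for (epsilon, delta)-differential privacy
  \sigma \;\ge\; \frac{\sqrt{2\ln(1.25/\delta)}\,\Delta_2}{\epsilon}
  \;=\; \Delta_2 \cdot \mathcal{O}\!\left(\log^{1/2}(\delta^{-1})\,\epsilon^{-1}\right).
\]
Any leakage bound that shrinks as the magnitude of the additive noise grows therefore inherits a reduction factor of this order, which is consistent with the $\mathcal{O}(\log^{1/2}(\delta^{-1})\epsilon^{-1})$ reduction stated in the abstract.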