A New Implementation of Federated Learning for Privacy and Security Enhancement

Paper Title

A New Implementation of Federated Learning for Privacy and Security Enhancement

Authors

Ma, Xiang; Sun, Haijian; Hu, Rose Qingyang; Qian, Yi

Abstract

Motivated by ever-increasing concerns about personal data privacy and the rapidly growing data volume at local clients, federated learning (FL) has emerged as a new machine learning setting. An FL system comprises a central parameter server and multiple local clients. It keeps data at the local clients and learns a centralized model by sharing the locally learned model parameters. No local data needs to be shared, so privacy can be well protected. Nevertheless, because the model rather than the raw data is shared, the system is exposed to model poisoning attacks launched by malicious clients. Furthermore, identifying malicious clients is challenging because no local client data is available on the server. In addition, membership inference attacks can still be performed by using the uploaded model to estimate a client's local data, leading to privacy disclosure. In this work, we first propose a model-update-based federated averaging algorithm to defend against Byzantine attacks such as additive noise attacks and sign-flipping attacks. We then present an individual client model initialization method, which provides further privacy protection against membership inference attacks by hiding the individual local machine learning models. When the two schemes are combined, both privacy and security can be effectively enhanced. Experiments show that the proposed schemes converge under non-IID data distribution when there are no attacks. Under Byzantine attacks, the proposed schemes perform much better than the classical model-based FedAvg algorithm.
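To make the threat model concrete, the following is a minimal sketch of the classical model-based FedAvg aggregation step that the abstract uses as its baseline, together with a toy sign-flipping client. It is not the defense proposed in the paper, whose details the abstract does not give. The function names `fedavg` and `sign_flip`, the two-dimensional parameter vectors, and the sample counts are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Classical FedAvg: average client parameter vectors, weighted by sample count."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    dim = len(client_params[0])
    return [
        sum(n * p[i] for p, n in zip(client_params, num_samples)) / total
        for i in range(dim)
    ]


def sign_flip(params: Sequence[float]) -> List[float]:
    """Toy Byzantine behaviour (illustrative): upload the negated parameters."""
    return [-x for x in params]


# Hypothetical example: three clients that all learned the same parameters.
honest = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
sizes = [10, 10, 10]

clean = fedavg(honest, sizes)
attacked = fedavg([honest[0], honest[1], sign_flip(honest[2])], sizes)

print("no attack:        ", clean)
print("one flipped client:", attacked)
```

In this toy setting a single sign-flipping client out of three pulls the plain weighted average well away from the honest parameters, and the server cannot inspect any client data to tell which upload is malicious. This is the kind of weakness that motivates a more robust aggregation rule.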
