Differentially Private Federated Combinatorial Bandits with Constraints

Paper Title

Differentially Private Federated Combinatorial Bandits with Constraints

Authors

Sambhav Solanki, Samhita Kanaparthy, Sankarshan Damle, Sujit Gujar

Abstract

The cooperative learning paradigm in online learning settings, i.e., federated learning (FL), is growing rapidly. Unlike in most FL settings, there are many situations where the agents are competitive. Each agent would like to learn from others, but the part of the information it shares for others to learn from could be sensitive; thus, it desires privacy. This work investigates a group of agents working concurrently to solve similar combinatorial bandit problems while maintaining quality constraints. Can these agents learn collectively while keeping their sensitive information confidential by employing differential privacy? We observe that communication can reduce regret. However, differential privacy techniques for protecting sensitive information make the data noisy and may worsen regret rather than improve it. Hence, we note that it is essential to decide when to communicate and which shared data to learn from in order to strike a functional balance between regret and privacy. For such a federated combinatorial multi-armed bandit (MAB) setting, we propose a Privacy-preserving Federated Combinatorial Bandit algorithm, P-FCB. We illustrate the efficacy of P-FCB through simulations. We further show that our algorithm improves regret while upholding the quality threshold and providing meaningful privacy guarantees.
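The tension the abstract describes, where privacy noise can outweigh the benefit of sharing, can be illustrated with a minimal sketch. This is not the P-FCB algorithm: it assumes rewards bounded in [0, 1], uses the standard Laplace mechanism as a stand-in for whatever noise the paper applies, and the names `laplace_noise`, `private_mean`, and `mean_abs_error` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(rewards, epsilon, rng):
    """Release the mean of rewards in [0, 1] under epsilon-DP (Laplace mechanism)."""
    n = len(rewards)
    # Replacing one reward in [0, 1] shifts the mean by at most 1/n.
    sensitivity = 1.0 / n
    return sum(rewards) / n + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon, n=50, trials=2000, seed=0):
    """Average absolute error of the shared estimate (assumed toy setup)."""
    rng = random.Random(seed)
    # Hypothetical agent: n Bernoulli(0.7) quality observations for one arm.
    rewards = [1.0 if rng.random() < 0.7 else 0.0 for _ in range(n)]
    true_mean = sum(rewards) / n
    errors = [abs(private_mean(rewards, epsilon, rng) - true_mean)
              for _ in range(trials)]
    return sum(errors) / trials


if __name__ == "__main__":
    for eps in (0.1, 0.5, 1.0):
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(eps):.3f}")
```

In this toy setup the expected error of the shared estimate scales as 1/(n * epsilon), so a small privacy budget or an estimate built from few samples produces noise large enough to mislead the receiving agent. That is one way to read the abstract's point that agents must choose when to communicate and which shared data to learn from.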
