Paper Title
PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference
Paper Authors
Paper Abstract
The rapid growth and deployment of deep learning (DL) have been accompanied by emerging privacy and security concerns. To mitigate these issues, secure multi-party computation (MPC) has been discussed as a way to enable privacy-preserving DL computation. In practice, MPC protocols often come with very high computation and communication overhead, which can prohibit their adoption in large-scale systems. Two orthogonal research trends have attracted enormous interest in addressing the energy efficiency of secure deep learning: overhead reduction of the MPC comparison protocol, and hardware acceleration. However, existing approaches either achieve a low reduction ratio and thus suffer from high latency due to limited computation and communication savings, or are power-hungry, as prior work mainly targets general computing platforms such as CPUs and GPUs. In this work, as a first attempt, we develop PolyMPCNet, a systematic framework for joint overhead reduction of the MPC comparison protocol and hardware acceleration, which integrates the hardware latency of the cryptographic building blocks into the DNN loss function to achieve high energy efficiency, accuracy, and security guarantees. Instead of heuristically checking model sensitivity after a DNN is well trained (by deleting or dropping some non-polynomial operators), our key design principle is to enforce exactly what is assumed in the DNN design -- training a DNN that is both hardware efficient and secure, while escaping local minima and saddle points and maintaining high accuracy. More specifically, we propose a straight-through polynomial activation initialization method for a cryptographic-hardware-friendly trainable polynomial activation function that replaces the expensive 2P-ReLU operator. We also develop a cryptographic hardware scheduler and the corresponding performance model for field-programmable gate array (FPGA) platforms.
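To illustrate why a trainable polynomial activation is MPC-friendly, the sketch below fits a degree-2 polynomial a*x^2 + b*x + c to ReLU by least squares as an initialization. This is a minimal illustration only: the degree, the sample range, and the least-squares fit are hypothetical choices for this sketch, not the paper's exact straight-through polynomial activation initialization, and `poly_act` is a made-up helper name.

```python
import numpy as np

def poly_act(x, a, b, c):
    # Degree-2 polynomial activation: a*x^2 + b*x + c.
    # Polynomials are MPC-friendly because secure multiplication and
    # addition avoid the expensive secure comparison needed by 2P-ReLU.
    return a * x * x + b * x + c

def relu(x):
    return np.maximum(x, 0.0)

# Illustrative initialization: fit (a, b, c) to ReLU over a sample
# range by least squares, so training starts from coefficients that
# already approximate the ReLU behaviour the network was designed
# around (a hypothetical scheme, not the authors' exact method).
xs = np.linspace(-3.0, 3.0, 601)
design = np.stack([xs**2, xs, np.ones_like(xs)], axis=1)
coef, *_ = np.linalg.lstsq(design, relu(xs), rcond=None)
a, b, c = coef
```

In a full pipeline these coefficients would be trainable parameters updated by backpropagation, so the network can compensate for the approximation error while remaining free of comparison-based operators.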