论文标题

FPT:用于圆环完全同态加密的定点加速器

FPT: a Fixed-Point Accelerator for Torus Fully Homomorphic Encryption

论文作者

Van Beirendonck, Michiel, D'Anvers, Jan-Pieter, Turan, Furkan, Verbauwhede, Ingrid

论文摘要

完全同态加密是一种允许对加密数据进行计算的技术。它有可能在云中改变隐私注意事项,但是计算和内存开销阻止了其采用。 TFHE是一种有前途的基于圆环的FHE方案,依赖于自举,这是在每个加密的逻辑/算术操作后调用的噪声驱动工具。 我们提出了FPT,这是用于tfhe自举的定点FPGA加速器。 FPT是第一个利用计算中存在固有噪声的硬件加速器。它不是双重或单精度的浮点算术,而是用近似的固定点算术完全实现了tfhe boottrate。 FPT使用引导FFT计算中的噪声传播的深入分析,能够使用比先前实现小50%的噪声定位点表示。 FPT是作为受传统流式DSP启发的流处理器构建的:它具有直接级联的高通量计算阶段的实例化,并具有最小的控制逻辑和路由网络。我们探索具有用户可容纳的流式流宽度的流核的吞吐量平衡组成,以构建完整的自举管道。我们的方法允许100%利用算术单元,并且只需要一个小的自举键缓存,从而使一个完全计算的boottrapping吞吐量为1 BS / 35US。这与经典的CPU方法形成了鲜明对比的是引导加速度的经典方法,该方法通常受到内存和带宽的约束。 FPT是为ALVEO U280数据中心加速器卡的自举fpga内核而实现和评估的。与现有的ASIC仿真实验相比,FPT比现有的基于CPU的实现要获得比现有的2.5倍的吞吐量的两到三个数量级高2.5倍。

Fully Homomorphic Encryption is a technique that allows computation on encrypted data. It has the potential to change privacy considerations in the cloud, but computational and memory overheads are preventing its adoption. TFHE is a promising Torus-based FHE scheme that relies on bootstrapping, the noise-removal tool invoked after each encrypted logical/arithmetical operation. We present FPT, a Fixed-Point FPGA accelerator for TFHE bootstrapping. FPT is the first hardware accelerator to exploit the inherent noise present in FHE calculations. Instead of double or single-precision floating-point arithmetic, it implements TFHE bootstrapping entirely with approximate fixed-point arithmetic. Using an in-depth analysis of noise propagation in bootstrapping FFT computations, FPT is able to use noise-trimmed fixed-point representations that are up to 50% smaller than prior implementations. FPT is built as a streaming processor inspired by traditional streaming DSPs: it instantiates directly cascaded high-throughput computational stages, with minimal control logic and routing networks. We explore throughput-balanced compositions of streaming kernels with a user-configurable streaming width in order to construct a full bootstrapping pipeline. Our approach allows 100% utilization of arithmetic units and requires only a small bootstrapping key cache, enabling an entirely compute-bound bootstrapping throughput of 1 BS / 35us. This is in stark contrast to the classical CPU approach to FHE bootstrapping acceleration, which is typically constrained by memory and bandwidth. FPT is implemented and evaluated as a bootstrapping FPGA kernel for an Alveo U280 datacenter accelerator card. FPT achieves two to three orders of magnitude higher bootstrapping throughput than existing CPU-based implementations, and 2.5x higher throughput compared to recent ASIC emulation experiments.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源