HAAC: A Hardware-Software Co-Design to Accelerate Garbled Circuits

Paper title

HAAC: A Hardware-Software Co-Design to Accelerate Garbled Circuits

Authors

Jianqiao Mo, Jayanth Gopinath, Brandon Reagen

Abstract

Privacy and security have rapidly emerged as priorities in system design. One powerful solution for providing both is privacy-preserving computation (PPC), where functions are computed directly on encrypted data and control can be provided over how data is used. Garbled circuits (GCs) are a PPC technology that provides both confidential computing and control over how data is used. The challenge is that they incur significant performance overheads compared to plaintext computation. This paper proposes a novel garbled circuit accelerator and compiler, named HAAC, to mitigate performance overheads and make privacy-preserving computation more practical. HAAC is a hardware-software co-design. GCs are exemplars of co-design because programs are completely known at compile time, i.e., all dependences, memory accesses, and control flow are fixed. The design philosophy of HAAC is to keep hardware simple and efficient, maximizing the area devoted to our proposed custom execution units and other circuits essential for high performance (e.g., on-chip storage). The compiler can leverage its program understanding to realize the hardware's performance potential by generating effective instruction schedules and data layouts and by orchestrating off-chip events. In taking this approach, we can achieve ASIC performance/efficiency without sacrificing generality. Insights of our approach include how co-design enables expressing arbitrary GC programs as streams, which simplifies hardware and enables complete memory-compute decoupling, and the development of a scratchpad that captures data reuse by tracking program execution, eliminating the need for costly hardware-managed caches and tagging logic. We evaluate HAAC with VIP-Bench and achieve an average speedup of 589$\times$ with DDR4 (2,627$\times$ with HBM2) in 4.3 mm$^2$ of area.
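To make the compile-time argument concrete, here is a minimal sketch of why a fully known gate list lets software, rather than cache tags, manage on-chip storage. It is not HAAC's compiler or instruction format: the `Gate` record, the opcode names, and the functions `last_use`, `plan_scratchpad`, and `peak_live` are hypothetical names introduced only for illustration. The sketch assumes a program is a fixed, ordered list of two-input gates over numbered wires, computes the last gate that reads each wire, and uses that to decide offline when each wire's scratchpad slot can be released.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """One two-input gate (hypothetical format, not HAAC's ISA)."""
    op: str   # illustrative opcode name, e.g. "AND" or "XOR"
    a: int    # first input wire id
    b: int    # second input wire id
    out: int  # output wire id


def last_use(gates):
    """Map each wire id to the index of the last gate that reads it."""
    last = {}
    for i, g in enumerate(gates):
        last[g.a] = i
        last[g.b] = i
    return last


def plan_scratchpad(gates, outputs=()):
    """For each gate, list the wires whose slots can be freed after it runs.

    The gate list is fixed before execution, so this plan is computed
    entirely offline; nothing is looked up or tagged at run time.
    Wires named in `outputs` are never freed.
    """
    keep = set(outputs)
    frees = [[] for _ in gates]
    for wire, idx in last_use(gates).items():
        if wire not in keep:
            frees[idx].append(wire)
    for f in frees:
        f.sort()
    return frees


def peak_live(gates, inputs, outputs=()):
    """Peak number of wires resident at once when following the plan."""
    frees = plan_scratchpad(gates, outputs)
    live = set(inputs)
    peak = len(live)
    for g, f in zip(gates, frees):
        live.add(g.out)
        peak = max(peak, len(live))
        live.difference_update(f)
    return peak


# A 1-bit full adder: wires 0, 1, 2 are a, b, carry-in;
# wire 4 is the sum and wire 7 is the carry-out.
FULL_ADDER = [
    Gate("XOR", 0, 1, 3),
    Gate("XOR", 3, 2, 4),
    Gate("AND", 0, 1, 5),
    Gate("AND", 3, 2, 6),
    Gate("XOR", 5, 6, 7),
]

if __name__ == "__main__":
    print(plan_scratchpad(FULL_ADDER, outputs=(4, 7)))
    print(peak_live(FULL_ADDER, inputs=[0, 1, 2], outputs=(4, 7)))
```

For the full adder, the plan frees wires 0 and 1 after the third gate, wires 2 and 3 after the fourth, and wires 5 and 6 after the last, so at most six wires are resident at any time. A real compiler would also reorder gates and schedule off-chip transfers, which this sketch does not attempt.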
