Taurus: A Data Plane Architecture for Per-Packet ML

Paper title

Taurus: A Data Plane Architecture for Per-Packet ML

Authors

Tushar Swamy, Alexander Rucker, Muhammad Shahbaz, Ishan Gaur, Kunle Olukotun

Abstract

Emerging applications -- cloud computing, the internet of things, and augmented/virtual reality -- demand responsive, secure, and scalable datacenter networks. These networks currently implement simple, per-packet, data-plane heuristics (e.g., ECMP and sketches) under a slow, millisecond-latency control plane that runs data-driven performance and security policies. However, to meet applications' service-level objectives (SLOs) in a modern data center, networks must bridge the gap between line-rate, per-packet execution and complex decision making. In this work, we present the design and implementation of Taurus, a data plane for line-rate inference. Taurus adds custom hardware based on a flexible, parallel-patterns (MapReduce) abstraction to programmable network devices, such as switches and NICs; this new hardware uses pipelined SIMD parallelism to enable per-packet MapReduce operations (e.g., inference). Our evaluation of a Taurus switch ASIC -- supporting several real-world models -- shows that Taurus operates orders of magnitude faster than a server-based control plane while increasing area by 3.8% and latency for line-rate ML models by up to 221 ns. Furthermore, our Taurus FPGA prototype achieves full model accuracy and detects two orders of magnitude more events than a state-of-the-art control-plane anomaly-detection system.
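The per-packet MapReduce idea mentioned in the abstract can be illustrated with a small software sketch. This is not the Taurus hardware or its programming interface; it only shows, under stated assumptions, how a tiny neural-network inference decomposes into map steps (independent element-wise multiplies, the part that suits SIMD lanes) and reduce steps (summing the products). The function names `neuron` and `classify_packet`, the feature vector, the weights, and the threshold are all hypothetical and chosen for illustration.

```python
from functools import reduce
from operator import add


def neuron(features, weights, bias):
    """One neuron expressed as a map followed by a reduce (hypothetical helper)."""
    # Map: multiply each feature by its weight; every product is independent.
    products = map(lambda fw: fw[0] * fw[1], zip(features, weights))
    # Reduce: fold the products into one sum, starting from the bias.
    total = reduce(add, products, bias)
    # ReLU activation.
    return max(0.0, total)


def classify_packet(features, layer_weights, layer_biases, threshold=0.5):
    """Flag a packet when the summed hidden activations exceed a threshold.

    The feature values, weights, and threshold are illustrative assumptions,
    not values taken from the paper.
    """
    # Map over neurons: each hidden unit is computed independently.
    hidden = [neuron(features, w, b) for w, b in zip(layer_weights, layer_biases)]
    # Reduce: combine the hidden activations into one score.
    score = reduce(add, hidden, 0.0)
    return score > threshold


if __name__ == "__main__":
    packet_features = [1.0, 2.0]            # e.g., two normalized header fields
    weights = [[0.5, 1.0], [0.5, -1.0]]     # two hidden neurons
    biases = [0.25, 0.25]
    print(classify_packet(packet_features, weights, biases))
```

In hardware, the map stage would run across parallel lanes and the reduce stage as a pipelined tree, which is what would let such a computation keep pace with line rate; the Python version executes sequentially and is meant only to show the decomposition.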
