Paper Title
Taurus: A Data Plane Architecture for Per-Packet ML
Paper Authors
Abstract
Emerging applications -- cloud computing, the internet of things, and augmented/virtual reality -- demand responsive, secure, and scalable datacenter networks. These networks currently implement simple, per-packet, data-plane heuristics (e.g., ECMP and sketches) under a slow, millisecond-latency control plane that runs data-driven performance and security policies. However, to meet applications' service-level objectives (SLOs) in a modern data center, networks must bridge the gap between line-rate, per-packet execution and complex decision making. In this work, we present the design and implementation of Taurus, a data plane for line-rate inference. Taurus adds custom hardware based on a flexible, parallel-patterns (MapReduce) abstraction to programmable network devices, such as switches and NICs; this new hardware uses pipelined SIMD parallelism to enable per-packet MapReduce operations (e.g., inference). Our evaluation of a Taurus switch ASIC -- supporting several real-world models -- shows that Taurus operates orders of magnitude faster than a server-based control plane while increasing area by 3.8% and latency for line-rate ML models by up to 221 ns. Furthermore, our Taurus FPGA prototype achieves full model accuracy and detects two orders of magnitude more events than a state-of-the-art control-plane anomaly-detection system.
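To make the "per-packet MapReduce" idea concrete, the following is a minimal software sketch, not the Taurus hardware or its programming interface. It assumes a tiny two-layer neural network whose weights, biases, and three-element feature vector are entirely hypothetical. Each neuron is written as a map (element-wise multiply, which parallel hardware could spread across SIMD lanes) followed by a reduce (a sum).

```python
from functools import reduce
from operator import add, mul

# Hypothetical weights for illustration only; they are not taken from the paper.
HIDDEN_W = [[1.0, -1.0, 0.5],
            [0.0, 2.0, -1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [[1.0, -2.0]]
OUT_B = [0.5]


def relu(x):
    """Rectified linear activation."""
    return x if x > 0 else 0.0


def identity(x):
    """Pass-through activation for the output layer."""
    return x


def neuron(weights, bias, features):
    """One neuron expressed as a map followed by a reduce."""
    # Map: multiply each feature by its weight (independent per element).
    products = map(mul, weights, features)
    # Reduce: sum the products, starting from the bias.
    return reduce(add, products, bias)


def dense_layer(weight_rows, biases, features, activation=relu):
    """Apply every neuron in a layer to the same feature vector."""
    return [activation(neuron(w, b, features))
            for w, b in zip(weight_rows, biases)]


def infer(packet_features):
    """Return True if the (hypothetical) model flags this packet."""
    hidden = dense_layer(HIDDEN_W, HIDDEN_B, packet_features)
    score = dense_layer(OUT_W, OUT_B, hidden, activation=identity)[0]
    return score > 0.0


if __name__ == "__main__":
    # Features extracted from two hypothetical packets.
    print(infer([2.0, 1.0, 2.0]))
    print(infer([0.0, 3.0, 0.0]))
```

In this sketch, the map step has no dependencies between elements and the reduce step is an associative sum, which is the property that lets such a computation be pipelined and parallelized in hardware.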