Paper Title
Prediction-Based Power Oversubscription in Cloud Platforms
Authors
Abstract
Datacenter designers rely on conservative estimates of IT equipment power draw to provision resources. This leaves resources underutilized and requires more datacenters to be built. Prior work has used power capping to shave the rare power peaks and add more servers to the datacenter, thereby oversubscribing its resources and lowering capital costs. This works well when the workloads and their server placements are known. Unfortunately, these factors are unknown in public clouds, forcing providers to limit oversubscription so that performance is never impacted. In this paper, we argue that providers can use predictions of workload performance criticality and virtual machine (VM) resource utilization to increase oversubscription. This poses many challenges, such as identifying performance-critical workloads from black-box VMs, creating support for criticality-aware power management, and increasing oversubscription while limiting the impact of capping. We address these challenges for the hardware and software infrastructures of Microsoft Azure. The results show that we enable a 2x increase in oversubscription with minimal impact on critical workloads.