Paper title
Exploring GPU Stream-Aware Message Passing using Triggered Operations
Paper authors
Abstract
Modern heterogeneous supercomputing systems are composed of compute blades that offer both CPUs and GPUs. On such systems, it is essential to move data efficiently between these different compute engines across a high-speed network. While current-generation scientific applications and system software stacks are GPU-aware, CPU threads are still required to orchestrate data-movement communication operations and inter-process synchronization operations. A new GPU stream-aware MPI communication strategy called stream-triggered (ST) communication is explored to allow offloading both the computation and communication control paths to the GPU. The proposed ST communication strategy is implemented on HPE Slingshot Interconnects over a new proprietary HPE Slingshot NIC (Slingshot 11) using the supported triggered operations feature. The performance of the proposed communication strategy is evaluated using a microbenchmark kernel called Faces, based on the nearest-neighbor communication pattern in the CORAL-2 Nekbone benchmark, over a heterogeneous node architecture consisting of AMD CPUs and GPUs.