Paper Title

A Linear Algebraic Approach to Model Parallelism in Deep Learning

Authors

Russell J. Hewett and Thomas J. Grady II

Abstract

Training deep neural networks (DNNs) in large-cluster computing environments is increasingly necessary, as networks grow in size and complexity. Local memory and processing limitations require robust data and model parallelism for crossing compute node boundaries. We propose a linear-algebraic approach to model parallelism in deep learning, which allows parallel distribution of any tensor in the DNN. Rather than rely on automatic differentiation tools, which do not universally support distributed memory parallelism models, we show that parallel data movement operations, e.g., broadcast, sum-reduce, and halo exchange, are linear operators, and by defining the relevant spaces and inner products, we manually develop the adjoint, or backward, operators required for gradient-based training of DNNs. We build distributed DNN layers using these parallel primitives, composed with sequential layer implementations, and demonstrate their application by building and training a distributed DNN using DistDL, a PyTorch and MPI-based distributed deep learning toolkit.
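
The abstract's central construction, that parallel data-movement primitives such as broadcast are linear operators whose adjoints (here, a sum-reduce) give the backward pass directly, can be illustrated with a minimal, single-process PyTorch sketch. This is a hedged stand-in, not DistDL's actual API: the `Broadcast` function and `n_workers` parameter below are hypothetical names, and replication along a new tensor axis stands in for an MPI broadcast across ranks.

```python
# Minimal, single-process sketch (assumed names, not DistDL's API): a broadcast is a
# linear operator B whose adjoint B^T is a sum-reduction, so the backward pass of the
# data-movement primitive can be written by hand instead of relying on automatic
# differentiation across process boundaries.
import torch


class Broadcast(torch.autograd.Function):
    """Replicate a tensor into n_workers copies (stand-in for an MPI broadcast)."""

    @staticmethod
    def forward(ctx, x, n_workers):
        ctx.n_workers = n_workers
        # Forward: B x -> n_workers identical copies stacked on a new leading axis.
        return x.unsqueeze(0).expand(n_workers, *x.shape).contiguous()

    @staticmethod
    def backward(ctx, grad_output):
        # Adjoint: B^T y -> sum-reduce the per-copy gradients back to one tensor.
        return grad_output.sum(dim=0), None


# Adjoint (dot-product) test: <B x, y> == <x, B^T y> for random x, y.
x = torch.randn(4, requires_grad=True)
y = torch.randn(3, 4)
Bx = Broadcast.apply(x, 3)
lhs = (Bx * y).sum()                 # <B x, y>
lhs.backward()                       # fills x.grad with B^T y
rhs = (x.detach() * x.grad).sum()    # <x, B^T y>
print(torch.allclose(lhs.detach(), rhs))  # True
```

The printed `True` is the standard adjoint test, ⟨Bx, y⟩ = ⟨x, Bᵀy⟩ under the chosen inner products, which is exactly the property the paper uses to derive the backward operators for gradient-based training by hand.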
