ESPnet-ONNX: Bridging a Gap Between Research and Production

Paper title

ESPnet-ONNX: Bridging a Gap Between Research and Production

Authors

Masao Someki, Yosuke Higuchi, Tomoki Hayashi, Shinji Watanabe

Abstract

In the field of deep learning, researchers often focus on inventing novel neural network models and improving benchmarks. In contrast, application developers are interested in making models suitable for actual products, which involves optimizing a model for faster inference and adapting a model to various platforms (e.g., C++ and Python). In this work, to fill the gap between the two, we establish an effective procedure for optimizing a PyTorch-based research-oriented model for deployment, taking ESPnet, a widely used toolkit for speech processing, as an instance. We introduce different techniques to ESPnet, including converting a model into an ONNX format, fusing nodes in a graph, and quantizing parameters, which lead to approximately 1.3-2$\times$ speedup in various tasks (i.e., ASR, TTS, speech translation, and spoken language understanding) while keeping its performance without any additional training. Our ESPnet-ONNX will be publicly available at https://github.com/espnet/espnet_onnx.
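
As a rough illustration of what "quantizing parameters" means, the sketch below applies symmetric per-tensor int8 quantization to a small list of weights using only the Python standard library. It is not the procedure used by ESPnet-ONNX, which the abstract does not detail. The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to the int8 range [-127, 127].

    Hypothetical helper for illustration; not part of espnet_onnx.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # An all-zero (or empty) tensor needs no scaling; use 1.0 to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map int8 values back to approximate float weights."""
    return [q * scale for q in quantized]


# Made-up sample weights, not taken from any real model.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as a single signed byte plus one shared scale factor, which reduces memory traffic at inference time. The cost is a rounding error of at most half a quantization step per weight, which is why quantization can be applied without retraining when that error is small relative to the weights.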

Paper title