Lightweight Convolutional Representations for On-Device Natural Language Processing

Paper Title

Lightweight Convolutional Representations for On-Device Natural Language Processing

Authors

Shrey Desai, Geoffrey Goh, Arun Babu, Ahmed Aly

Abstract

The increasing computational and memory complexities of deep neural networks have made it difficult to deploy them on low-resource electronic devices (e.g., mobile phones, tablets, wearables). Practitioners have developed numerous model compression methods to address these concerns, but few have condensed input representations themselves. In this work, we propose a fast, accurate, and lightweight convolutional representation that can be swapped into any neural model and compressed significantly (up to 32x) with a negligible reduction in performance. In addition, we show gains over recurrent representations when considering resource-centric metrics (e.g., model file size, latency, memory usage) on a Samsung Galaxy S9.
