Paper Title
NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing
Paper Authors
Paper Abstract
Neural Architecture Search (NAS) is a promising and rapidly evolving research area. Training a large number of neural networks requires an exceptional amount of computational power, which makes NAS unreachable for researchers who have limited or no access to high-performance clusters and supercomputers. A few benchmarks with precomputed neural architecture performances have recently been introduced to overcome this problem and ensure more reproducible experiments. However, these benchmarks cover only the computer vision domain and are therefore built from image datasets and convolution-derived architectures. In this work, we step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP). Our main contributions are as follows: we have provided a search space of recurrent neural networks on text datasets and trained 14k architectures within it; we have conducted both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation; finally, we have tested several NAS algorithms to demonstrate how the precomputed results can be utilized. We believe that our results have high potential for use by both the NAS and NLP communities.
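To illustrate the idea of a tabular benchmark, the sketch below shows how a NAS baseline (random search) can query precomputed results instead of training each candidate network. It is a minimal, hypothetical example: the `precomputed_results` table, `query_benchmark`, and `random_search` are illustrative names filled with dummy perplexity values, not the actual NAS-Bench-NLP data or API.

```python
# Hypothetical sketch: a tabular NAS benchmark replaces training with a table lookup.
# The dummy table below stands in for precomputed validation perplexities; it is
# NOT the real NAS-Bench-NLP data.
import random

# Assumed mapping: architecture identifier -> precomputed validation perplexity.
precomputed_results = {
    f"arch_{i:05d}": random.uniform(60.0, 200.0) for i in range(14_000)
}

def query_benchmark(arch_id: str) -> float:
    """Return the precomputed validation perplexity for an architecture (no training)."""
    return precomputed_results[arch_id]

def random_search(num_queries: int, seed: int = 0) -> tuple[str, float]:
    """Baseline NAS algorithm: sample architectures at random and keep the best one."""
    rng = random.Random(seed)
    candidates = rng.sample(list(precomputed_results), num_queries)
    best = min(candidates, key=query_benchmark)
    return best, query_benchmark(best)

if __name__ == "__main__":
    best_arch, best_ppl = random_search(num_queries=100)
    print(f"Best architecture: {best_arch}, validation perplexity: {best_ppl:.2f}")
```

Because every query is a lookup rather than a training run, more sophisticated NAS algorithms can be compared against such baselines at negligible computational cost.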