Paper Title

Masked ELMo: An evolution of ELMo towards fully contextual RNN language models

Paper Authors

Gregory Senay, Emmanuelle Salin

Paper Abstract

This paper presents Masked ELMo, a new RNN-based model for language model pre-training, evolved from the ELMo language model. Unlike ELMo, which only uses independent left-to-right and right-to-left contexts, Masked ELMo learns fully bidirectional word representations. To achieve this, we use the same masked language model objective as BERT. Additionally, thanks to optimizations of the LSTM neuron, the integration of mask accumulation, and bidirectional truncated backpropagation through time, we have substantially increased the training speed of the model. All these improvements make it possible to pre-train a better language model than ELMo while maintaining a low computational cost. We evaluate Masked ELMo by comparing it to ELMo under the same protocol on the GLUE benchmark, where our model significantly outperforms ELMo and is competitive with transformer approaches.
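To make the core idea concrete, below is a minimal sketch of a BERT-style masked language model objective applied on top of a bidirectional LSTM encoder, which is the combination the abstract describes. This is not the authors' implementation of Masked ELMo: the vocabulary size, mask token id, layer sizes, and masking rate are all illustrative assumptions, and PyTorch is assumed as the framework.

```python
import torch
import torch.nn as nn

# Illustrative constants; the real model's vocabulary, [MASK] id, and sizes differ.
VOCAB_SIZE = 10000   # hypothetical vocabulary size
MASK_ID = 3          # hypothetical id of the [MASK] token
EMBED_DIM = 128
HIDDEN_DIM = 256

class MaskedBiLSTM(nn.Module):
    """Toy bidirectional LSTM language model trained with a masked LM loss."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        # Bidirectional LSTM: each position sees both its left and right context,
        # which is exactly what masking makes safe to train on.
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, token_ids):
        hidden, _ = self.lstm(self.embed(token_ids))
        return self.out(hidden)  # (batch, seq_len, vocab) logits

def mask_tokens(token_ids, mask_prob=0.15):
    """Randomly replace tokens with [MASK]; the loss is computed only there."""
    labels = token_ids.clone()
    masked = torch.rand(token_ids.shape) < mask_prob
    labels[~masked] = -100            # ignored by cross-entropy below
    inputs = token_ids.clone()
    inputs[masked] = MASK_ID
    return inputs, labels

model = MaskedBiLSTM()
tokens = torch.randint(4, VOCAB_SIZE, (2, 12))   # toy batch of token ids
inputs, labels = mask_tokens(tokens)
logits = model(inputs)
loss = nn.functional.cross_entropy(
    logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=-100)
loss.backward()
print(float(loss))
```

The point of the masking step is that a fully bidirectional encoder would otherwise trivially "see" the token it must predict; by predicting only masked positions, the model can use both directions at once, unlike ELMo's two independent unidirectional LSTMs.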
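The abstract also names truncated backpropagation through time (TBPTT) as one of the training-speed optimizations. The sketch below shows only the generic truncation idea in its unidirectional form, under the same PyTorch assumption: the hidden state is carried across chunks but detached, so gradients stop at chunk boundaries. The paper's bidirectional variant and the loss are not reproduced here; the loss below is a placeholder.

```python
import torch
import torch.nn as nn

# Generic TBPTT sketch (one direction only): split a long sequence into
# chunks, carry the LSTM state forward, and detach it between chunks so
# each backward pass is truncated at the chunk boundary.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
optimizer = torch.optim.Adam(lstm.parameters())

sequence = torch.randn(4, 100, 8)    # toy batch: (batch, time, features)
chunk_len = 20
state = None                         # initial hidden/cell state

for start in range(0, sequence.size(1), chunk_len):
    chunk = sequence[:, start:start + chunk_len]
    output, state = lstm(chunk, state)
    loss = output.pow(2).mean()      # placeholder loss for illustration
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Detach so the next chunk's backward pass stops here.
    state = tuple(s.detach() for s in state)
```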
