Paper Title
Contextual Representation Learning beyond Masked Language Modeling
Paper Authors
Paper Abstract
How do masked language models (MLMs) such as BERT learn contextual representations? In this work, we analyze the learning dynamics of MLMs. We find that MLMs adopt sampled embeddings as anchors to estimate and inject contextual semantics into representations, which limits the efficiency and effectiveness of MLMs. To address these issues, we propose TACO, a simple yet effective representation learning approach that directly models global semantics. TACO extracts and aligns contextual semantics hidden in contextualized representations to encourage the model to attend to global semantics when generating contextualized representations. Experiments on the GLUE benchmark show that TACO achieves up to a 5x speedup and up to a 1.2-point average improvement over existing MLMs. The code is available at https://github.com/FUZHIYI/TACO.
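To make the abstract's "extract and align contextual semantics" idea concrete, below is a minimal illustrative sketch, not the authors' actual objective (see https://github.com/FUZHIYI/TACO for the official implementation). It assumes contextual semantics can be approximated as the gap between a token's contextualized hidden state and its static input embedding, and aligns those vectors within a sequence via a cosine-similarity loss; the function names and the decomposition itself are hypothetical.

```python
# Illustrative sketch only -- NOT the official TACO implementation.
# Assumption: "contextual semantics" of a token ~= contextualized hidden
# state minus its static input embedding.
import torch
import torch.nn.functional as F


def extract_contextual_semantics(hidden_states: torch.Tensor,
                                 static_embeddings: torch.Tensor) -> torch.Tensor:
    """Hypothetical extraction step.

    hidden_states:     (batch, seq_len, dim) contextualized representations.
    static_embeddings: (batch, seq_len, dim) input token embeddings.
    """
    return hidden_states - static_embeddings


def alignment_loss(contextual: torch.Tensor,
                   attention_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical alignment objective: push the contextual semantics of
    tokens in the same sequence toward each other (high pairwise cosine
    similarity), encouraging the encoder to capture global semantics.

    contextual:     (batch, seq_len, dim) extracted contextual semantics.
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding.
    """
    normed = F.normalize(contextual, dim=-1)            # unit-norm vectors
    sim = torch.bmm(normed, normed.transpose(1, 2))     # (batch, L, L) cosines
    pair_mask = attention_mask.unsqueeze(1) * attention_mask.unsqueeze(2)
    # Maximize mean similarity over valid token pairs => minimize its negation.
    return -(sim * pair_mask).sum() / pair_mask.sum().clamp(min=1)


# Usage with dummy tensors standing in for an MLM encoder's outputs.
batch, seq_len, dim = 2, 8, 16
hidden = torch.randn(batch, seq_len, dim)
embeds = torch.randn(batch, seq_len, dim)
mask = torch.ones(batch, seq_len)

ctx = extract_contextual_semantics(hidden, embeds)
loss = alignment_loss(ctx, mask)  # would be added to the MLM loss in pretraining
print(loss.item())
```

Under these assumptions, the alignment term acts as an auxiliary loss alongside the standard masked-prediction loss, directly supervising the global, sentence-level component of each representation rather than relying on sampled embeddings as anchors.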