Paper Title
HIT at SemEval-2022 Task 2: Pre-trained Language Model for Idioms Detection
Authors
Abstract
The same multi-word expression can carry different meanings in different sentences. These meanings fall mainly into two categories: literal and idiomatic. Non-contextual methods perform poorly on this problem, so contextual embeddings are needed to interpret the idiomatic meaning of a multi-word expression correctly. We use a pre-trained language model, which provides context-aware sentence embeddings, to detect whether a multi-word expression in a sentence is used idiomatically.
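The contrast drawn in the abstract between non-contextual and context-aware representations can be illustrated with a toy sketch. It is not the model described in the paper: a bag-of-words vector over the whole sentence stands in for the sentence embedding a pre-trained language model would provide, and a nearest-neighbour lookup stands in for the classifier. The example sentences, labels, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: (sentence, multi-word expression, label),
# where label 1 = idiomatic usage and 0 = literal usage.
TRAIN = [
    ("he spilled the beans about the surprise party", "spilled the beans", 1),
    ("she spilled the beans on the kitchen floor", "spilled the beans", 0),
    ("a joke broke the ice at the meeting", "broke the ice", 1),
    ("the ship broke the ice on the frozen lake", "broke the ice", 0),
]


def mwe_only_features(sentence, mwe):
    """Non-contextual view: the vector depends on the expression alone."""
    return Counter(mwe.split())


def sentence_features(sentence, mwe):
    """Context-aware stand-in: the vector depends on the whole sentence.

    A real system would use a pre-trained language model's sentence
    embedding here; word counts are only an illustrative assumption.
    """
    return Counter(sentence.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(sentence, mwe, featurize):
    """Return the label of the most similar training example."""
    query = featurize(sentence, mwe)
    best = max(TRAIN, key=lambda ex: cosine(query, featurize(ex[0], ex[1])))
    return best[2]


if __name__ == "__main__":
    queries = [
        ("he spilled the beans about the secret plan", "spilled the beans"),
        ("she spilled the beans on the floor", "spilled the beans"),
    ]
    for sentence, mwe in queries:
        print(sentence)
        print("  MWE only:      ", predict(sentence, mwe, mwe_only_features))
        print("  whole sentence:", predict(sentence, mwe, sentence_features))
```

With the expression-only features, both query sentences map to the same vector and therefore always receive the same label, so one of the two readings is necessarily misclassified. The sentence-level features let the surrounding words separate the idiomatic use from the literal one.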