Paper title
WinoDict: Probing language models for in-context word acquisition
Paper authors
Paper abstract
We introduce a new in-context learning paradigm to measure Large Language Models' (LLMs) ability to learn novel words during inference. In particular, we rewrite Winograd-style co-reference resolution problems by replacing the key concept word with a synthetic but plausible word that the model must understand to complete the task. Solving this task requires the model to make use of the dictionary definition of the new word given in the prompt. This benchmark addresses word acquisition, one important aspect of the diachronic degradation known to afflict LLMs. Because LLMs are frozen in time at the moment they are trained, they are normally unable to reflect the way language changes over time. We show that LLMs' accuracy on our benchmark decreases radically compared to the original Winograd tasks, thus identifying a limitation of current models and providing a benchmark to measure future improvements in LLMs' ability to do in-context learning.
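To make the benchmark construction concrete, here is a minimal sketch (not the authors' code; the function name, the synthetic word "plasticious", and the prompt wording are all illustrative assumptions) of how a WinoDict-style prompt might be built: the key concept word in a Winograd sentence is replaced by a synthetic word, and the word's dictionary definition is prepended so the model must acquire the word in context.

```python
def make_winodict_prompt(sentence: str, key_word: str,
                         synthetic_word: str, definition: str) -> str:
    """Replace `key_word` with a synthetic word and prepend its definition.

    Illustrative sketch only; the actual WinoDict pipeline and prompt
    format may differ.
    """
    rewritten = sentence.replace(key_word, synthetic_word)
    return f'The word "{synthetic_word}" means: {definition}\n{rewritten}'

# Example with a classic Winograd schema sentence; "plasticious" is an
# invented (hypothetical) word standing in for "big".
prompt = make_winodict_prompt(
    sentence="The trophy doesn't fit in the suitcase because it is too big.",
    key_word="big",
    synthetic_word="plasticious",
    definition="of considerable size or extent",
)
print(prompt)
```

The model is then asked, as in the original Winograd task, what "it" refers to; answering correctly requires using the supplied definition rather than pretrained knowledge of the replaced word.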