Paper Title
Finding patterns in Knowledge Attribution for Transformers
Paper Authors
Paper Abstract
We analyze the Knowledge Neurons framework for attributing factual and relational knowledge to particular neurons in the transformer network. We use a 12-layer multilingual BERT model for our experiments. Our study reveals various interesting phenomena. We observe that most factual knowledge can be attributed to the middle and higher layers of the network ($\ge 6$). Further analysis reveals that the middle layers ($6$-$9$) are mostly responsible for relational information, which is further refined into actual factual knowledge, or the "correct answer", in the last few layers ($10$-$12$). Our experiments also show that the model handles prompts that are phrased in different languages but represent the same fact in a similar way, providing further evidence for the effectiveness of multilingual pre-training. Applying the attribution scheme to grammatical knowledge, we find that grammatical knowledge is far more dispersed among neurons than factual knowledge.
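The Knowledge Neurons attribution the abstract refers to is based on integrated gradients over FFN activations: each neuron's score is its activation times the gradient of the answer probability, averaged over scaled versions of that activation. The sketch below illustrates the computation with a toy differentiable readout standing in for BERT's masked-language-model head; the function names, the toy model, and the finite-difference gradient are all illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def toy_prob(activations, readout, answer_idx):
    """Softmax probability of the answer token given one layer's activations.
    (Toy stand-in for P(answer | prompt) from a masked LM.)"""
    logits = readout @ activations
    exp = np.exp(logits - logits.max())
    return exp[answer_idx] / exp.sum()

def attribution_scores(activations, readout, answer_idx, steps=20):
    """Riemann-sum approximation of integrated gradients per neuron:
    attr_i ~= w_i * (1/m) * sum_{k=1..m} dP(answer | (k/m) * w) / dw_i,
    with each gradient estimated by a central finite difference."""
    scores = np.zeros_like(activations)
    eps = 1e-5
    for k in range(1, steps + 1):
        scaled = activations * (k / steps)  # baseline is the zero vector
        for i in range(len(activations)):
            up, down = scaled.copy(), scaled.copy()
            up[i] += eps
            down[i] -= eps
            grad = (toy_prob(up, readout, answer_idx)
                    - toy_prob(down, readout, answer_idx)) / (2 * eps)
            scores[i] += grad
    return activations * scores / steps

rng = np.random.default_rng(0)
acts = rng.normal(size=8)          # stand-in for one layer's FFN activations
readout = rng.normal(size=(5, 8))  # stand-in for the output vocabulary head
scores = attribution_scores(acts, readout, answer_idx=2)
top_neuron = int(np.argmax(np.abs(scores)))
```

In the actual framework the gradients come from backpropagation through the full model, and neurons whose scores exceed a threshold across many paraphrased prompts for the same fact are declared "knowledge neurons"; the layer-wise distribution of those neurons underlies the layer findings stated above.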