Paper Title

Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?

Authors

Xiang Zhou, Shiyue Zhang, Mohit Bansal

Abstract

Previous Part-Of-Speech (POS) induction models usually assume certain independence assumptions (e.g., Markov, unidirectional, local dependency) that do not hold in real languages. For example, the subject-verb agreement can be both long-term and bidirectional. To facilitate flexible dependency modeling, we propose a Masked Part-of-Speech Model (MPoSM), inspired by the recent success of Masked Language Models (MLM). MPoSM can model arbitrary tag dependency and perform POS induction through the objective of masked POS reconstruction. We achieve competitive results on both the English Penn WSJ dataset as well as the universal treebank containing 10 diverse languages. Though modeling the long-term dependency should ideally help this task, our ablation study shows mixed trends in different languages. To better understand this phenomenon, we design a novel synthetic experiment that can specifically diagnose the model's ability to learn tag agreement. Surprisingly, we find that even strong baselines fail to solve this problem consistently in a very simplified setting: the agreement between adjacent words. Nonetheless, MPoSM achieves overall better performance. Lastly, we conduct a detailed error analysis to shed light on other remaining challenges. Our code is available at https://github.com/owenzx/MPoSM
