Paper Title

Detecting Political Biases of Named Entities and Hashtags on Twitter

Authors

Zhiping Xiao, Jeffrey Zhu, Yining Wang, Pei Zhou, Wen Hong Lam, Mason A. Porter, Yizhou Sun

Abstract

Ideological divisions in the United States have become increasingly prominent in daily communication. Accordingly, there has been much research on political polarization, including many recent efforts that take a computational perspective. By detecting political biases in a corpus of text, one can attempt to describe and discern the polarity of that text. Intuitively, the named entities (i.e., the nouns and the phrases that act as nouns) and hashtags in text often carry information about political views. For example, people who use the term "pro-choice" are likely to be liberal, whereas people who use the term "pro-life" are likely to be conservative. In this paper, we seek to reveal political polarities in social-media text data and to quantify these polarities by explicitly assigning a polarity score to entities and hashtags. Although this idea is straightforward, it is difficult to perform such inference in a trustworthy quantitative way. Key challenges include the small number of known labels, the continuous spectrum of political views, and the preservation of both a polarity score and a polarity-neutral semantic meaning in an embedding vector of words. To attempt to overcome these challenges, we propose the Polarity-aware Embedding Multi-task learning (PEM) model. This model consists of (1) a self-supervised context-preservation task, (2) an attention-based tweet-level polarity-inference task, and (3) an adversarial learning task that promotes independence between an embedding's polarity dimension and its semantic dimensions. Our experimental results demonstrate that our PEM model can successfully learn polarity-aware embeddings that perform well on classification tasks. We examine a variety of applications and thereby demonstrate the effectiveness of our PEM model. We also discuss important limitations of our work and encourage caution when applying it to real-world scenarios.
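The idea of an embedding that carries one polarity dimension alongside polarity-neutral semantic dimensions, combined with attention-based tweet-level inference, can be illustrated with a small toy sketch. This is not the PEM implementation: the vector layout (first component as the polarity score), the dot-product attention with a fixed `query` vector, the sign convention (negative for liberal, positive for conservative), and all numbers are assumptions made purely for illustration. The context-preservation and adversarial tasks, and any training, are omitted.

```python
import math

def split_embedding(vec):
    """Split a toy embedding into (polarity score, semantic dimensions).
    Assumption: the first component holds the polarity score."""
    return vec[0], vec[1:]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tweet_polarity(token_vecs, query):
    """Attention-weighted average of token polarity scores.
    Attention logits come only from the semantic dimensions (hypothetical choice)."""
    polarities, logits = [], []
    for vec in token_vecs:
        pol, sem = split_embedding(vec)
        polarities.append(pol)
        logits.append(sum(q * s for q, s in zip(query, sem)))
    weights = softmax(logits)
    return sum(w * p for w, p in zip(weights, polarities))

# Hand-picked toy embeddings: [polarity, semantic_1, semantic_2]
toy = {
    "#prochoice": [-0.9, 1.0, 0.2],
    "#prolife": [0.8, 1.0, 0.1],
    "today": [0.0, 0.0, 0.0],
}
query = [2.0, 0.0]  # hypothetical attention query over the semantic dimensions

score = tweet_polarity([toy["#prochoice"], toy["today"]], query)
print(score)  # negative under the assumed sign convention
```

In this sketch the neutral token "today" receives a smaller attention weight than the hashtag, so the tweet-level score stays close to the hashtag's own polarity score.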
