Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

Paper title

Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

Authors

Yang Trista Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, Linda Zou

Abstract

NLP models trained on text have been shown to reproduce human stereotypes, which can magnify harms to marginalized groups when systems are deployed at scale. We adapt the Agency-Belief-Communion (ABC) stereotype model of Koch et al. (2016) from social psychology as a framework for the systematic study and discovery of stereotypic group-trait associations in language models (LMs). We introduce the sensitivity test (SeT) for measuring stereotypical associations from language models. To evaluate SeT and other measures using the ABC model, we collect group-trait judgments from U.S.-based subjects to compare with English LM stereotypes. Finally, we extend this framework to measure LM stereotyping of intersectional identities.
