Paper Title

Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning

Authors

Haochen Liu, Wentao Wang, Yiqi Wang, Hui Liu, Zitao Liu, Jiliang Tang

Abstract

Dialogue systems play an increasingly important role in various aspects of our daily life. It is evident from recent research that dialogue systems trained on human conversation data are biased. In particular, they can produce responses that reflect people's gender prejudice. Many debiasing methods have been developed for various NLP tasks, such as word embedding. However, they are not directly applicable to dialogue systems because they are likely to force dialogue models to generate similar responses for different genders. This greatly degrades the diversity of the generated responses and immensely hurts the performance of the dialogue models. In this paper, we propose a novel adversarial learning framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. Extensive experiments on two real-world conversation datasets show that our framework significantly reduces gender bias in dialogue models while maintaining the response quality. The implementation of the proposed framework is released.
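The abstract's core idea, training a model while an adversary tries to recover gender from its internal representation and reversing that gradient into the model, can be sketched in a toy form. The snippet below is a minimal illustration under assumed names (`W_enc`, `w_adv`, a linear encoder, synthetic data); it is not the paper's actual Debiased-Chat implementation, only the generic gradient-reversal mechanic.

```python
import numpy as np

# Toy sketch of adversarial debiasing (illustrative only; NOT the paper's
# Debiased-Chat architecture). An adversary predicts a protected attribute
# (gender) from the encoder's representation; the encoder is updated with
# the REVERSED gradient so the representation stops carrying that signal.

rng = np.random.default_rng(0)

def sigmoid(z):
    # Clip logits to keep np.exp from overflowing in this toy loop.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

# Synthetic data: input dimension 0 is correlated with "gender".
X = rng.normal(size=(64, 4))
gender = (X[:, 0] > 0).astype(float)

W_enc = rng.normal(scale=0.1, size=(4, 3))  # encoder weights (assumed name)
w_adv = rng.normal(scale=0.1, size=3)       # adversary weights (assumed name)
lr, lam = 0.05, 1.0                         # lam scales the reversed gradient

for _ in range(100):
    H = X @ W_enc                  # representations, shape (64, 3)
    p = sigmoid(H @ w_adv)         # adversary's gender prediction
    err = p - gender               # d(binary cross-entropy)/d(logit)
    # Gradient of the adversary's loss w.r.t. the encoder weights.
    grad_enc = X.T @ (err[:, None] * w_adv[None, :]) / len(gender)
    # Adversary descends on its own loss ...
    w_adv -= lr * (H.T @ err) / len(gender)
    # ... while the encoder ASCENDS on it (gradient reversal).
    W_enc += lr * lam * grad_enc

# How well the adversary can still recover gender after training.
acc = float(((sigmoid((X @ W_enc) @ w_adv) > 0.5) == (gender > 0.5)).mean())
```

The sign flip on `grad_enc` is the whole trick: the same loss that trains the adversary is maximized by the encoder, pushing gender-predictive directions out of the representation, which is how this family of methods avoids simply forcing identical responses for different genders.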
