Paper Title
Text to Image Generation: Leaving no Language Behind
Paper Authors
Paper Abstract
One of the latest applications of Artificial Intelligence (AI) is generating images from natural language descriptions. These generators are now becoming available and achieve impressive results that have been used, for example, on the front covers of magazines. Because the input to the generators is natural language text, a question that arises immediately is how these models behave when the input is written in different languages. In this paper, we perform an initial exploration of how the performance of three popular text-to-image generators depends on the language. The results show a significant performance degradation when using languages other than English, especially for languages that are not widely used. This observation leads us to discuss alternatives for improving text-to-image generators so that performance is consistent across languages. This is fundamental to ensuring that this new technology can be used by non-native English speakers and to preserving linguistic diversity.