Paper Title
UPB at SemEval-2020 Task 8: Joint Textual and Visual Modeling in a Multi-Task Learning Architecture for Memotion Analysis
Paper Authors
Paper Abstract
Users in online environments find different ways of expressing their thoughts, opinions, or sense of amusement. Internet memes were created specifically for these situations. Their main purpose is to transmit ideas through combinations of images and text that induce a certain state in the receiver, depending on the message the meme is meant to convey. These posts can relate to various situations or events, thus adding a humorous side to almost any circumstance. In this paper, we describe the system developed by our team for SemEval-2020 Task 8: Memotion Analysis. More specifically, we introduce a novel system for analyzing these posts: a multimodal multi-task learning architecture that combines ALBERT for text encoding with VGG-16 for image representation. In this manner, we show that the information behind them can be properly revealed. Our approach achieves good performance on each of the three subtasks of the competition, ranking 11th for Subtask A (0.3453 macro F1-score), 1st for Subtask B (0.5183 macro F1-score), and 3rd for Subtask C (0.3171 macro F1-score), while exceeding the official baseline results by large margins.
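To make the described architecture concrete, the following is a minimal sketch (not the authors' released code) of a multimodal multi-task model that fuses ALBERT text features with VGG-16 image features and attaches one classification head per Memotion subtask. The checkpoint names (albert-base-v2, ImageNet-pretrained VGG-16), the fusion layer size, the head dimensions, and the class name MultimodalMemotionModel are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: ALBERT (text) + VGG-16 (image) fused into a shared layer,
# with separate heads for the three Memotion subtasks. Head sizes are
# assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision import models
from transformers import AlbertModel, AlbertTokenizer


class MultimodalMemotionModel(nn.Module):
    def __init__(self, n_sentiment=3, n_humor_types=4, n_intensity=4):
        super().__init__()
        # Text branch: ALBERT encoder (768-d pooled output for albert-base-v2).
        self.text_encoder = AlbertModel.from_pretrained("albert-base-v2")
        # Image branch: VGG-16 with the final classification layer removed,
        # yielding a 4096-d feature vector per image.
        vgg = models.vgg16(pretrained=True)
        vgg.classifier = nn.Sequential(*list(vgg.classifier.children())[:-1])
        self.image_encoder = vgg
        # Shared fusion layer over the concatenated text + image features.
        self.fusion = nn.Sequential(
            nn.Linear(768 + 4096, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
        )
        # One head per subtask (multi-task learning): sentiment polarity (A),
        # humor categories (B, multi-label), humor intensity per category (C).
        self.head_a = nn.Linear(512, n_sentiment)
        self.head_b = nn.Linear(512, n_humor_types)
        self.head_c = nn.Linear(512, n_humor_types * n_intensity)

    def forward(self, input_ids, attention_mask, images):
        text_feat = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).pooler_output                          # (batch, 768)
        image_feat = self.image_encoder(images)  # (batch, 4096)
        shared = self.fusion(torch.cat([text_feat, image_feat], dim=-1))
        return self.head_a(shared), self.head_b(shared), self.head_c(shared)


if __name__ == "__main__":
    tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
    model = MultimodalMemotionModel()
    batch = tokenizer(["an example meme caption"], return_tensors="pt",
                      padding=True, truncation=True)
    images = torch.randn(1, 3, 224, 224)  # placeholder for a meme image
    logits_a, logits_b, logits_c = model(batch["input_ids"],
                                         batch["attention_mask"], images)
    print(logits_a.shape, logits_b.shape, logits_c.shape)
```

In such a setup, the three heads would typically be trained jointly by summing a per-subtask loss (e.g., cross-entropy for Subtask A and binary cross-entropy for the multi-label Subtask B), which is the general idea behind multi-task learning; the specific loss weighting used by the authors is not specified here.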