Paper Title
UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA
Paper Authors
Paper Abstract
Question Answering (QA) systems are increasingly deployed in applications where they support real-world decisions. However, state-of-the-art models rely on deep neural networks, which are difficult for humans to interpret. Inherently interpretable models or post hoc explainability methods can help users comprehend how a model arrives at its prediction and, if successful, increase their trust in the system. Furthermore, researchers can leverage these insights to develop new methods that are more accurate and less biased. In this paper, we introduce SQuARE v2, the new version of SQuARE, which provides an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations. While saliency maps are useful for inspecting the importance of each input token for the model's prediction, graph-based explanations from external Knowledge Graphs enable users to verify the reasoning behind the model's prediction. In addition, we provide multiple adversarial attacks for comparing the robustness of QA models. With these explainability methods and adversarial attacks, we aim to ease research on trustworthy QA models. SQuARE is available at https://square.ukp-lab.de.
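The saliency maps mentioned in the abstract attribute a model's prediction to individual input tokens. A common family of such methods is gradient-times-input attribution; the minimal sketch below illustrates the idea on a toy linear scorer (pure Python, no ML framework). The function name and the linear model are illustrative assumptions, not the paper's actual implementation, which operates on neural QA models.

```python
def saliency_scores(embeddings, weights):
    """Gradient-times-input saliency for a toy linear scorer.

    The toy model scores an input as score = sum over tokens of
    (embedding . weights). The gradient of the score with respect to
    each token embedding is simply `weights`, so the per-token
    gradient-times-input saliency is the sum over embedding
    dimensions of |e[d] * w[d]|. Higher values mark tokens that
    contribute more to the prediction.
    """
    return [
        sum(abs(e_d * w_d) for e_d, w_d in zip(token_emb, weights))
        for token_emb in embeddings
    ]

# Toy example: 3 tokens with 4-dimensional embeddings.
emb = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
w = [0.5, 0.5, 1.0, 1.0]
scores = saliency_scores(emb, w)  # one saliency value per token
```

For real QA models the gradient is obtained by backpropagation through the network rather than in closed form, but the per-token aggregation step is the same.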