Paper Title

Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions

Paper Authors

Utkarsh Desai, Srikanth Tamilselvam, Jassimran Kaur, Senthil Mani, Shreya Khare

Paper Abstract

Text classification models, especially neural network-based models, have reached very high accuracy on many popular benchmark datasets. Yet, when deployed in real-world applications, such models tend to perform poorly. The primary reason is that these models are not tested against enough real-world natural data. Depending on the application's users, the vocabulary and style of the model's input may vary greatly. This emphasizes the need for a model-agnostic test dataset consisting of the various corruptions that naturally appear in the wild. Models trained and tested on such benchmark datasets will be more robust to real-world data. However, such datasets are not readily available. In this work, we address this problem by extending benchmark datasets along naturally occurring corruptions such as Spelling Errors, Text Noise, and Synonyms, and making them publicly available. Through extensive experiments, we compare random and targeted corruption strategies using Local Interpretable Model-Agnostic Explanations (LIME). We report the vulnerabilities of two popular text classification models along these corruptions, and find that in most cases targeted corruptions expose a model's vulnerabilities better than random choices.
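
The contrast between the two corruption strategies described in the abstract is easy to picture in code. Below is a minimal, hypothetical Python sketch (not the authors' released tooling): the random strategy corrupts a uniformly chosen word, while the targeted strategy first uses LIME's LimeTextExplainer to identify the most influential word. Here `predict_proba` is a stand-in for any classifier function that maps a list of texts to class-probability rows, and `typo` is an assumed stand-in for the paper's corruption operations (spelling errors, noise, synonyms).

```python
# A minimal sketch of random vs. LIME-targeted word corruption.
# Assumptions: `predict_proba` is any callable mapping a list of strings
# to an array of class probabilities; `typo` stands in for one of the
# paper's corruption types (spelling error, text noise, synonym swap).
import random

from lime.lime_text import LimeTextExplainer


def typo(word):
    """Simulate a spelling error by swapping two adjacent characters."""
    if len(word) < 2:
        return word
    i = random.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def random_corruption(text):
    """Random strategy: corrupt a uniformly chosen word."""
    words = text.split()
    i = random.randrange(len(words))
    words[i] = typo(words[i])
    return " ".join(words)


def targeted_corruption(text, predict_proba, class_names):
    """Targeted strategy: corrupt the word LIME ranks as most influential."""
    explainer = LimeTextExplainer(class_names=class_names)
    exp = explainer.explain_instance(text, predict_proba, num_features=1)
    target, _weight = exp.as_list()[0]  # the single highest-weight word
    return " ".join(typo(w) if w == target else w for w in text.split())
```

Note that the targeted variant is considerably more expensive: each LIME explanation queries the classifier on thousands of perturbed samples by default, whereas the random strategy needs no model access at all.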
