On Generalisability of Machine Learning-based Network Intrusion Detection Systems

Paper Title

On Generalisability of Machine Learning-based Network Intrusion Detection Systems

Authors

Siamak Layeghy, Marius Portmann

Abstract

Many of the proposed machine learning (ML) based network intrusion detection systems (NIDSs) achieve near-perfect detection performance when evaluated on synthetic benchmark datasets. However, there is no record of whether and how these results generalise to other network scenarios, in particular to real-world networks. In this paper, we investigate the generalisability of ML-based NIDSs by extensively evaluating seven supervised and unsupervised learning models on four recently published benchmark NIDS datasets. Our investigation indicates that none of the considered models is able to generalise over all studied datasets. Interestingly, our results also indicate that generalisability has a high degree of asymmetry, i.e., swapping the source and target domains can significantly change the classification performance. Our investigation also indicates that, overall, unsupervised learning methods generalise better than supervised learning models in our considered scenarios. Using SHAP values to explain these results indicates that the lack of generalisability is mainly due to a strong correspondence between the values of one or more features and the Attack/Benign classes in one dataset-model combination, and the absence of that correspondence in other datasets that have different feature distributions.
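
The cross-dataset protocol described in the abstract (train on a source dataset, test on a different target dataset, then swap the two) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two synthetic one-feature "datasets" A and B, their distribution parameters, and the midpoint-threshold classifier standing in for a supervised model. It only shows how a feature-to-class correspondence learned in one domain can fail in another, and how the failure can be asymmetric.

```python
import random


def make_dataset(seed, benign, attack, n=500):
    """Build a synthetic one-feature dataset of (value, label) rows.

    `benign` and `attack` are hypothetical (mean, std) pairs; label 1 = attack.
    """
    rng = random.Random(seed)
    rows = [(rng.gauss(*benign), 0) for _ in range(n)]
    rows += [(rng.gauss(*attack), 1) for _ in range(n)]
    return rows


def fit_threshold(rows):
    """Toy supervised model: the midpoint between the two class means."""
    benign_vals = [x for x, y in rows if y == 0]
    attack_vals = [x for x, y in rows if y == 1]
    benign_mean = sum(benign_vals) / len(benign_vals)
    attack_mean = sum(attack_vals) / len(attack_vals)
    return (benign_mean + attack_mean) / 2


def accuracy(threshold, rows):
    """Fraction of rows where 'value above threshold' matches the attack label."""
    correct = sum(int(x > threshold) == y for x, y in rows)
    return correct / len(rows)


# Assumed feature distributions: in B the same feature is shifted and narrower.
params = {
    "A": {"benign": (0.0, 1.0), "attack": (10.0, 1.0)},
    "B": {"benign": (6.0, 0.5), "attack": (9.0, 0.5)},
}

# Separate train and test samples per domain (different seeds).
train = {name: make_dataset(i, **p) for i, (name, p) in enumerate(params.items())}
test = {name: make_dataset(100 + i, **p) for i, (name, p) in enumerate(params.items())}

# results[(source, target)] = accuracy of the model trained on source, tested on target.
results = {}
for source in params:
    model = fit_threshold(train[source])
    for target in params:
        results[(source, target)] = accuracy(model, test[target])

for (source, target), acc in sorted(results.items()):
    print(f"train on {source}, test on {target}: accuracy = {acc:.3f}")
```

With these assumed parameters, both in-domain accuracies should be high, the model trained on A should fall to roughly chance level on B (its threshold sits below most of B's benign traffic), and the model trained on B should still separate A well. That mirrors, in miniature, the asymmetry the abstract reports; it says nothing about the actual models or datasets used in the paper.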
