Paper Title
Machine Learning Based IoT Intrusion Detection System: An MQTT Case Study (MQTT-IoT-IDS2020 Dataset)
Authors
Abstract
The Internet of Things (IoT) is one of the main research fields in the cybersecurity domain. This is due to (a) the increased dependency on automated devices, and (b) the inadequacy of general-purpose Intrusion Detection Systems (IDS) when deployed in special-purpose networks. Numerous lightweight protocols are being proposed for communication between IoT devices. One of the notable IoT machine-to-machine communication protocols is the Message Queuing Telemetry Transport (MQTT) protocol. However, to the best of the authors' knowledge, there are no available IDS datasets that include MQTT benign or attack instances, and thus no IDS experimental results are available. In this paper, the effectiveness of six Machine Learning (ML) techniques in detecting MQTT-based attacks is evaluated. Three abstraction levels of features are assessed, namely packet-based, unidirectional flow, and bidirectional flow features. A simulated MQTT dataset is generated and used for the training and evaluation processes. The dataset is released under an open access licence to help the research community further analyse the accompanying challenges. The experimental results demonstrate that the proposed ML models are adequate for the IDS requirements of MQTT-based networks. Moreover, the results emphasise the importance of using flow-based features to discriminate MQTT-based attacks from benign traffic, while packet-based features are sufficient for traditional networking attacks.
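The three feature abstraction levels named in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical `Packet` record and two hypothetical per-flow features (packet count and byte total); these are chosen for illustration only and are not the feature set of the MQTT-IoT-IDS2020 dataset. A unidirectional flow groups packets travelling in one direction between two endpoints, while a bidirectional flow merges both directions of the same conversation.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; the fields are illustrative assumptions."""
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    size: int  # bytes


def uni_key(p: Packet):
    # Unidirectional flow: direction matters, so A->B and B->A differ.
    return (p.src, p.sport, p.dst, p.dport, p.proto)


def bi_key(p: Packet):
    # Bidirectional flow: order the two endpoints so both directions
    # of a conversation map to the same key.
    a, b = (p.src, p.sport), (p.dst, p.dport)
    return (min(a, b), max(a, b), p.proto)


def aggregate(packets, key_fn):
    # Sum simple per-flow statistics (assumed features, not the paper's).
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        stats = flows[key_fn(p)]
        stats["packets"] += 1
        stats["bytes"] += p.size
    return dict(flows)


# Packet-level view: each record is one sample. Port 1883 is the
# standard unencrypted MQTT port; addresses and sizes are made up.
packets = [
    Packet("10.0.0.5", 50000, "10.0.0.1", 1883, "tcp", 120),  # client -> broker
    Packet("10.0.0.1", 1883, "10.0.0.5", 50000, "tcp", 60),   # broker -> client
    Packet("10.0.0.5", 50000, "10.0.0.1", 1883, "tcp", 200),  # client -> broker
]

uni_flows = aggregate(packets, uni_key)  # two flows, one per direction
bi_flows = aggregate(packets, bi_key)    # one flow covering the conversation
```

In this toy example the three packets yield three packet-level samples, two unidirectional flow samples, and one bidirectional flow sample, which shows how each level summarises the same traffic at a coarser granularity.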