Symbolic Analysis Meets Federated Learning to Enhance Malware Identifier

Paper Title

Symbolic analysis meets federated learning to enhance malware identifier

Authors

Dam, Khanh Huu The; Van Ouytsel, Charles-Henry Bertrand; Legay, Axel

Abstract

In recent years, manually writing detection rules has become impractical for anti-malware products because the number of malware threats keeps growing. Turning to machine learning is therefore a promising way to make malware recognition more efficient. Traditional centralized machine learning requires a large amount of data to train a model that performs well. To improve malware detection, the training data may reside in various kinds of data sources, such as host-, network- and cloud-based anti-malware components, or even in different enterprises. To avoid the expense of data collection as well as the leakage of private data, we present a federated learning system that identifies malware through behavioural graphs, i.e., system call dependency graphs. It is based on a deep learning model comprising a graph autoencoder and a multi-classifier module. This model is trained through a secure learning protocol among clients to protect private data against inference attacks. Using the model to identify malware, we achieve an accuracy of 85% on homogeneous graph data and 93% on inhomogeneous graph data.
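The abstract does not say how client models are combined, nor how the secure learning protocol works. As a rough illustration of the general idea behind federated training only, the sketch below shows plain federated averaging: each client trains locally and shares only its model weights, which a coordinator averages in proportion to local sample counts. The function name `federated_average`, the weight layout, and the toy numbers are all hypothetical, and the sketch includes no protection against inference attacks, so it should not be read as the authors' protocol.

```python
from typing import Dict, List, Tuple

# Hypothetical weight layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its sample count.

    This is generic federated averaging, assumed here for illustration;
    it is not the secure protocol described in the paper.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("need at least one training sample across clients")

    averaged: Weights = {}
    for key in updates[0][0]:
        acc = [0.0] * len(updates[0][0][key])
        for weights, n in updates:
            for i, value in enumerate(weights[key]):
                acc[i] += value * n / total
        averaged[key] = acc
    return averaged


# Toy example: two clients holding 1 and 3 local samples respectively.
clients = [
    ({"encoder": [0.0, 2.0], "classifier": [1.0]}, 1),
    ({"encoder": [4.0, 2.0], "classifier": [5.0]}, 3),
]
global_weights = federated_average(clients)
print(global_weights)
```

Only the weights cross the client boundary here, and the raw graphs stay local. A real deployment would still need to mask or encrypt the shared updates, which is the part the secure protocol in the paper is meant to address.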
