Paper title
Symbolic analysis meets federated learning to enhance malware identifier
Paper authors
Paper abstract
In recent years, manually writing detection rules for anti-malware products has become impractical because the number of malware threats keeps growing. Turning to machine learning is therefore a promising way to make malware recognition more efficient. Traditional centralized machine learning requires a large amount of data to train a high-performing model. To improve malware detection, the training data may come from various sources, such as host-based, network-based, and cloud-based anti-malware components, or even from different enterprises. To avoid the expense of data collection as well as the leakage of private data, we present a federated learning system that identifies malware through behavioural graphs, i.e., system call dependency graphs. It is based on a deep learning model comprising a graph autoencoder and a multi-classifier module. The model is trained through a secure learning protocol among clients to protect private data against inference attacks. Using the model to identify malware, we achieve an accuracy of 85% on homogeneous graph data and 93% on inhomogeneous graph data.