Paper Title

Not All Datasets Are Born Equal: On Heterogeneous Data and Adversarial Examples

Paper Authors

Yael Mathov, Eden Levy, Ziv Katzir, Asaf Shabtai, Yuval Elovici

Paper Abstract

Recent work on adversarial learning has focused mainly on neural networks and domains where those networks excel, such as computer vision or audio processing. The data in these domains is typically homogeneous, whereas heterogeneous tabular data domains remain underexplored despite their prevalence. When searching for adversarial patterns within heterogeneous input spaces, an attacker must simultaneously preserve the complex domain-specific validity rules of the data, as well as the adversarial nature of the identified samples. As such, applying adversarial manipulations to heterogeneous datasets has proved to be a challenging task, and no generic attack method has been suggested thus far. We, however, argue that machine learning models trained on heterogeneous tabular data are as susceptible to adversarial manipulations as those trained on continuous or homogeneous data such as images. To support our claim, we introduce a generic optimization framework for identifying adversarial perturbations in heterogeneous input spaces. We define distribution-aware constraints for preserving the consistency of the adversarial examples and incorporate them by embedding the heterogeneous input into a continuous latent space. Due to the nature of the underlying datasets, we focus on $\ell_0$ perturbations and demonstrate their applicability in real life. We demonstrate the effectiveness of our approach using three datasets from different content domains. Our results show that despite the constraints imposed on input validity in heterogeneous datasets, machine learning models trained using such data are still equally susceptible to adversarial examples.
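At a high level, the abstract describes embedding a heterogeneous sample into a continuous latent space, searching for an adversarial perturbation there, and mapping the result back while changing only a few features so the perturbation stays $\ell_0$-sparse. The sketch below is a minimal, hypothetical illustration of that loop, not the paper's implementation: `encoder`, `decoder`, `classifier`, and the `mutable_idx` feature set are assumed placeholders, and the paper's distribution-aware validity constraints are reduced here to restricting which features may change.

```python
# Hypothetical sketch of a latent-space l0-constrained attack on tabular data.
# `encoder`, `decoder`, and `classifier` are assumed pre-trained modules;
# none of this code comes from the paper itself.
import torch
import torch.nn.functional as F

def latent_l0_attack(x, y, encoder, decoder, classifier,
                     mutable_idx, steps=200, lr=0.05):
    """Perturb the latent code of x so the decoded sample moves away from
    label y, then copy back only a small set of mutable features to keep
    the perturbation l0-sparse and the remaining features untouched."""
    z = encoder(x).detach().requires_grad_(True)   # continuous latent code
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = decoder(z)                         # map back to input space
        loss = -F.cross_entropy(classifier(x_hat), y)  # maximize loss on y
        loss.backward()
        opt.step()
    x_adv = x.clone()
    # l0 constraint: only the designated mutable features may change
    x_adv[:, mutable_idx] = decoder(z).detach()[:, mutable_idx]
    return x_adv
```

In a faithful implementation, the copied-back features would additionally be checked against the domain's validity rules (e.g., categorical features snapped to legal values) before the sample is considered a usable adversarial example.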
