Paper Title

Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection

Authors

Yuefei Lyu, Xiaoyu Yang, Jiaxin Liu, Philip S. Yu, Sihong Xie, Xi Zhang

Abstract

Social networks are frequently polluted by rumors, which can be detected by advanced models such as graph neural networks. However, these models are vulnerable to attacks, and understanding their vulnerabilities is critical for rumor detection in practice. To discover subtle vulnerabilities, we design a powerful reinforcement-learning-based attacking algorithm to camouflage rumors in social networks, which can interact with and attack any black-box detector. The environment has exponentially large state spaces, high-order graph dependencies, and delayed noisy rewards, making it difficult for state-of-the-art end-to-end approaches to learn features, due to the large learning cost and the expressive limitations of deep graph models. Instead, we design domain-specific features to avoid feature learning and to produce interpretable attack policies. To further speed up policy optimization, we devise: (i) a credit assignment method that decomposes delayed rewards to atomic attacking actions in proportion to their camouflage effects on target rumors; (ii) a time-dependent control variate to reduce reward variance due to large graphs and many attacking steps, supported by an analysis of the reward variance and a Bayesian analysis of the prediction distribution. On three real-world rumor detection datasets, we demonstrate: (i) the effectiveness of the learned attacking policy compared to rule-based attacks and current end-to-end approaches; (ii) the usefulness of the proposed credit assignment strategy and variance reduction components; (iii) the interpretability of the policy when generating strong attacks, via a case study.
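The two speed-up components named in the abstract — decomposing a delayed episode reward across atomic actions, and subtracting a time-dependent baseline (control variate) from the per-step reward in a policy-gradient update — can be sketched generically. The functions, the two-action softmax policy, and the per-step "camouflage effect" weights below are illustrative assumptions for exposition, not the paper's actual implementation:

```python
import numpy as np

def decompose_reward(final_reward, step_effects):
    """Credit assignment: split a single delayed episode reward across
    the atomic attacking actions, in proportion to each action's
    (assumed, precomputed) camouflage effect on the target rumor."""
    effects = np.asarray(step_effects, dtype=float)
    total = effects.sum()
    if total == 0:  # no measurable effect: fall back to an even split
        return np.full(len(effects), final_reward / len(effects))
    return final_reward * effects / total

def reinforce_update(theta, states, actions, step_rewards, baselines, lr=0.1):
    """One REINFORCE-style update for a softmax policy over two actions,
    with a time-dependent baseline b_t acting as a control variate:
    grad ~ sum_t (r_t - b_t) * grad log pi(a_t | s_t)."""
    for s, a, r, b in zip(states, actions, step_rewards, baselines):
        logits = theta @ s                    # action preferences, shape (2,)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        grad_log = -np.outer(probs, s)        # d log pi / d theta, all rows
        grad_log[a] += s                      # indicator term for chosen action
        theta += lr * (r - b) * grad_log      # variance-reduced gradient step
    return theta
```

Subtracting a baseline that depends only on the time step leaves the gradient estimate unbiased while shrinking its variance, which matters here because each episode spans many attacking steps on a large graph.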
