使用GNN探索Horn条款的表示（扩展技术报告）

论文标题

使用GNN探索Horn条款的表示（扩展技术报告）

Exploring Representation of Horn Clauses using GNNs (Extended Technical Report)

论文作者

Liang, Chencheng, Rümmer, Philipp, Brockschmidt, Marc

论文摘要

由于现实世界编程语言语法的复杂性，因此来自原始源代码的学习程序语义是具有挑战性的，并且由于难以重建长距离的关系信息在程序中使用标识符暗示表示。在解决第一点时，我们将约束的Horn条款（CHC）视为程序验证问题的标准表示，提供了一种简单而编程的语言语法。对于第二个挑战，我们探索了CHC的图表表示，并提出了一个新的关系超图神经网络（R-HYGNN）体系结构来学习程序功能。我们介绍了CHC的两个不同的图表。一个称为约束图（CG），并通过将符号及其关系分别（分别为键入节点和二进制边缘）转换为CHC的句法信息，并将其构造为抽象语法树的约束。第二个称为控制和数据流超图（CDHG），并通过表示通过三元超过的控制和数据流来强调CHC的语义信息。然后，我们提出了一种新的GNN体系结构R-HYGNN，扩展了关系图卷积网络，以处理超图。为了评估R-HYGNN从程序中提取语义信息的能力，我们使用R-HYGNN在两个图表表示上训练模型，以及在五个具有越来越难度的代理任务上使用CHC-Comp 2021的基准作为培训数据。最困难的代理任务要求该模型预测反例中的条款的发生，这是CHC的满意度。 CDHG在此任务中达到90.59％的精度。此外，R-HYGNN对由290多个条款组成的图表具有完美的预测。总体而言，我们的实验表明R-HYGNN可以捕获用于指导验证问题的复杂程序功能。

Learning program semantics from raw source code is challenging due to the complexity of real-world programming language syntax and due to the difficulty of reconstructing long-distance relational information implicitly represented in programs using identifiers. Addressing the first point, we consider Constrained Horn Clauses (CHCs) as a standard representation of program verification problems, providing a simple and programming language-independent syntax. For the second challenge, we explore graph representations of CHCs, and propose a new Relational Hypergraph Neural Network (R-HyGNN) architecture to learn program features. We introduce two different graph representations of CHCs. One is called constraint graph (CG), and emphasizes syntactic information of CHCs by translating the symbols and their relations in CHCs as typed nodes and binary edges, respectively, and constructing the constraints as abstract syntax trees. The second one is called control- and data-flow hypergraph (CDHG), and emphasizes semantic information of CHCs by representing the control and data flow through ternary hyperedges. We then propose a new GNN architecture, R-HyGNN, extending Relational Graph Convolutional Networks, to handle hypergraphs. To evaluate the ability of R-HyGNN to extract semantic information from programs, we use R-HyGNNs to train models on the two graph representations, and on five proxy tasks with increasing difficulty, using benchmarks from CHC-COMP 2021 as training data. The most difficult proxy task requires the model to predict the occurrence of clauses in counter-examples, which subsumes satisfiability of CHCs. CDHG achieves 90.59% accuracy in this task. Furthermore, R-HyGNN has perfect predictions on one of the graphs consisting of more than 290 clauses. Overall, our experiments indicate that R-HyGNN can capture intricate program features for guiding verification problems.

下载PDF全文

下载文献需遵守相关版权规定

论文标题