Paper Title
Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations
Paper Authors
Paper Abstract
Self-supervision has recently been surging at its new frontier of graph learning. It facilitates graph representations beneficial to downstream tasks, but its success can hinge on handcrafted domain knowledge or often-expensive trial and error. Even its state-of-the-art representative, graph contrastive learning (GraphCL), is not completely free of those needs, as GraphCL uses a prefabricated prior reflected in the ad-hoc manual selection of graph data augmentations. Our work aims at advancing GraphCL by answering the following questions: How can the space of augmented graph views be represented? What principle can be relied upon to learn a prior in that space? And what framework can learn that prior in tandem with contrastive learning? Accordingly, we extend the prefabricated discrete prior over the augmentation set to a learnable continuous prior in the parameter space of graph generators, assuming that graph priors per se, analogous to the concept of image manifolds, can be learned by data generation. Furthermore, to form contrastive views without collapsing to trivial solutions once the prior becomes learnable, we leverage both the information minimization (InfoMin) and information bottleneck (InfoBN) principles to regularize the learned prior. Eventually, contrastive learning, InfoMin, and InfoBN are organically incorporated into one bi-level optimization framework. Our principled and automated approach proves competitive with state-of-the-art graph self-supervision methods, including GraphCL, on benchmarks of small graphs, and shows even better generalizability on large-scale graphs, without resorting to human expertise or downstream validation. Our code is publicly released at https://github.com/Shen-Lab/GraphCL_Automated.
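
For concreteness, here is a minimal PyTorch sketch of the NT-Xent contrastive objective that GraphCL-style methods optimize between two augmented views of the same graphs. This is a simplified variant that draws negatives only from the other view, not the released implementation:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """Simplified NT-Xent contrastive loss.

    z1, z2: (batch, dim) embeddings of two augmented views of the same graphs.
    (z1[i], z2[i]) are positive pairs; the remaining rows of the other view
    serve as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature              # pairwise cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    # Symmetrize so that both views act as anchors.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```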
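And below is a schematic of the bi-level loop the abstract describes, reusing `nt_xent` from the sketch above: the lower level trains the encoder contrastively, while the upper level updates the view generator (whose parameters form the learned continuous prior) adversarially in the InfoMin spirit. All names here (`encoder`, `generator`, `sample_view`) are illustrative stand-ins, not the paper's API; the actual method generates graphs via graph generators, whereas a feature matrix keeps this sketch self-contained.

```python
import torch
import torch.nn as nn

# Toy stand-ins: `generator` plays the role of the learnable view generator,
# `encoder` the contrastively trained graph encoder. Real inputs would be
# graphs processed by a GNN.
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
generator = nn.Linear(32, 32)
enc_opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
gen_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

def sample_view(x):
    # A stochastic view: learnable transform plus noise, standing in for
    # sampling from the generator's distribution over augmented graphs.
    return generator(x) + 0.1 * torch.randn_like(x)

x = torch.randn(128, 32)  # toy batch standing in for pooled graph features
for step in range(100):
    # Lower level: contrastive update of the encoder on two sampled views.
    loss_enc = nt_xent(encoder(sample_view(x)), encoder(sample_view(x)))
    enc_opt.zero_grad()
    loss_enc.backward()
    enc_opt.step()

    # Upper level: adversarial (InfoMin-style) update of the generator, pushed
    # to *increase* the contrastive loss, i.e. to reduce the mutual information
    # shared by the two views. An InfoBN-style compression term would be added
    # here as a further regularizer (omitted for brevity).
    loss_gen = -nt_xent(encoder(sample_view(x)), encoder(sample_view(x)))
    gen_opt.zero_grad()
    loss_gen.backward()
    gen_opt.step()
```

The alternation is what makes the optimization bi-level: the generator's objective is evaluated through an encoder that is itself being fit to the generator's current views, so neither can collapse to a trivial solution without being penalized by the other.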