Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures

Paper Title

Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures

Authors

Pedro Hermosilla, Marco Schäfer, Matěj Lang, Gloria Fackelmann, Pere Pau Vázquez, Barbora Kozlíková, Michael Krone, Tobias Ritschel, Timo Ropinski

Abstract

Proteins perform a large variety of functions in living organisms, thus playing a key role in biology. As of now, available learning algorithms to process protein data do not consider several particularities of such data and/or do not scale well for large protein conformations. To fill this gap, we propose two new learning operations enabling deep 3D analysis of large-scale protein data. First, we introduce a novel convolution operator which considers both, the intrinsic (invariant under protein folding) as well as extrinsic (invariant under bonding) structure, by using $n$-D convolutions defined on both the Euclidean distance, as well as multiple geodesic distances between atoms in a multi-graph. Second, we enable a multi-scale protein analysis by introducing hierarchical pooling operators, exploiting the fact that proteins are a recombination of a finite set of amino acids, which can be pooled using shared pooling matrices. Lastly, we evaluate the accuracy of our algorithms on several large-scale data sets for common protein analysis tasks, where we outperform state-of-the-art methods.
