Paper Title
An Efficient MCMC Approach to Energy Function Optimization in Protein Structure Prediction
Paper Authors
Paper Abstract
Protein structure prediction is a critical problem linked to drug design, mutation detection, and protein synthesis, among other applications. To this end, evolutionary data have been used to build contact maps, which are traditionally minimized as energy functions via gradient-descent-based schemes such as the L-BFGS algorithm. In this paper we present what we call the Alternating Metropolis-Hastings (AMH) algorithm, which (a) significantly improves the performance of traditional MCMC methods, (b) is inherently parallelizable, allowing substantial hardware acceleration on GPUs, and (c) can be integrated with the L-BFGS algorithm to improve its performance. The algorithm improves the energy of found structures by 8.17% to 61.04% (average 38.9%) over traditional MH and by 0.53% to 17.75% (average 8.9%) over traditional MH with intermittent noisy restarts, tested across 9 proteins from recent CASP competitions. We go on to map the Alternating MH algorithm to a GPGPU, which improves the sampling rate by 277x and reduces the simulation time to reach a low-energy protein prediction by 7.5x to 26.5x relative to a CPU. We show that our approach can be incorporated into state-of-the-art protein prediction pipelines by applying it to both trRosetta2's energy function and the distogram component of Alphafold1's energy function. Finally, we note that specially designed probabilistic computers (or p-computers) can provide even better performance than GPUs for MCMC algorithms like the one discussed here.
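The abstract compares AMH against the standard Metropolis-Hastings loop as a baseline. For orientation only, below is a minimal sketch of a generic MH energy minimizer in Python; it is not the authors' Alternating MH algorithm, and the quadratic toy energy, Gaussian proposal, step size, and temperature are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def metropolis_hastings_minimize(energy, x0, n_steps=10_000,
                                 step_size=0.1, temperature=1.0, rng=None):
    """Generic Metropolis-Hastings energy minimization (illustrative baseline only).

    `energy` maps a state vector to a scalar; lower is better. This is the
    standard MH loop, not the paper's Alternating MH variant.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    e = energy(x)
    best_x, best_e = x.copy(), e
    for _ in range(n_steps):
        # Symmetric Gaussian proposal around the current state.
        x_new = x + step_size * rng.standard_normal(x.shape)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < np.exp(-(e_new - e) / temperature):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x.copy(), e
    return best_x, best_e

# Toy usage: minimize a quadratic "energy" over a 10-dimensional state.
if __name__ == "__main__":
    best_x, best_e = metropolis_hastings_minimize(lambda v: np.sum(v ** 2), np.ones(10))
    print(best_e)
```

In the paper's setting, the state would instead encode a candidate structure and the energy would come from contact-map or distogram potentials such as trRosetta2's energy function or the distogram component of Alphafold1's; the AMH algorithm, its noisy-restart baseline, and the GPGPU mapping concern how such proposals and restarts are scheduled and parallelized, which this sketch does not attempt to capture.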