Paper Title

Reinforced Genetic Algorithm for Structure-based Drug Design

Authors

Tianfan Fu, Wenhao Gao, Connor W. Coley, Jimeng Sun

Abstract

Structure-based drug design (SBDD) aims to discover drug candidates by finding molecules (ligands) that bind tightly to disease-related proteins (targets), which is the primary approach to computer-aided drug discovery. Recently, applying deep generative models for three-dimensional (3D) molecular design conditioned on protein pockets to solve SBDD has attracted much attention, but their formulation as probabilistic modeling often leads to unsatisfactory optimization performance. On the other hand, traditional combinatorial optimization methods such as genetic algorithms (GA) have demonstrated state-of-the-art performance in various molecular optimization tasks. However, they do not use the protein target structure to inform design steps but instead rely on random-walk-like exploration, which leads to unstable performance and no knowledge transfer between different tasks despite the similar binding physics. To achieve more stable and efficient SBDD, we propose the Reinforced Genetic Algorithm (RGA), which uses neural models to prioritize the profitable design steps and suppress random-walk behavior. The neural models take the 3D structures of the targets and ligands as inputs and are pre-trained on native complex structures to exploit the shared binding physics across different targets, then fine-tuned during optimization. We conduct thorough empirical studies on optimizing binding affinity to various disease targets and show that RGA outperforms the baselines in terms of docking scores and is more robust to random initializations. The ablation study also indicates that training on different targets helps improve performance by leveraging the shared underlying physics of the binding processes. The code is available at https://github.com/futianfan/reinforced-genetic-algorithm.
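The core idea in the abstract, replacing uniformly random GA steps with steps ranked by a learned model, can be illustrated with a toy sketch. This is not the authors' implementation (that lives in the repository linked above). Everything here is a simplifying assumption: bit strings stand in for molecules, `toy_oracle` stands in for a docking score, a small table of logits stands in for the neural models, and there is no pre-training or 3D structure. All function and parameter names are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_oracle(candidate, target):
    """Stand-in for a docking score: count of positions matching the target."""
    return sum(c == t for c, t in zip(candidate, target))


def guided_ga(target, pop_size=20, generations=30, guided=True, lr=0.5, seed=0):
    """Toy GA whose mutation site is chosen by a learned policy (guided=True)
    or uniformly at random (guided=False)."""
    rng = random.Random(seed)
    n = len(target)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    # Policy table: logits[i][b] scores "flip position i when its bit is b".
    # A hypothetical stand-in for a neural model conditioned on the candidate.
    logits = [[0.0, 0.0] for _ in range(n)]
    best_history = []
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            parent = rng.choice(population)
            if guided:
                probs = softmax([logits[i][parent[i]] for i in range(n)])
                pos = rng.choices(range(n), weights=probs, k=1)[0]
            else:
                pos = rng.randrange(n)
            child = parent[:]
            child[pos] ^= 1  # mutate: flip one bit
            # Reward is the score improvement of this design step.
            reward = toy_oracle(child, target) - toy_oracle(parent, target)
            if guided:
                # Simple reward-weighted update of the chosen action's logit.
                logits[pos][parent[pos]] += lr * reward
            children.append(child)
        # Elitist selection: keep the top pop_size of parents plus children.
        merged = population + children
        merged.sort(key=lambda c: toy_oracle(c, target), reverse=True)
        population = merged[:pop_size]
        best_history.append(toy_oracle(population[0], target))
    return population[0], best_history


if __name__ == "__main__":
    target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
    for mode in (True, False):
        best, history = guided_ga(target, guided=mode)
        print("guided" if mode else "random", history[-1])
```

In this toy setting the policy update plays the role that fine-tuning during optimization plays in the abstract, while the `guided=False` branch corresponds to the random-walk-like exploration of a plain GA. Comparing the two modes across several seeds is one rough way to see how prioritizing steps affects stability.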
