Paper Title
A deep reinforcement learning model based on deterministic policy gradient for collective neural crest cell migration
Paper Authors
Paper Abstract
Modeling cell interactions such as co-attraction and contact inhibition of locomotion is essential for understanding collective cell migration. Here, we propose a novel deep reinforcement learning model for collective neural crest cell migration. We apply the deep deterministic policy gradient (DDPG) algorithm, coupled with a particle dynamics simulation environment, to train agents to determine their migration paths. Because leader and follower neural crest cells migrate by different mechanisms, we train two types of agents (leaders and followers) to learn collective cell migration behavior. For a leader agent, we consider a linear combination of a global task, which yields the shortest path to the target source, and a local task, which yields coordinated motion along the local chemoattractant gradient. For a follower agent, we consider only the local task. First, we show that the self-driven forces learned by the leader agents point approximately toward the placode, meaning that the agents learn to follow the shortest path to the target. To validate our method, we compare the total time for agents to reach the placode computed with the proposed method against the time computed with an agent-based model; the distributions of migration time intervals obtained by the two methods do not differ significantly. We then study the effect of co-attraction and contact inhibition of locomotion on collective leader cell migration. We show that overall leader cell migration is slower with co-attraction, because co-attraction mitigates the source-driven effect. In addition, we find that the leader and follower agents learn migration behavior similar to that seen in experimental observations. Overall, our proposed method provides useful insight into how reinforcement learning techniques can be applied to simulate collective cell migration.
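To make the leader/follower task split concrete, below is a minimal Python sketch of a reward of the form r = w_global * r_global + w_local * r_local, where the global term rewards progress toward the target source and the local term rewards motion aligned with the local chemoattractant gradient. The weights `w_global` and `w_local`, the dot-product form of both terms, and the function names are illustrative assumptions for exposition, not the paper's exact reward formulation.

```python
import numpy as np

def leader_reward(position, velocity, target, local_gradient,
                  w_global=0.7, w_local=0.3):
    """Hypothetical leader reward: a linear combination of a global task
    (progress along the straight line to the target source) and a local
    task (alignment of motion with the local chemoattractant gradient)."""
    # Global task: velocity component along the direction to the target.
    to_target = target - position
    dist = np.linalg.norm(to_target)
    r_global = float(np.dot(velocity, to_target / dist)) if dist > 0 else 0.0

    # Local task: velocity component along the local chemoattractant gradient.
    g = np.linalg.norm(local_gradient)
    r_local = float(np.dot(velocity, local_gradient / g)) if g > 0 else 0.0

    return w_global * r_global + w_local * r_local

def follower_reward(velocity, local_gradient):
    """Hypothetical follower reward: the local task only."""
    g = np.linalg.norm(local_gradient)
    return float(np.dot(velocity, local_gradient / g)) if g > 0 else 0.0

# Example: a leader moving roughly toward a target at the origin.
pos = np.array([10.0, 5.0])
vel = np.array([-1.0, -0.4])
grad = np.array([-0.8, -0.6])  # local chemoattractant gradient (assumed)
print(leader_reward(pos, vel, np.zeros(2), grad))
print(follower_reward(vel, grad))
```

In a DDPG setup such as the one the abstract describes, a scalar reward of this shape would be returned by the particle dynamics environment at each step and used to train the actor-critic networks; the exact terms and weights would be tuned to the paper's simulation.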