Paper Title

Application of variational policy gradient to atomic-scale materials synthesis

Authors

Siyan Liu, Nikolay Borodinov, Lukas Vlcek, Dan Lu, Nouamane Laanait, Rama K. Vasudevan

Abstract

Atomic-scale materials synthesis via layer deposition techniques presents a unique opportunity to control material structures and yield systems that display unique functional properties that cannot be stabilized using traditional bulk synthetic routes. However, the deposition process itself presents a large, multidimensional space that is traditionally optimized via intuition and trial and error, slowing down progress. Here, we present an application of deep reinforcement learning to a simulated materials synthesis problem, utilizing the Stein variational policy gradient (SVPG) approach to train multiple agents to optimize a stochastic policy to yield desired functional properties. Our contributions are (1) a fully open-source simulation environment for layered materials synthesis problems, utilizing a kinetic Monte Carlo engine and implemented in the OpenAI Gym framework, (2) an extension of the Stein variational policy gradient approach to deal with both image and tabular input, and (3) a parallel (synchronous) implementation of SVPG using Horovod, distributing multiple agents across GPUs and individual simulation environments on CPUs. We demonstrate the utility of this approach in optimizing for a material surface characteristic, surface roughness, and explore the strategies used by the agents as compared with a traditional actor-critic (A2C) baseline. Further, we find that SVPG stabilizes the training process over traditional A2C. Such trained agents can be useful to a variety of atomic-scale deposition techniques, including pulsed laser deposition and molecular beam epitaxy, if the implementation challenges are addressed.
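
The abstract names the Stein variational policy gradient but does not spell out its update rule. As a rough illustration only, and not the authors' implementation, the sketch below computes the generic SVPG step for a set of policy parameter vectors ("particles") in NumPy. Several things here are assumptions not taken from the abstract: the RBF kernel, the median-style bandwidth (using log(n + 1) so that a single particle does not divide by zero), a flat prior over parameters, and the hypothetical names `rbf_kernel`, `svpg_direction`, and `alpha`. The per-agent policy gradients are assumed to come from some external estimator such as A2C.

```python
import numpy as np


def rbf_kernel(theta):
    """RBF kernel between particles (assumed kernel choice).

    Returns K with K[j, i] = k(theta_j, theta_i) and grad_K with
    grad_K[j, i] = gradient of k(theta_j, theta_i) with respect to theta_j.
    """
    n = theta.shape[0]
    diff = theta[:, None, :] - theta[None, :, :]  # diff[j, i] = theta_j - theta_i
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Median-style bandwidth; log(n + 1) is an assumption that keeps h finite for n = 1.
    h = np.median(sq_dist) / np.log(n + 1) + 1e-8
    K = np.exp(-sq_dist / h)
    grad_K = (-2.0 / h) * diff * K[:, :, None]
    return K, grad_K


def svpg_direction(theta, policy_grads, alpha=1.0):
    """Update direction for each particle, assuming a flat prior.

    theta:        (n, d) array, one parameter vector per agent.
    policy_grads: (n, d) array, estimated gradient of expected return per agent.
    alpha:        temperature trading off exploitation against diversity.
    """
    n = theta.shape[0]
    K, grad_K = rbf_kernel(theta)
    drive = K.T @ (policy_grads / alpha)  # kernel-weighted sharing of gradients
    repulse = grad_K.sum(axis=0)          # pushes particles away from each other
    return (drive + repulse) / n
```

A caller would apply the result as a gradient-ascent step, for example `theta += lr * svpg_direction(theta, grads)`. The repulsive term is what distinguishes this from running independent A2C agents: with zero policy gradients, two nearby particles still move apart.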
