Paper title
Continual learning autoencoder training for a particle-in-cell simulation via streaming
Paper authors
Paper abstract
The upcoming exascale era will provide a new generation of physics simulations with high spatiotemporal resolution. This will affect the training of machine learning models, because storing such large amounts of simulation data on disk is nearly impossible. We therefore need to rethink how machine learning models are trained on simulation data in the exascale era. This work presents an approach that trains a neural network concurrently with a running simulation, without storing data on disk. The training pipeline accesses the training data through in-memory streaming. Because streamed data arrives in simulation order rather than as a shuffled dataset, we apply methods from the domain of continual learning to improve the generalization of the model. We tested our pipeline by training a 3D autoencoder concurrently with a particle-in-cell simulation of laser wakefield acceleration, and we experimented with various continual learning methods and their effect on generalization.
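The streaming idea in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the bounded `queue.Queue` stands in for the in-memory stream, the `simulation` function is a hypothetical stand-in for the particle-in-cell code, the model is a tiny tied-weight linear autoencoder instead of a 3D autoencoder, and reservoir-sampled experience replay is assumed here as one common continual learning method (the abstract does not say which methods were used). All names and parameters are illustrative.

```python
import queue
import random
import threading

SENTINEL = None  # marks the end of the stream


def simulation(stream, n_steps, rng):
    """Hypothetical stand-in for the running simulation: one sample per time step."""
    for step in range(n_steps):
        drift = step / n_steps  # the data distribution shifts as the simulation evolves
        sample = [rng.gauss(drift, 0.1), rng.gauss(2.0 * drift, 0.1)]
        stream.put(sample)  # blocks while the bounded queue is full
    stream.put(SENTINEL)


class ReservoirBuffer:
    """Fixed-size replay buffer holding a uniform random sample of the stream so far."""

    def __init__(self, capacity, rng):
        self.capacity, self.rng = capacity, rng
        self.items, self.seen = [], 0

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            slot = self.rng.randrange(self.seen)  # classic reservoir sampling
            if slot < self.capacity:
                self.items[slot] = item


def sgd_step(w, x, lr=0.01):
    """One gradient step for a tied-weight linear autoencoder x_hat = w * (w . x)."""
    z = sum(wi * xi for wi, xi in zip(w, x))        # 1D latent code
    r = [wi * z - xi for wi, xi in zip(w, x)]       # reconstruction residual
    rw = sum(ri * wi for ri, wi in zip(r, w))
    loss = sum(ri * ri for ri in r)
    new_w = [wi - lr * 2.0 * (ri * z + rw * xi) for wi, ri, xi in zip(w, r, x)]
    return new_w, loss


def train_from_stream(n_steps=200, capacity=16, replay=4, seed=0):
    rng = random.Random(seed)
    stream = queue.Queue(maxsize=8)  # in-memory only, nothing is written to disk
    producer = threading.Thread(
        target=simulation, args=(stream, n_steps, random.Random(seed + 1))
    )
    producer.start()
    buffer = ReservoirBuffer(capacity, rng)
    w, consumed = [0.5, 0.5], 0
    while True:
        sample = stream.get()
        if sample is SENTINEL:
            break
        # Mix the newest sample with replayed older ones to reduce forgetting.
        batch = [sample] + rng.sample(buffer.items, min(replay, len(buffer.items)))
        for x in batch:
            w, _ = sgd_step(w, x)
        buffer.add(sample)
        consumed += 1
    producer.join()
    return w, consumed, buffer
```

Calling `train_from_stream()` consumes every simulated step exactly once while keeping at most `capacity` past samples in memory. The bounded queue also means the producer waits when training falls behind, which is one possible way to couple the two processes; a real pipeline might drop or subsample steps instead.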