Title
Quantification of Deep Neural Network Prediction Uncertainties for VVUQ of Machine Learning Models
Authors
Abstract
Recent performance breakthroughs in Artificial Intelligence (AI) and Machine Learning (ML), especially advances in Deep Learning (DL), the availability of powerful, easy-to-use ML libraries (e.g., scikit-learn, TensorFlow, PyTorch), and increasing computational power have led to unprecedented interest in AI/ML among nuclear engineers. For physics-based computational models, Verification, Validation and Uncertainty Quantification (VVUQ) has been widely investigated and many methodologies have been developed. However, VVUQ of ML models has been studied comparatively little, especially in nuclear engineering. In this work, we focus on UQ of ML models as a preliminary step of ML VVUQ, more specifically on Deep Neural Networks (DNNs), because they are the most widely used supervised ML algorithms for both regression and classification tasks. This work aims to quantify the prediction, or approximation, uncertainties of DNNs when they are used as surrogate models for expensive physical models. Three techniques for UQ of DNNs are compared: Monte Carlo Dropout (MCD), Deep Ensembles (DE), and Bayesian Neural Networks (BNNs). Two nuclear engineering examples are used to benchmark these methods: (1) time-dependent fission gas release data using the Bison code, and (2) void fraction simulation based on the BFBT benchmark using the TRACE code. It was found that the three methods typically require different DNN architectures and hyperparameters to optimize their performance. The UQ results also depend on the amount of training data available and the nature of the data. Overall, all three methods can provide reasonable estimates of the approximation uncertainties. The uncertainties are generally smaller when the mean predictions are close to the test data, while the BNN method usually produces larger uncertainties than MCD and DE.
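Of the three techniques the abstract compares, Monte Carlo Dropout is the simplest to illustrate: dropout is kept active at inference time, and repeated stochastic forward passes yield a distribution of predictions whose mean and standard deviation serve as the prediction and its approximation uncertainty. The following is a minimal NumPy sketch of that idea, not the paper's implementation; the tiny one-hidden-layer network and its random (untrained) weights are stand-ins purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny illustrative regression net (1 input -> 32 hidden -> 1 output).
# Weights are random stand-ins, NOT trained values from the paper.
W1 = rng.normal(size=(1, 32))
b1 = np.zeros(32)
W2 = rng.normal(size=(32, 1))
b2 = np.zeros(1)

def forward(x, p_drop=0.2, training=True):
    """One forward pass; for MCD, dropout stays ACTIVE at inference."""
    h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
    if training:
        mask = rng.random(h.shape) > p_drop   # Bernoulli dropout mask
        h = h * mask / (1.0 - p_drop)         # inverted-dropout scaling
    return h @ W2 + b2

def mc_dropout_predict(x, T=200):
    """T stochastic passes -> predictive mean and std (uncertainty)."""
    samples = np.stack([forward(x, training=True) for _ in range(T)])
    return samples.mean(axis=0), samples.std(axis=0)

x = np.array([[0.5]])
mean, std = mc_dropout_predict(x)   # std > 0: spread across dropout masks
```

The same loop applies unchanged to a trained TensorFlow or PyTorch model, provided dropout layers are forced into training mode at prediction time; Deep Ensembles replace the T dropout passes with predictions from independently trained networks.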