Review and Examination of Input Feature Preparation Methods and Machine Learning Models for Turbulence Modeling

Paper title

Review and Examination of Input Feature Preparation Methods and Machine Learning Models for Turbulence Modeling

Authors

Shirui Luo, Jiahuan Cui, Madhu Vellakal, Jian Liu, Enyi Jiang, Seid Koric, Volodymyr Kindratenko

Abstract

Model extrapolation to unseen flows is one of the biggest challenges facing data-driven turbulence modeling, especially for models with high-dimensional inputs that involve many flow features. In this study we review previous efforts on data-driven Reynolds-Averaged Navier-Stokes (RANS) turbulence modeling and model extrapolation, focusing mainly on the popular methods used in the field of transfer learning. Several potential metrics for measuring the dissimilarity between training flows and testing flows are examined. Different machine learning (ML) models are compared to understand how the capacity or complexity of a model affects its behavior in the face of dataset shift. Data preprocessing schemes that are robust to covariate shift, such as normalization, transformation, and importance re-weighted likelihood, are studied to understand whether it is possible to find projections of the data that attenuate the differences between the training and test distributions while preserving predictability. Three metrics are proposed to assess the dissimilarity between training and testing datasets. To attenuate the dissimilarity, a distribution matching framework is used to align the statistics of the distributions. These modifications also allow the regression tasks to achieve better accuracy in forecasting under-represented extreme values of the target variable. These findings are useful for future ML-based turbulence models to evaluate their model predictability, and they provide guidance for systematically generating diversified high-fidelity simulation databases.
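
To make the idea of measuring and attenuating train/test dissimilarity concrete, the following is a minimal sketch under stated assumptions. The abstract does not name the three proposed metrics or the details of the distribution matching framework, so this example substitutes two generic stand-ins: the squared maximum mean discrepancy (MMD) with an RBF kernel as the dissimilarity metric, and per-feature mean and standard deviation matching as the alignment step. The function names `mmd_rbf` and `match_moments`, the `gamma` value, and the synthetic Gaussian "flow features" are all hypothetical and chosen only for illustration.

```python
import numpy as np


def mmd_rbf(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between two samples (RBF kernel).

    Illustrative stand-in metric, not one of the paper's three metrics.
    """
    def kernel(a, b):
        # Pairwise squared Euclidean distances, clipped at zero for round-off.
        sq = (np.sum(a**2, axis=1)[:, None]
              + np.sum(b**2, axis=1)[None, :]
              - 2.0 * a @ b.T)
        return np.exp(-gamma * np.maximum(sq, 0.0))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def match_moments(source, target, eps=1e-12):
    """Rescale each source feature to the target's mean and standard deviation.

    A simple first- and second-moment alignment, assumed here for illustration.
    """
    s_mean, s_std = source.mean(axis=0), source.std(axis=0)
    t_mean, t_std = target.mean(axis=0), target.std(axis=0)
    return (source - s_mean) / (s_std + eps) * t_std + t_mean


# Synthetic features standing in for training and testing flows (assumption).
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 3))
test = rng.normal(1.5, 2.0, size=(500, 3))

before = mmd_rbf(train, test)
after = mmd_rbf(match_moments(train, test), test)
print(f"MMD^2 before alignment: {before:.4f}")
print(f"MMD^2 after alignment:  {after:.4f}")
```

In this toy setting the two samples differ only in location and scale, so matching the first two moments removes most of the measured discrepancy. Real flow features may differ in higher-order structure, in which case moment matching alone would likely leave a larger residual.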
