Paper Title
Goal-Oriented Sensitivity Analysis of Hyperparameters in Deep Learning
Paper Authors
Paper Abstract
Tackling new machine learning problems with neural networks always means optimizing numerous hyperparameters that define their structure and strongly impact their performance. In this work, we study the use of goal-oriented sensitivity analysis, based on the Hilbert-Schmidt Independence Criterion (HSIC), for hyperparameter analysis and optimization. Hyperparameters live in spaces that are often complex and awkward. They can be of different natures (categorical, discrete, boolean, continuous), interact, and have inter-dependencies. All this makes it non-trivial to perform classical sensitivity analysis. We alleviate these difficulties to obtain a robust analysis index that is able to quantify hyperparameters' relative impact on a neural network's final error. This valuable tool allows us to better understand hyperparameters and to make hyperparameter optimization more interpretable. We illustrate the benefits of this knowledge in the context of hyperparameter optimization and derive an HSIC-based optimization algorithm that we apply to MNIST and CIFAR, classical machine learning data sets, but also to the approximation of the Runge function and of the solution of the Bateman equations, which are of interest for scientific machine learning. This method yields neural networks that are both competitive and cost-effective.
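For readers unfamiliar with HSIC, the sketch below illustrates how such a sensitivity index can be estimated from sampled hyperparameter values and the corresponding network errors. It is a minimal illustration only, using Gaussian kernels with a median-heuristic bandwidth and the standard biased estimator; the hyperparameter names and toy data are assumptions for the example and do not reflect the paper's actual setup or algorithm.

```python
import numpy as np

def gaussian_gram(x, bandwidth=None):
    """Gram matrix of a Gaussian (RBF) kernel for 1-D samples x."""
    d2 = (x[:, None] - x[None, :]) ** 2
    if bandwidth is None:
        # Median heuristic for the kernel bandwidth.
        bandwidth = np.sqrt(0.5 * np.median(d2[d2 > 0]))
    return np.exp(-d2 / (2 * bandwidth ** 2))

def hsic(x, y):
    """Biased V-statistic estimator of HSIC between samples x and y."""
    n = len(x)
    K, L = gaussian_gram(x), gaussian_gram(y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

# Toy usage: rank hyperparameters by their HSIC with the observed error.
# `configs[name]` holds sampled values of one hyperparameter, `errors` the
# corresponding final errors (names and data are illustrative only).
rng = np.random.default_rng(0)
configs = {"learning_rate": rng.uniform(1e-4, 1e-1, 200),
           "depth": rng.integers(1, 10, 200).astype(float)}
errors = np.log(configs["learning_rate"]) ** 2 + 0.1 * rng.normal(size=200)
scores = {name: hsic(vals, errors) for name, vals in configs.items()}
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy setting, a hyperparameter whose samples are strongly dependent on the error (here, the synthetic learning rate) receives a larger HSIC score, which is the kind of relative-impact ranking the abstract refers to.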