Paper Title

A Revision of Neural Tangent Kernel-based Approaches for Neural Networks

Paper Authors

Kyung-Su Kim, Aurélie C. Lozano, Eunho Yang

Paper Abstract

Recent theoretical works based on the neural tangent kernel (NTK) have shed light on the optimization and generalization of over-parameterized networks, and have partially bridged the gap between their practical success and classical learning theory. In particular, using the NTK-based approach, the following three representative results were obtained: (1) A training error bound was derived to show that networks can fit any finite training sample perfectly, by reflecting a tighter characterization of training speed that depends on the data complexity. (2) A generalization error bound invariant to network size was derived by using a data-dependent complexity measure (CMD). It follows from this CMD bound that networks can generalize arbitrary smooth functions. (3) A simple and analytic kernel function was derived and shown to be equivalent to a fully-trained network. This kernel outperforms its corresponding network and the existing gold standard, Random Forests, in few-shot learning. For all of these results to hold, the network scaling factor $κ$ should decrease with respect to the sample size $n$. In this case of decreasing $κ$, however, we prove that the aforementioned results are surprisingly erroneous. This is because the output value of the trained network decreases to zero when $κ$ decreases w.r.t. $n$. To solve this problem, we tighten the key bounds by essentially removing the $κ$-affected values. Our tighter analysis resolves the scaling problem and enables the validation of the original NTK-based results.
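The following is a minimal numerical sketch of the scaling issue the abstract describes, assuming the standard two-layer ReLU parameterization with an output scaling factor $κ$ commonly used in NTK analyses; it is not the authors' exact model, and the choice $κ = 1/\sqrt{n}$ is only an illustrative example of "$κ$ decreasing w.r.t. $n$". It computes the network output and the empirical NTK, showing that the output magnitude scales with $κ$ and the kernel entries with $κ^2$.

```python
# Hypothetical sketch (not from the paper): two-layer ReLU network
# f(x) = (kappa / sqrt(m)) * a^T relu(W x), with the empirical NTK taken
# w.r.t. the first-layer weights W. Illustrates that outputs shrink as kappa
# decreases with the sample size n.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 5, 512, 8          # input dim, width, sample size
kappa = 1.0 / np.sqrt(n)     # example scaling: kappa decreasing w.r.t. n

X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
W = rng.normal(size=(m, d))                      # first-layer weights
a = rng.choice([-1.0, 1.0], size=m)              # fixed output weights

def f(X, W):
    """Scaled two-layer ReLU network output, one scalar per input row."""
    return (kappa / np.sqrt(m)) * np.maximum(W @ X.T, 0.0).T @ a

def empirical_ntk(X, W):
    """Empirical NTK w.r.t. W: Gram matrix of parameter gradients."""
    # d f(x) / d w_r = (kappa / sqrt(m)) * a_r * 1[w_r^T x > 0] * x
    act = (W @ X.T > 0).astype(float)                      # (m, n) activations
    grads = np.einsum('r,rn,nd->nrd', a, act, X) * (kappa / np.sqrt(m))
    G = grads.reshape(n, -1)                               # (n, m*d)
    return G @ G.T                                         # (n, n) kernel

print("mean |f(x)|, scales like kappa:   ", np.abs(f(X, W)).mean())
print("largest NTK eigenvalue (~kappa^2):", np.linalg.eigvalsh(empirical_ntk(X, W))[-1])
```

Rerunning with a larger $n$ (hence a smaller $κ$ under this example choice) shows both printed quantities shrinking toward zero, which is the vanishing-output behavior the abstract identifies as the source of the problem.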
