Paper Title

On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures

Authors

Maxim Samarin, Volker Roth, David Belius

Abstract

The Neural Tangent Kernel (NTK) is an important milestone in the ongoing effort to build a theory for deep learning. Its prediction that sufficiently wide neural networks behave as kernel methods, or equivalently as random feature models, has been confirmed empirically for certain wide architectures. It remains an open question how well NTK theory models standard neural network architectures of widths common in practice, trained on complex datasets such as ImageNet. We study this question empirically for two well-known convolutional neural network architectures, namely AlexNet and LeNet, and find that their behavior deviates significantly from their finite-width NTK counterparts. For wider versions of these networks, where the number of channels and widths of fully-connected layers are increased, the deviation decreases.
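
As an illustration of the central object in the abstract, the following is a minimal sketch (not code from the paper) of computing the empirical NTK of a small network in JAX: each kernel entry Theta(x, x') is the inner product of the parameter Jacobians of the network output at x and x'. The toy MLP, layer sizes, and input dimension are illustrative assumptions; the paper's experiments instead use AlexNet and LeNet on real datasets.

import jax
import jax.numpy as jnp

def init_params(key, sizes):
    # Random dense-layer parameters (weights and biases) for a small MLP.
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def apply_fn(params, x):
    # Forward pass: ReLU hidden layers, linear scalar output.
    for W, b in params[:-1]:
        x = jax.nn.relu(x @ W + b)
    W, b = params[-1]
    return (x @ W + b).squeeze(-1)

def empirical_ntk(params, x1, x2):
    # Empirical NTK: Theta(x, x') = <df(x)/dtheta, df(x')/dtheta>.
    def single_jac(x):
        # Gradient of the scalar output w.r.t. all parameters, flattened.
        grads = jax.grad(lambda p: apply_fn(p, x[None]).sum())(params)
        return jnp.concatenate([g.ravel() for g in jax.tree_util.tree_leaves(grads)])
    j1 = jax.vmap(single_jac)(x1)  # (n1, n_params)
    j2 = jax.vmap(single_jac)(x2)  # (n2, n_params)
    return j1 @ j2.T               # (n1, n2) kernel matrix

key = jax.random.PRNGKey(0)
params = init_params(key, [8, 32, 32, 1])  # toy widths, for illustration only
x = jax.random.normal(key, (4, 8))
print(empirical_ntk(params, x, x))  # 4x4 empirical NTK Gram matrix

NTK theory predicts that as the widths grow, this Gram matrix becomes nearly constant during training, so the network behaves like a kernel method; the paper tests empirically how far standard-width AlexNet and LeNet deviate from that regime.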
