Paper title
On the distance between two neural networks and the stability of learning
Paper authors
Paper abstract
This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions. The analysis leads to a new distance function called deep relative trust and a descent lemma for neural networks. Since the resulting learning rule seems to require little to no learning rate tuning, it may unlock a simpler workflow for training deeper and more complex neural networks. The Python code used in this paper is here: https://github.com/jxbz/fromage.
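To make the "learning rule" mentioned in the abstract concrete, below is a minimal NumPy sketch of a layerwise relative update in the spirit of the Fromage optimizer released with the paper: each layer's step is scaled by the ratio of its weight norm to its gradient norm, so the relative change per layer is roughly the learning rate. The function name, the fallback behaviour at zero norms, and the default learning rate are illustrative assumptions for this sketch; the authors' actual implementation is in the linked repository.

```python
import numpy as np

def fromage_style_update(weights, grads, lr=0.01):
    """Apply one layerwise update in the spirit of the paper's learning rule.

    weights, grads: lists of arrays, one entry per layer.
    Returns a new list of updated weight arrays.
    """
    new_weights = []
    for w, g in zip(weights, grads):
        w_norm = np.linalg.norm(w)
        g_norm = np.linalg.norm(g)
        if w_norm > 0 and g_norm > 0:
            # Scale the step so the relative change ||step|| / ||w|| is about lr.
            step = lr * (w_norm / g_norm) * g
        else:
            # Fallback for zero-norm layers (illustrative choice): plain gradient step.
            step = lr * g
        # Rescale to counteract the slight norm growth introduced by the update.
        new_weights.append((w - step) / np.sqrt(1.0 + lr ** 2))
    return new_weights

# Example usage on a toy two-layer parameter list.
if __name__ == "__main__":
    weights = [np.random.randn(4, 3), np.random.randn(3)]
    grads = [np.random.randn(4, 3), np.random.randn(3)]
    weights = fromage_style_update(weights, grads, lr=0.01)
```

Note that the per-layer norm scaling is what removes most of the learning rate sensitivity: a single value of lr controls the relative step size of every layer, regardless of how differently the layers are scaled.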