Paper Title
Deep Learning in Target Space
Paper Authors
Paper Abstract
Deep learning uses neural networks which are parameterised by their weights. The neural networks are usually trained by tuning the weights to directly minimise a given loss function. In this paper we propose to re-parameterise the weights into targets for the firing strengths of the individual nodes in the network. Given a set of targets, it is possible to calculate the weights which make the firing strengths best meet those targets. It is argued that using targets for training addresses the problem of exploding gradients, by a process which we call cascade untangling, and makes the loss-function surface smoother to traverse, and so leads to easier, faster training, and also potentially better generalisation, of the neural network. It also allows for easier learning of deeper and recurrent network structures. The necessary conversion of targets to weights comes at an extra computational expense, which is in many cases manageable. Learning in target space can be combined with existing neural-network optimisers, for extra gain. Experimental results show the speed of using target space, and examples of improved generalisation, for fully-connected networks and convolutional networks, and the ability to recall and process long time sequences and perform natural-language processing with recurrent networks.
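To make the reparameterisation concrete, below is a minimal NumPy sketch of the core step the abstract describes: converting a set of targets for a layer's firing strengths into weights by least squares. The layer sizes, tanh activation, and ridge constant are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of the target-space idea: the learnable parameters
# are targets T for a layer's firing strengths, and the weights are
# recovered from the targets by (regularised) least squares.
# All sizes, the tanh activation, and `ridge` are assumptions for
# illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Toy batch of inputs: 32 samples, 10 features.
X = rng.standard_normal((32, 10))

# Targets for the pre-activation firing strengths of a 10 -> 8 layer.
T = rng.standard_normal((32, 8))

def weights_from_targets(inputs, targets, ridge=1e-3):
    """Solve for the weights that make the firing strengths best meet
    the targets, in the least-squares sense:
        W = argmin_W ||inputs @ W - targets||^2 + ridge * ||W||^2
    """
    d = inputs.shape[1]
    A = inputs.T @ inputs + ridge * np.eye(d)
    B = inputs.T @ targets
    return np.linalg.solve(A, B)

W = weights_from_targets(X, T)

# The realised firing strengths approximate the targets; the layer
# output is the activation applied to them.
firing = X @ W
output = np.tanh(firing)
print("target fit (MSE):", np.mean((firing - T) ** 2))
```

In a training loop built on this sketch, the loss gradient would flow into the targets T rather than the weights W, with the least-squares solve repeated whenever the weights are needed; this is the extra computational expense the abstract mentions.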