Paper title
On the biological plausibility of orthogonal initialisation for solving gradient instability in deep neural networks
Paper authors
Paper abstract
Initialising the synaptic weights of artificial neural networks (ANNs) with orthogonal matrices is known to alleviate the vanishing and exploding gradient problems. A major objection to such initialisation schemes is that they are deemed biologically implausible, as they mandate factorisation techniques that are difficult to attribute to a neurobiological process. This paper presents two initialisation schemes that allow a network to naturally evolve its weights into orthogonal matrices, provides a theoretical analysis showing that the pre-training orthogonalisation always converges, and empirically confirms that the proposed schemes outperform randomly initialised recurrent and feedforward networks.
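For context, below is a minimal sketch of the conventional, factorisation-based orthogonal initialisation that the abstract contrasts against: a random matrix is orthogonalised via QR decomposition, the step argued to be hard to attribute to a neurobiological process. This is an illustrative assumption about the baseline, not a reproduction of the paper's two proposed schemes (the function name `orthogonal_init` and the shapes used are hypothetical).

```python
import numpy as np

def orthogonal_init(n_in, n_out, gain=1.0, rng=None):
    """Conventional orthogonal initialisation via QR factorisation.

    Sketch of the standard factorisation-based scheme; shown only for
    contrast with the paper's proposed evolution-based schemes.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw a random Gaussian matrix and orthogonalise its columns.
    a = rng.standard_normal((max(n_in, n_out), min(n_in, n_out)))
    q, r = np.linalg.qr(a)             # columns of q are orthonormal
    q *= np.sign(np.diag(r))           # fix column signs for a stable factorisation
    w = q if n_in >= n_out else q.T    # arrange to shape (n_in, n_out)
    return gain * w

# Sanity check: a tall weight matrix satisfies W^T W = I.
W = orthogonal_init(256, 128)
assert np.allclose(W.T @ W, np.eye(128), atol=1e-6)
```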