Paper Title
Deep Neural Networks with Short Circuits for Improved Gradient Learning
Paper Authors
Paper Abstract
Deep neural networks have achieved great success in both computer vision and natural language processing tasks. However, most state-of-the-art methods rely heavily on external training or computation to improve performance. To alleviate this external reliance, we propose a gradient enhancement approach, realized by short-circuit neural connections, to improve the gradient learning of deep neural networks. The proposed short circuit is a unidirectional connection that back-propagates the sensitivity from a deep layer to a shallow one in a single step. Moreover, the short circuit is formulated as a gradient truncation of the layers it crosses, so it can be plugged into backbone deep neural networks without introducing extra training parameters. Extensive experiments demonstrate that deep neural networks with our short circuits outperform the baselines by a large margin on both computer vision and natural language processing tasks.
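The abstract describes the short circuit as a gradient-truncating connection that delivers a deep layer's sensitivity directly to a shallow layer. The following is a minimal sketch of that idea on a toy two-layer linear chain with manual backpropagation; the specific network, values, and variable names are illustrative assumptions, not the paper's actual formulation.

```python
# Toy chain (illustrative assumption): h = w1 * x, y = w2 * h,
# loss L = 0.5 * (y - t)^2.
x, t = 1.0, 0.0
w1, w2 = 0.5, 2.0

h = w1 * x          # shallow activation
y = w2 * h          # deep output
dL_dy = y - t       # sensitivity at the deep layer

# Standard backprop: the sensitivity is scaled by the Jacobian of the
# crossing layer (here, the scalar weight w2).
dL_dh_standard = dL_dy * w2

# Short-circuit sketch: the deep sensitivity is propagated to the shallow
# layer directly, truncating the crossing layer's gradient (our reading of
# the abstract's "gradient truncation"; the paper's exact rule may differ).
dL_dh_short = dL_dy

grad_w1_standard = dL_dh_standard * x   # 2.0
grad_w1_short = dL_dh_short * x         # 1.0
print(grad_w1_standard, grad_w1_short)
```

The point of the sketch is only that the short-circuit path contributes a gradient that skips intermediate Jacobians, so it adds no trainable parameters of its own.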