Paper Title
Approximating Activation Functions
Paper Authors
Paper Abstract
ReLU is widely seen as the default choice of activation function in neural networks. However, there are cases where more complicated functions are required. In particular, recurrent neural networks (such as LSTMs) make extensive use of both the hyperbolic tangent and sigmoid functions, which are expensive to compute. We use function approximation techniques to develop replacements for these functions and evaluate them empirically on three popular network configurations. We find safe approximations that yield a 10% to 37% improvement in training time on the CPU. These approximations are suitable for all cases we considered, and we believe they are appropriate replacements for any network using these activation functions. We also develop ranged approximations, which apply only in some cases because of restrictions on their input domain. Our ranged approximations yield a 20% to 53% improvement in network training time. Our functions also match or considerably outperform the ad hoc approximations used in Theano and in the implementation of Word2Vec.
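To make the distinction between the two families of replacements concrete, the following minimal Python sketch contrasts a "safe"-style approximation (usable for any input) with a "ranged"-style approximation (cheap but accurate only on a bounded input interval). The specific formulas here, a clipped piecewise-linear sigmoid and a degree-5 polynomial fitted on an assumed range of [-4, 4], are illustrative assumptions, not the approximations developed in the paper.

```python
# Illustrative sketch only: the abstract does not give concrete formulas, so the
# functions below are hypothetical stand-ins for the two kinds of replacement it
# describes.
import numpy as np


def sigmoid(x):
    """Reference sigmoid, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def safe_sigmoid(x):
    """A 'safe'-style approximation: piecewise linear, clipped to [0, 1],
    so it behaves sensibly for any input value (assumed example)."""
    return np.clip(0.25 * x + 0.5, 0.0, 1.0)


# A 'ranged'-style approximation: a low-degree polynomial fitted on a limited
# interval (here [-4, 4], an assumed range). Outside that interval the error
# grows quickly, which is why such replacements apply only in some cases.
_grid = np.linspace(-4.0, 4.0, 1001)
_coeffs = np.polyfit(_grid, sigmoid(_grid), 5)


def ranged_sigmoid(x):
    return np.polyval(_coeffs, x)


if __name__ == "__main__":
    xs = np.linspace(-4.0, 4.0, 401)
    print("max |safe   - exact| on [-4, 4]:",
          np.max(np.abs(safe_sigmoid(xs) - sigmoid(xs))))
    print("max |ranged - exact| on [-4, 4]:",
          np.max(np.abs(ranged_sigmoid(xs) - sigmoid(xs))))
```

In a network whose pre-activations are known to stay within the fitted interval (as can happen inside LSTM gates), the ranged variant can be substituted directly; otherwise only the safe variant is a drop-in replacement.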