Paper Title

Neural Status Registers

Authors

Lukas Faber, Roger Wattenhofer

Abstract

Standard neural networks can learn mathematical operations, but they do not extrapolate. Extrapolation means that the model can be applied to larger numbers, well beyond those observed during training. Recent architectures tackle arithmetic operations and can extrapolate; however, the equally important problem of quantitative reasoning remains unaddressed. In this work, we propose a novel architectural element, the Neural Status Register (NSR), for quantitative reasoning over numbers. Our NSR relaxes the discrete bit logic of physical status registers to continuous numbers and allows end-to-end learning with gradient descent. Experiments show that the NSR achieves solutions that extrapolate to numbers many orders of magnitude larger than those in the training set. We successfully train the NSR on number comparisons, piecewise discontinuous functions, counting in sequences, recurrently finding minimums, finding shortest paths in graphs, and comparing digits in images.
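To make the idea of relaxing status-register bits to continuous values more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that a comparison is driven by the difference x - y, that the sign and zero flags are approximated with tanh-based functions, and that a small linear readout followed by a sigmoid produces the answer. The function names, the functional forms, and the hand-set weights are all assumptions made for illustration; in the paper the parameters are learned with gradient descent.

```python
import math


def soft_status_bits(x, y, scale=1.0):
    """Continuous stand-ins for a status register's sign and zero flags.

    Assumed forms (illustrative only): both flags depend solely on x - y.
    """
    d = scale * (x - y)
    sign_bit = math.tanh(d)                    # near -1 if x < y, near +1 if x > y
    zero_bit = 1.0 - 2.0 * math.tanh(d) ** 2   # +1 if x == y, tends to -1 as |x - y| grows
    return sign_bit, zero_bit


def soft_compare(x, y, w_sign, w_zero, bias=0.0, scale=1.0):
    """Differentiable comparison: sigmoid of a linear readout of the soft flags."""
    sign_bit, zero_bit = soft_status_bits(x, y, scale)
    z = w_sign * sign_bit + w_zero * zero_bit + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hand-set (not learned) readout weights, chosen only for this demonstration.
def greater_than(x, y):
    return soft_compare(x, y, w_sign=10.0, w_zero=0.0)


def equal_to(x, y):
    return soft_compare(x, y, w_sign=0.0, w_zero=10.0)


if __name__ == "__main__":
    print(greater_than(5, 3) > 0.5)            # small numbers
    print(greater_than(1e9, 1e9 + 1) > 0.5)    # much larger numbers, same readout
    print(equal_to(7, 7) > 0.5)
```

Because the soft flags in this sketch depend only on the difference of the two inputs, the same readout weights behave identically for small and very large operands, which is one intuition for why such a unit could extrapolate beyond its training range.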
