Title
Testing GLOM's ability to infer wholes from ambiguous parts
Authors
Abstract
The GLOM architecture proposed by Hinton [2021] is a recurrent neural network for parsing an image into a hierarchy of wholes and parts. When a part is ambiguous, GLOM assumes that the ambiguity can be resolved in two steps. First, the part makes multimodal predictions for the pose and identity of the whole to which it belongs. Second, attention to similar predictions coming from other, possibly ambiguous, parts lets the network settle on a common mode that several different parts predict. In this study, we describe a highly simplified version of GLOM that allows us to assess the effectiveness of this way of dealing with ambiguity. Our results show that, with supervised training, GLOM successfully forms islands of very similar embedding vectors across all of the locations occupied by the same object. It is also robust to strong noise injected into the input and to out-of-distribution input transformations.
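The idea of settling on a common mode can be illustrated with a toy sketch. This is not the model described in the paper: it assumes each part proposes a small list of 2-D candidate predictions for the whole, and it scores each candidate by a similarity-based attention weight toward the closest candidate of every other part. The names `support`, `settle`, the `temperature` parameter, and the data are all hypothetical and chosen only for illustration.

```python
import math

def support(candidate, other_parts, temperature=0.5):
    """Sum of similarity weights to the closest candidate of every other part."""
    total = 0.0
    for candidates in other_parts:
        # Squared distance to the best-matching prediction made by that part.
        d2 = min(sum((a - b) ** 2 for a, b in zip(candidate, c)) for c in candidates)
        total += math.exp(-d2 / temperature)
    return total

def settle(parts, temperature=0.5):
    """For each part, keep the candidate that the other parts agree with most."""
    chosen = []
    for i, candidates in enumerate(parts):
        others = parts[:i] + parts[i + 1:]
        chosen.append(max(candidates, key=lambda c: support(c, others, temperature)))
    return chosen

# Hypothetical toy data: three ambiguous parts, each with two candidate
# predictions for the whole. Only the candidates near (1, 1) are shared.
parts = [
    [(1.0, 1.0), (5.0, 0.0)],
    [(1.1, 0.9), (-3.0, 2.0)],
    [(0.9, 1.0), (0.0, -4.0)],
]
consensus = settle(parts)
print(consensus)
```

In this toy setting every part keeps its candidate near (1, 1), because that is the only mode supported by the other parts. The stray candidates receive almost no support and are discarded.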