Paper Title
Bidirectional Attention Network for Monocular Depth Estimation
Authors
Abstract
In this paper, we propose the Bidirectional Attention Network (BANet), an end-to-end framework for monocular depth estimation (MDE) that addresses the limitation of convolutional neural networks in effectively integrating local and global information. The structure of this mechanism derives from a strong conceptual foundation in neural machine translation, and it provides a lightweight mechanism for adaptive control of computation, similar to the dynamic nature of recurrent neural networks. We introduce bidirectional attention modules that utilize the feed-forward feature maps and incorporate global context to filter out ambiguity. Extensive experiments on two challenging datasets, KITTI and DIODE, demonstrate the capability of this bidirectional attention model relative to feed-forward baselines and other state-of-the-art methods for monocular depth estimation. We show that the proposed approach either outperforms or performs at least on a par with state-of-the-art monocular depth estimation methods, with less memory and lower computational complexity.