Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection

Paper title

Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection

Authors

Xuehao Wang, Shuai Li, Chenglizhao Chen, Yuming Fang, Aimin Hao, Hong Qin

Abstract

Existing RGB-D salient object detection methods treat depth information as an independent component that complements the RGB part, and they widely follow a bi-stream parallel network architecture. To selectively fuse the CNN features extracted from both RGB and depth into a final result, state-of-the-art (SOTA) bi-stream networks usually consist of two independent subbranches: one for RGB saliency and the other for depth saliency. However, the depth saliency is persistently inferior to the RGB saliency because the RGB component is intrinsically more informative than the depth component. The bi-stream architecture therefore easily biases its subsequent fusion procedure toward the RGB subbranch, leading to a performance bottleneck. In this paper, we propose a novel data-level recombination strategy that fuses RGB with D (depth) before deep feature extraction, cyclically converting the original 4-dimensional RGB-D data into DGB, RDB, and RGD. A newly designed lightweight triple-stream network is then applied to these newly formulated data to achieve an optimal channel-wise complementary fusion status between RGB and D, achieving new SOTA performance.
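
The data-level recombination described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `recombine_rgbd`, the channel-last array layout, and the assumption that the depth map is already scaled to the same value range as the RGB channels are all hypothetical choices made for illustration. The sketch only shows the channel substitution that yields DGB, RDB, and RGD; the triple-stream network itself is not modeled.

```python
import numpy as np


def recombine_rgbd(rgb: np.ndarray, depth: np.ndarray) -> dict:
    """Build DGB, RDB and RGD images from an RGB image and a depth map.

    Assumptions (not taken from the paper): `rgb` has shape (H, W, 3) in
    R, G, B channel order, `depth` has shape (H, W), and both are already
    scaled to the same value range.
    """
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if depth.shape != rgb.shape[:2]:
        raise ValueError("depth must have shape (H, W) matching rgb")

    recombined = {}
    # Replace one colour channel at a time with the depth map:
    # index 0 (R) gives DGB, index 1 (G) gives RDB, index 2 (B) gives RGD.
    for channel, name in enumerate(("DGB", "RDB", "RGD")):
        mixed = rgb.copy()           # leave the input image untouched
        mixed[..., channel] = depth  # substitute depth for this channel
        recombined[name] = mixed
    return recombined


# Usage: a tiny 2x2 image with constant channels R=1, G=2, B=3 and depth=9.
rgb = np.stack(
    [np.full((2, 2), v, dtype=np.float32) for v in (1, 2, 3)], axis=-1
)
depth = np.full((2, 2), 9, dtype=np.float32)
streams = recombine_rgbd(rgb, depth)
```

Each of the three resulting arrays keeps the three-channel shape of an ordinary RGB image, so in this sketch every stream could be fed to a standard three-channel backbone without changing its input layer.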
