Learning Compositional Neural Information Fusion for Human Parsing

Paper Title

Learning Compositional Neural Information Fusion for Human Parsing

Authors

Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, Ling Shao

Abstract

This work proposes to combine neural networks with the compositional hierarchy of human bodies for efficient and complete human parsing. We formulate the approach as a neural information fusion framework. Our model assembles the information from three inference processes over the hierarchy: direct inference (directly predicting each part of a human body using image information), bottom-up inference (assembling knowledge from constituent parts), and top-down inference (leveraging context from parent nodes). The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively. In addition, the fusion of multi-source information is conditioned on the inputs, i.e., by estimating and considering the confidence of the sources. The whole model is end-to-end differentiable, explicitly modeling information flows and structures. Our approach is extensively evaluated on four popular datasets, outperforming the state of the art in all cases, with a fast processing speed of 23 fps. Our code and results have been released to help ease future research in this direction.
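The confidence-conditioned fusion described in the abstract can be illustrated with a toy sketch. This is not the authors' released implementation: the function name `fuse`, the scalar confidence logits, and the normalized weighted average are all assumptions made for illustration, and the actual model may combine gated features in a different, learned way. The sketch only shows the general idea of weighting the direct, bottom-up, and top-down predictions for one node by an estimated confidence per source.

```python
import math


def sigmoid(x):
    """Squash a raw confidence logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(sources, confidence_logits):
    """Toy confidence-weighted fusion of per-source class scores.

    sources: dict mapping a source name to a list of class scores
             (all lists must have the same length).
    confidence_logits: dict mapping the same names to a raw confidence
             score (hypothetical; in a real model this would be predicted
             from the input).
    Returns the fused scores and the per-source gate values.
    """
    # One gate per source, each in (0, 1).
    gates = {name: sigmoid(confidence_logits[name]) for name in sources}
    total = sum(gates.values())

    num_classes = len(next(iter(sources.values())))
    fused = [0.0] * num_classes
    for name, scores in sources.items():
        weight = gates[name] / total  # normalize so the weights sum to 1
        for i, score in enumerate(scores):
            fused[i] += weight * score
    return fused, gates


# Hypothetical scores for one body-part node over two classes.
sources = {
    "direct": [0.6, 0.4],      # predicted from image features alone
    "bottom_up": [0.2, 0.8],   # assembled from child parts
    "top_down": [0.5, 0.5],    # context from the parent node
}
logits = {"direct": 0.0, "bottom_up": 2.0, "top_down": -2.0}
fused, gates = fuse(sources, logits)
```

In this example the bottom-up source has the highest confidence, so the fused scores lean toward its prediction. With equal confidence logits the function reduces to a plain average of the three sources.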
