DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos

Paper Title

DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos

Authors

Negin Ghamsarian, Mario Taschwer, Raphael Sznitman, Klaus Schoeffmann

Abstract

Semantic segmentation in cataract surgery has a wide range of applications that contribute to better surgical outcomes and reduced clinical risk. However, the different relevant structures in these surgeries each pose their own segmentation difficulties, which makes designing a single network for all of them quite challenging. This paper proposes a semantic segmentation network, termed DeepPyramid, that addresses these challenges with three novelties: (1) a Pyramid View Fusion module, which provides a varying-angle global view of the surrounding region centered at each pixel position in the input convolutional feature map; (2) a Deformable Pyramid Reception module, which enables a wide deformable receptive field that can adapt to geometric transformations in the object of interest; and (3) a dedicated Pyramid Loss that adaptively supervises multi-scale semantic feature maps. Combined, we show that these modules can effectively boost semantic segmentation performance, especially for objects that exhibit transparency, deformability, scale variation, and blunt edges. We demonstrate that our approach performs at a state-of-the-art level and outperforms a number of existing methods by a large margin (3.66% overall improvement in intersection over union compared to the best rival approach).
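The abstract reports its headline result as an improvement in intersection over union (IoU). As a point of reference, the following is a minimal sketch of how per-class IoU is commonly computed between a predicted and a ground-truth label map. The function name, the toy label maps, and the convention for a class absent from both maps are illustrative assumptions; they are not taken from the paper, whose exact evaluation protocol is not given here.

```python
def intersection_over_union(pred, target, class_id):
    """Per-class IoU between two same-shaped 2D label maps (lists of lists)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred = p == class_id
            in_target = t == class_id
            if in_pred and in_target:
                intersection += 1
            if in_pred or in_target:
                union += 1
    # Assumed convention: a class absent from both maps scores 1.0.
    return intersection / union if union else 1.0


# Toy 2x3 label maps (hypothetical data): 0 = background, 1 = object.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]

iou_object = intersection_over_union(pred, target, class_id=1)
print(iou_object)
```

Here the object class overlaps on two pixels and covers four pixels in total across both maps, so the ratio is 2/4.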
