Paper Title

Usable Region Estimate for Assessing Practical Usability of Medical Image Segmentation Models

Authors

Yizhe Zhang, Suraj Mishra, Peixian Liang, Hao Zheng, Danny Z. Chen

Abstract

We aim to quantitatively measure the practical usability of medical image segmentation models: to what extent, how often, and on which samples a model's predictions can be used/trusted. We first propose a measure, Correctness-Confidence Rank Correlation (CCRC), to capture how predictions' confidence estimates correlate with their correctness scores in rank. A model with a high value of CCRC means its prediction confidences reliably suggest which samples' predictions are more likely to be correct. Since CCRC does not capture the actual prediction correctness, it alone is insufficient to indicate whether a prediction model is both accurate and reliable to use in practice. Therefore, we further propose another method, Usable Region Estimate (URE), which simultaneously quantifies predictions' correctness and reliability of confidence assessments in one estimate. URE provides concrete information on to what extent a model's predictions are usable. In addition, the sizes of usable regions (UR) can be utilized to compare models: A model with a larger UR can be taken as a more usable and hence better model. Experiments on six datasets validate that the proposed evaluation methods perform well, providing a concrete and concise measure for the practical usability of medical image segmentation models. Code is made available at https://github.com/yizhezhang2000/ure.
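
To make the idea of a correctness-confidence rank correlation concrete, here is a minimal sketch that computes a Spearman-style rank correlation between per-sample confidence estimates and correctness scores. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which rank correlation coefficient CCRC uses, the function names `average_ranks` and `rank_correlation` are hypothetical, and the Dice-like correctness values are made-up example numbers. The authors' code is at the repository URL given in the abstract.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_correlation(confidences: Sequence[float],
                     correctness: Sequence[float]) -> float:
    """Spearman-style rank correlation (assumed stand-in for CCRC).

    Computed as the Pearson correlation of the two rank vectors.
    Returns NaN when either input is constant.
    """
    if len(confidences) != len(correctness):
        raise ValueError("inputs must have the same length")
    n = len(confidences)
    if n < 2:
        raise ValueError("need at least two samples")
    rx = average_ranks(confidences)
    ry = average_ranks(correctness)
    mx = sum(rx) / n
    my = sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical per-sample confidences and Dice-like correctness scores.
    conf = [0.9, 0.6, 0.8, 0.3]
    dice = [0.95, 0.70, 0.85, 0.40]
    print(rank_correlation(conf, dice))
```

A value near 1 indicates that ranking samples by confidence closely matches ranking them by correctness, which is the property the abstract attributes to a high CCRC. As the abstract notes, such a score says nothing about how correct the predictions actually are, which is the gap URE is proposed to address.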
