Paper Title
Difficulty-aware Glaucoma Classification with Multi-Rater Consensus Modeling
Paper Authors
Paper Abstract
Medical images are generally labeled by multiple experts before the final ground-truth labels are determined. Consensus or disagreement among experts on an individual image reflects the gradeability and difficulty level of that image. However, only the final ground-truth label is typically used for model training, and the critical information in the raw multi-rater gradings about whether the image is an easy or hard case is discarded. In this paper, we aim to take advantage of the raw multi-rater gradings to improve deep learning model performance on the glaucoma classification task. Specifically, a multi-branch model structure is proposed to predict the most sensitive, the most specific, and a balanced fused result for each input image. To encourage the sensitivity branch and the specificity branch to generate consistent results for consensus labels and opposite results for disagreement labels, a consensus loss is proposed to constrain the outputs of the two branches. Meanwhile, consistency or inconsistency between the predictions of the two branches indicates that the image is an easy or hard case, respectively, which is further used to encourage the balanced fusion branch to concentrate more on the hard cases. Compared with models trained only with the final ground-truth labels, the proposed method using multi-rater consensus information achieves superior performance, and it can also estimate the difficulty level of each input image when making a prediction.
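The abstract gives no formulas, so the following is only an illustrative sketch of how the three ideas (per-branch targets, a consensus loss, and difficulty-weighted training of the fused branch) might fit together. Everything in it is an assumption rather than the authors' method: the function names, the rule that the sensitive target is positive if any rater votes positive while the specific target is positive only if all raters do, the absolute-gap form of the consensus loss, and the (1 + difficulty) weighting are all hypothetical choices made for illustration.

```python
import math


def branch_targets(grades):
    """Derive per-branch targets from binary rater votes (1 = glaucoma).

    Assumed rule (not from the paper): the sensitive target is positive
    if ANY rater votes positive; the specific target is positive only if
    ALL raters do. The image is a consensus case when the two coincide.
    """
    sens = 1 if any(grades) else 0
    spec = 1 if all(grades) else 0
    consensus = 1 if sens == spec else 0
    return sens, spec, consensus


def consensus_loss(p_sens, p_spec, consensus):
    """Hypothetical consensus loss on the two branch probabilities.

    Penalizes a gap between the branches when raters agree, and
    penalizes closeness when raters disagree.
    """
    gap = abs(p_sens - p_spec)
    return gap if consensus else 1.0 - gap


def difficulty(p_sens, p_spec):
    """Assumed difficulty estimate: disagreement between the branches."""
    return abs(p_sens - p_spec)


def weighted_bce(p_fused, label, p_sens, p_spec, eps=1e-7):
    """Binary cross-entropy for the fused branch, up-weighted on hard cases.

    The (1 + difficulty) weight is an illustrative choice only.
    """
    p = min(max(p_fused, eps), 1.0 - eps)
    bce = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return (1.0 + difficulty(p_sens, p_spec)) * bce
```

Under these assumptions, an image graded [1, 0, 0] by three raters yields a positive sensitive target, a negative specific target, and a disagreement flag, so the consensus loss pushes the two branch outputs apart and the fused-branch loss for that image is weighted more heavily.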