A Multi-site Study of a Breast Density Deep Learning Model for Full-field Digital Mammography Images and Synthetic Mammography Images

Paper Title

A Multi-site Study of a Breast Density Deep Learning Model for Full-field Digital Mammography Images and Synthetic Mammography Images

Authors

Matthews, Thomas P., Singh, Sadanand, Mombourquette, Brent, Su, Jason, Shah, Meet P., Pedemonte, Stefano, Long, Aaron, Maffit, David, Gurney, Jenny, Hoil, Rodrigo Morales, Ghare, Nikita, Smith, Douglas, Moore, Stephen M., Marks, Susan C., Wahl, Richard L.

Abstract

Purpose: To develop a Breast Imaging Reporting and Data System (BI-RADS) breast density deep learning (DL) model in a multi-site setting for synthetic two-dimensional mammography (SM) images derived from digital breast tomosynthesis exams, using full-field digital mammography (FFDM) images and limited SM data. Materials and Methods: For this retrospective study, a DL model was trained to predict BI-RADS breast density using FFDM images acquired from 2008 to 2017 (Site 1: 57492 patients, 187627 exams, 750752 images). The FFDM model was evaluated using SM datasets from two institutions (Site 1: 3842 patients, 3866 exams, 14472 images, acquired from 2016 to 2017; Site 2: 7557 patients, 16283 exams, 63973 images, acquired from 2015 to 2019). Each of the three datasets was then split into training, validation, and test datasets. Adaptation methods were investigated to improve performance on the SM datasets, and the effect of dataset size on each adaptation method was considered. Statistical significance was assessed using confidence intervals (CI), estimated by bootstrapping. Results: Without adaptation, the model demonstrated substantial agreement with the original reporting radiologists for all three datasets (Site 1 FFDM: linearly-weighted $κ_w$ = 0.75 [95% CI: 0.74, 0.76]; Site 1 SM: $κ_w$ = 0.71 [95% CI: 0.64, 0.78]; Site 2 SM: $κ_w$ = 0.72 [95% CI: 0.70, 0.75]). With adaptation, performance improved for Site 2 (Site 1: $κ_w$ = 0.72 [95% CI: 0.66, 0.79], 0.71 vs 0.72, P = .80; Site 2: $κ_w$ = 0.79 [95% CI: 0.76, 0.81], 0.72 vs 0.79, P < .001) using only 500 SM images from that site. Conclusion: A BI-RADS breast density DL model demonstrated strong performance on FFDM and SM images from two institutions without training on SM images, and improved further when adapted with a small number of SM images.
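
The agreement metric reported above, a linearly-weighted Cohen's kappa with a bootstrapped 95% CI, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the encoding of the four BI-RADS density categories as integers 0 to 3, the percentile bootstrap, and resampling at the level of individual label pairs are all assumptions made for illustration, since the abstract does not specify these details.

```python
import random


def linear_weighted_kappa(reference, predicted, n_classes=4):
    """Cohen's kappa with linear disagreement weights.

    Labels are assumed to be integers 0..n_classes-1 (hypothetical encoding
    of BI-RADS density categories A-D).
    """
    n = len(reference)
    # Observed joint distribution of (reference, predicted) categories.
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for r, p in zip(reference, predicted):
        observed[r][p] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected under independence
    for i in range(n_classes):
        for j in range(n_classes):
            w = abs(i - j) / (n_classes - 1)
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    if den == 0.0:
        return float("nan")  # undefined when no disagreement is possible
    return 1.0 - num / den


def bootstrap_ci(reference, predicted, n_boot=1000, alpha=0.05, seed=0, n_classes=4):
    """Percentile bootstrap CI for the weighted kappa (assumed procedure)."""
    rng = random.Random(seed)
    n = len(reference)
    stats = []
    for _ in range(n_boot):
        # Resample label pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        k = linear_weighted_kappa(
            [reference[i] for i in idx], [predicted[i] for i in idx], n_classes
        )
        if k == k:  # drop NaN resamples
            stats.append(k)
    if not stats:
        raise ValueError("kappa was undefined for every bootstrap resample")
    stats.sort()
    lo = stats[int((alpha / 2) * (len(stats) - 1))]
    hi = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lo, hi
```

Calling `linear_weighted_kappa(radiologist_labels, model_labels)` followed by `bootstrap_ci(radiologist_labels, model_labels)` on two equal-length label lists yields a point estimate and interval in the same form as the values quoted in the Results. In a study with several exams per patient, resampling by patient rather than by label pair may be more appropriate.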
