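Applying a random projection algorithm to optimize machine learning model for predicting peritoneal metastasis in gastric cancer patients using CT images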

Paper title

Applying a random projection algorithm to optimize machine learning model for predicting peritoneal metastasis in gastric cancer patients using CT images

Authors

Mirniaharikandehei, Seyedehnafiseh; Heidari, Morteza; Danala, Gopichandh; Lakshmivarahan, Sivaramakrishnan; Zheng, Bin

Abstract

Background and Objective: Non-invasively predicting the risk of cancer metastasis before surgery plays an essential role in determining the optimal treatment for cancer patients, including who can benefit from neoadjuvant chemotherapy. Although developing radiomics-based machine learning (ML) models has attracted broad research interest for this purpose, building a high-performing and robust ML model from small and imbalanced image datasets remains a challenge. Methods: In this study, we explore a new approach to building an optimal ML model. A retrospective dataset of abdominal computed tomography (CT) images acquired from 159 patients diagnosed with gastric cancer is assembled. Among them, 121 cases have peritoneal metastasis (PM), while 38 cases do not. A computer-aided detection (CAD) scheme is first applied to segment primary gastric tumor volumes and to compute an initial pool of 315 image features. Then, two Gradient Boosting Machine (GBM) models are built to predict the risk of a patient having PM. Each is embedded with a synthetic minority oversampling technique and one of two feature dimensionality reduction methods: principal component analysis (PCA) or a random projection algorithm (RPA). All GBM models are trained and tested using a leave-one-case-out cross-validation method. Results: The GBM embedded with RPA yielded a significantly higher prediction accuracy (71.2%) than the GBM embedded with PCA (65.2%) (p<0.05). Conclusions: The study demonstrated that CT images of primary gastric tumors contain discriminatory information for predicting the risk of PM, and that RPA is a promising method for generating an optimal feature vector, improving the performance of ML models of medical images.
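The abstract does not say which random projection variant the authors use. As a rough illustration of the general idea only, the sketch below assumes a dense Gaussian projection matrix and applies it to synthetic vectors with 315 features, the size of the initial feature pool. The target dimension of 64, the helper names, and the synthetic data are all hypothetical and are not taken from the paper. The final check shows the property that makes random projection attractive for feature reduction: pairwise distances between cases are approximately preserved.

```python
import math
import random


def gaussian_projection_matrix(n_features, n_components, rng):
    """Draw a k x d matrix with i.i.d. N(0, 1/k) entries (assumed variant)."""
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
            for _ in range(n_components)]


def project(matrix, x):
    """Map a d-dimensional feature vector x to k dimensions."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(0)
N_FEATURES = 315    # size of the initial feature pool reported in the abstract
N_COMPONENTS = 64   # hypothetical target dimension, not taken from the paper
N_CASES = 8         # small synthetic sample, not patient data

# Synthetic stand-ins for per-case image feature vectors.
cases = [[rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
         for _ in range(N_CASES)]

R = gaussian_projection_matrix(N_FEATURES, N_COMPONENTS, rng)
reduced = [project(R, x) for x in cases]

# Ratio of projected distance to original distance for every pair of cases.
ratios = [euclidean(reduced[i], reduced[j]) / euclidean(cases[i], cases[j])
          for i in range(N_CASES) for j in range(i + 1, N_CASES)]
print(f"distance ratio after projection: "
      f"min={min(ratios):.2f}, max={max(ratios):.2f}")
```

Unlike PCA, the projection matrix here is drawn without looking at the data, so nothing is fitted on the training cases. In a pipeline like the one described, the reduced vectors would then be passed to oversampling and the classifier inside each cross-validation fold.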
