Paper title
An integrative sparse boosting analysis of cancer genomic commonality and difference
Paper authors
Paper abstract
In cancer research, high-throughput profiling has been conducted extensively. Recent studies have carried out integrative analysis of data on multiple cancer patient groups/subgroups. Such analysis has the potential to reveal genomic commonality as well as difference across groups/subgroups. However, in the existing literature, methods that pay special attention to genomic commonality and difference are very limited. In this study, a novel estimation and marker selection method based on the sparse boosting technique is developed to address the commonality/difference problem. In terms of technical innovation, a new penalty and computation of increments are introduced. The proposed method can also effectively accommodate the grouping structure of covariates. Simulation shows that it can outperform direct competitors under a wide spectrum of settings. Two TCGA (The Cancer Genome Atlas) datasets are analyzed, showing that the proposed analysis can identify markers with important biological implications and has satisfactory prediction performance and stability.