Paper title
Integration of Survival Data from Multiple Studies
Authors
Abstract
We introduce a statistical procedure that integrates survival data from multiple biomedical studies to improve the accuracy of predictions of survival or other events based on individual clinical and genomic profiles, compared to models developed using only a single study or meta-analytic methods. The method accounts for potential differences in the relation between predictors and outcomes across studies, which can arise from distinct patient populations, treatments, and technologies for measuring outcomes and biomarkers. These differences are modeled explicitly with study-specific parameters. We use hierarchical regularization to shrink the study-specific parameters towards each other and to borrow information across studies. Shrinkage of the study-specific parameters is controlled by a similarity matrix, which summarizes differences and similarities of the relations between covariates and outcomes across studies. We illustrate the method in a simulation study and on a collection of gene-expression datasets in ovarian cancer. We show that the proposed model increases the accuracy of survival prediction compared to alternative meta-analytic methods.
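The idea of shrinking study-specific parameters towards each other under the control of a similarity matrix can be illustrated with a minimal sketch. The abstract does not state the likelihood or the penalty, so everything below is an assumption made for illustration and not the authors' formulation: a Cox partial likelihood per study, a squared-difference penalty between the coefficient vectors of each pair of studies, a similarity matrix `W` whose entry `W[s, t]` weights how strongly studies `s` and `t` are pulled together, and a tuning parameter `lam`. All function names are hypothetical.

```python
import numpy as np


def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox partial log-likelihood for one study.

    Assumes no tied event times. X is (n, p), time is (n,),
    event is (n,) with 1 = event observed, 0 = censored.
    """
    order = np.argsort(-time)              # descending follow-up time
    eta = X[order] @ beta                  # linear predictors
    observed = event[order].astype(bool)
    # After sorting, the risk set of subject i is subjects 0..i, so a
    # running log-sum-exp gives log(sum of exp(eta)) over each risk set.
    log_risk = np.logaddexp.accumulate(eta)
    return -np.sum(eta[observed] - log_risk[observed])


def similarity_penalty(betas, W):
    """Sum over study pairs s < t of W[s, t] * ||beta_s - beta_t||^2."""
    n_studies = betas.shape[0]
    total = 0.0
    for s in range(n_studies):
        for t in range(s + 1, n_studies):
            total += W[s, t] * np.sum((betas[s] - betas[t]) ** 2)
    return total


def objective(betas, studies, W, lam):
    """Penalized objective: per-study Cox losses plus pairwise shrinkage.

    betas is (n_studies, p); studies is a list of (X, time, event) tuples.
    """
    loss = sum(
        cox_neg_log_partial_likelihood(betas[s], *studies[s])
        for s in range(len(studies))
    )
    return loss + lam * similarity_penalty(betas, W)


# Toy usage with simulated data (illustrative values only).
rng = np.random.default_rng(0)
studies = []
for _ in range(3):
    X = rng.normal(size=(50, 4))
    time = rng.exponential(size=50)
    event = rng.integers(0, 2, size=50)
    studies.append((X, time, event))

W = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])            # studies 0 and 1 treated as similar
betas = rng.normal(scale=0.1, size=(3, 4))
value = objective(betas, studies, W, lam=2.0)
```

In this sketch, a large `W[s, t]` makes it costly for the two studies' coefficients to differ, so similar studies borrow more information from each other. Setting `lam = 0` recovers independent per-study fits. The objective could be passed to a generic optimizer after flattening `betas`.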