Paper Title

Random survival forests with multivariate longitudinal endogenous covariates

Authors

Anthony Devaux, Catherine Helmer, Robin Genuer, Cécile Proust-Lima

Abstract

Predicting the individual risk of a clinical event using the complete patient history is still a major challenge for personalized medicine. Among the methods developed to compute individual dynamic predictions, joint models have the advantage of using all the available information while accounting for dropout. However, they are restricted to a very small number of longitudinal predictors. Our objective was to propose an innovative alternative solution to predict an event probability using a possibly large number of longitudinal predictors. We developed DynForest, an extension of competing-risk random survival forests that handles endogenous longitudinal predictors. At each node of the tree, the time-dependent predictors are translated into time-fixed features (using mixed models) to be used as candidates for splitting the subjects into two subgroups. The individual event probability is estimated in each tree by the Aalen-Johansen estimator of the leaf in which the subject is classified according to their history of predictors. The final individual prediction is given by the average of the tree-specific individual event probabilities. We carried out a simulation study to demonstrate the performance of DynForest both in a small dimensional context (in comparison with joint models) and in a large dimensional context (in comparison with a regression calibration method that ignores informative dropout). We also applied DynForest to (i) predict the individual probability of dementia in the elderly according to repeated measures of cognitive, functional, vascular and neurodegeneration markers, and (ii) quantify the importance of each type of marker for the prediction of dementia. Implemented in the R package DynForest, our methodology provides a novel and appropriate solution for the prediction of events from any number of longitudinal endogenous predictors.
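The prediction step described in the abstract (an Aalen-Johansen estimate within each leaf, then an average over trees) can be illustrated with a minimal Python sketch. This is not the DynForest implementation or its API: the function names `aalen_johansen_cif` and `forest_prediction` are hypothetical, the leaf in which the subject falls is assumed to be already known for each tree, and the splitting step based on mixed-model features is left out entirely.

```python
def aalen_johansen_cif(times, causes, cause, horizon):
    """Aalen-Johansen cumulative incidence of `cause` by `horizon`.

    times  : observed times of the training subjects in one leaf
    causes : 0 for censored, 1, 2, ... for the cause of the observed event
    (Hypothetical helper for illustration, not part of any package.)
    """
    # Distinct times at which at least one event (of any cause) occurred.
    event_times = sorted(
        {t for t, c in zip(times, causes) if c != 0 and t <= horizon}
    )
    surv = 1.0  # probability of being event-free just before the current time
    cif = 0.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        d_all = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        # Increment: stay event-free until t, then fail from `cause` at t.
        cif += surv * d_cause / at_risk
        # Update overall event-free survival with events of all causes.
        surv *= 1.0 - d_all / at_risk
    return cif


def forest_prediction(leaves, cause, horizon):
    """Average the leaf-level estimates over trees.

    leaves : one (times, causes) pair per tree, holding the training
             subjects of the leaf the new subject was assigned to
             (the assignment itself is assumed given here).
    """
    per_tree = [aalen_johansen_cif(t, c, cause, horizon) for t, c in leaves]
    return sum(per_tree) / len(per_tree)


# Toy data: two trees, hence two leaves for the same new subject.
leaf_tree_1 = ([1, 2, 3, 4], [1, 0, 2, 1])
leaf_tree_2 = ([2, 5], [1, 0])
risk = forest_prediction([leaf_tree_1, leaf_tree_2], cause=1, horizon=4)
```

With competing risks, the leaf-level cumulative incidences of all causes sum to one minus the overall event-free survival, which is why the increment at each event time is weighted by the all-cause survival rather than by a cause-specific one.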
