Longitudinal Variational Autoencoder

Paper Title

Longitudinal Variational Autoencoder

Authors

Ramchandran, Siddharth; Tikhonov, Gleb; Kujanpää, Kalle; Koskinen, Miika; Lähdesmäki, Harri

Abstract

Longitudinal datasets, measured repeatedly over time from individual subjects, arise in many biomedical, psychological, social, and other studies. A common approach to analyse high-dimensional data that contains missing values is to learn a low-dimensional representation using variational autoencoders (VAEs). However, standard VAEs assume that the learnt representations are i.i.d., and fail to capture the correlations between the data samples. We propose the Longitudinal VAE (L-VAE), which uses a multi-output additive Gaussian process (GP) prior to extend the VAE's capability to learn structured low-dimensional representations imposed by auxiliary covariate information, and we derive a new KL divergence upper bound for such GPs. Our approach can simultaneously accommodate both time-varying shared and random effects, produce structured low-dimensional representations, disentangle effects of individual covariates or their interactions, and achieve highly accurate predictive performance. We compare our model against previous methods on synthetic as well as clinical datasets, and demonstrate state-of-the-art performance in data imputation, reconstruction, and long-term prediction tasks.
