Paper Title
A Semi-Supervised Generative Adversarial Network for Prediction of Genetic Disease Outcomes
Paper Authors
Abstract
For most diseases, building large databases of labeled genetic data is expensive and time-consuming. To address this, we introduce genetic Generative Adversarial Networks (gGAN), a semi-supervised approach based on an innovative GAN architecture that creates large synthetic genetic datasets starting from a small amount of labeled data and a large amount of unlabeled data. Our goal is to determine the propensity of a new individual to develop the severe form of the illness from their genetic profile alone. The proposed model achieved satisfactory results using real genetic data from different datasets and populations, in which the test populations may not share the same genetic profiles. The proposed model is self-aware: it can determine whether a new genetic profile is sufficiently compatible with the data on which the network was trained and is thus suitable for prediction. The code and datasets used can be found at https://github.com/caio-davi/gGAN.
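The abstract does not say how the compatibility check is implemented. As a rough illustration only, the sketch below uses a common semi-supervised GAN formulation in which the discriminator emits K class logits and the logit for "synthetic" is fixed at 0, so the probability that an input is real is D(x) = Z / (Z + 1) with Z = sum_k exp(logit_k). Whether gGAN uses this exact formulation is an assumption; the function names, the "mild"/"severe" labels, and the 0.5 threshold are all hypothetical and do not come from the paper or its repository.

```python
import math


def class_probs_and_real_score(logits):
    """Turn K class logits into class probabilities and a real-vs-synthetic score.

    Assumption (not confirmed by the abstract): the "synthetic" logit is
    fixed at 0, so D(x) = Z / (Z + 1) where Z = sum(exp(logit_k)).
    """
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]  # shifted for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]         # softmax over the K real classes

    # D(x) = Z / (Z + 1) = sigmoid(log Z), with log Z = m + log(sum exp(l - m)).
    log_z = m + math.log(total)
    if log_z >= 0:
        real_score = 1.0 / (1.0 + math.exp(-log_z))
    else:
        e = math.exp(log_z)
        real_score = e / (1.0 + e)
    return probs, real_score


def predict_or_abstain(logits, labels=("mild", "severe"), threshold=0.5):
    """Return (label, real_score), or (None, real_score) when the profile
    looks incompatible with the training data. Labels and threshold are
    hypothetical placeholders."""
    probs, real_score = class_probs_and_real_score(logits)
    if real_score < threshold:
        return None, real_score               # abstain: profile looks out of distribution
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], real_score
```

Under these assumptions, a profile whose class logits are all strongly negative receives a low real score and the sketch abstains rather than predicting, which is one way to read the "self-aware" behavior the abstract describes.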