Paper title
scICML: Information-theoretic Co-clustering-based Multi-view Learning for the Integrative Analysis of Single-cell Multi-omics data
Paper authors
Paper abstract
Modern high-throughput sequencing technologies have enabled us to profile multiple molecular modalities from the same single cell, providing unprecedented opportunities to assay cellular heterogeneity across multiple biological layers. However, the datasets generated by these technologies tend to have high levels of noise and to be highly sparse, which poses challenges for data analysis. In this paper, we develop a novel information-theoretic co-clustering-based multi-view learning (scICML) method for multi-omics single-cell data integration. scICML uses co-clustering to aggregate similar features within each view of the data and to uncover the clustering pattern of cells that is common to all views. In addition, scICML automatically matches the clusters of linked features across different data types, to account for the biological dependency structure across different types of genomic features. Our experiments on four real-world datasets demonstrate that scICML improves overall clustering performance and provides biological insights into the analysis of peripheral blood mononuclear cell data.
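To make the co-clustering idea in the abstract more concrete, the following is a minimal sketch of the quantity that information-theoretic co-clustering is commonly framed around: the mutual information lost when cells (rows) and features (columns) of one view are merged into clusters. It is not the scICML algorithm itself. The function names `mutual_information`, `aggregate`, and `information_loss`, the toy count matrix, and the single-view setting are all assumptions made for illustration; the multi-view coupling and the matching of linked feature clusters described in the abstract are not modeled here.

```python
import numpy as np


def mutual_information(p):
    """Mutual information (in nats) of a joint distribution given as a 2-D array.

    The input is normalized to sum to 1, so raw count matrices are accepted.
    """
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    px = p.sum(axis=1, keepdims=True)  # row marginal, shape (n_rows, 1)
    py = p.sum(axis=0, keepdims=True)  # column marginal, shape (1, n_cols)
    mask = p > 0                       # zero cells contribute nothing to the sum
    return float((p[mask] * np.log(p[mask] / (px @ py)[mask])).sum())


def aggregate(p, row_labels, col_labels):
    """Sum the joint distribution within each (row cluster, column cluster) block."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    R = np.eye(row_labels.max() + 1)[row_labels]  # one-hot row memberships
    C = np.eye(col_labels.max() + 1)[col_labels]  # one-hot column memberships
    return R.T @ p @ C


def information_loss(p, row_labels, col_labels):
    """Mutual information lost by replacing rows and columns with their clusters."""
    return mutual_information(p) - mutual_information(aggregate(p, row_labels, col_labels))


# Hypothetical toy view: 4 cells x 4 features with two clean blocks.
counts = np.array([
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 5, 5],
])
good = [0, 0, 1, 1]  # clusters that follow the block structure
bad = [0, 1, 0, 1]   # clusters that cut across the block structure

print(information_loss(counts, good, good))
print(information_loss(counts, bad, bad))
```

In this toy case, a co-clustering that follows the block structure preserves all of the mutual information between cells and features, while one that cuts across the blocks destroys it. A co-clustering procedure of this kind searches for row and column assignments that keep the loss small.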