Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses

Paper title

Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses

Authors

Shai Gorsky, Cliburn Chan, Li Ma

Abstract

Flow cytometry (FCM) is the standard multi-parameter assay for measuring single-cell phenotype and functionality. It is commonly used for quantifying the relative frequencies of cell subsets in blood and disaggregated tissues. A typical analysis of FCM data involves cell classification (that is, the identification of cell subgroups in the sample) and comparisons of the cell subgroups across samples or conditions. While modern experiments often necessitate the collection and processing of samples in multiple batches, analysis of FCM data across batches is challenging because differences across samples may arise from either true biological variation or technical causes such as antibody lot effects or differences in instrument optics across batches. Thus a critical step in comparative analyses of multi-sample FCM data, yet one missing from existing automated methods for analyzing such data, is cross-sample calibration, whose goal is to align corresponding cell subsets across multiple samples in the presence of technical variation, so that biological variation can be meaningfully compared. We introduce a Bayesian nonparametric hierarchical modeling approach for accomplishing both calibration and cell classification simultaneously in a unified probabilistic manner. Three important features of our method make it particularly effective for analyzing multi-sample FCM data: a nonparametric mixture avoids prespecifying the number of cell clusters; a hierarchical skew normal kernel allows flexibility in the shapes of the cell subsets and cross-sample variation in their locations; and finally, the "coarsening" strategy makes inference robust to departures from the model, such as heavy-tailedness not captured by the skew normal kernels. We demonstrate the merits of our approach in simulated examples and carry out a case study in the analysis of two multi-sample FCM data sets.
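
To make the ingredients named in the abstract concrete, here is a minimal, illustrative sketch in Python. It is not the authors' implementation. It assumes a univariate skew normal kernel in the standard Azzalini form, a fixed number of mixture components rather than a nonparametric prior, and additive sample-specific location shifts to stand in for cross-sample variation. It also assumes that "coarsening" can be approximated by raising the likelihood to a power alpha / (alpha + n), as in the power-posterior approximation to coarsened posteriors; the abstract does not state this. All function and parameter names are hypothetical.

```python
import math


def skew_normal_pdf(x, loc, scale, shape):
    """Univariate skew normal density (Azzalini form).

    shape = 0 recovers the ordinary normal density.
    """
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal pdf at z
    Phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))  # standard normal cdf at shape * z
    return 2.0 * phi * Phi / scale


def mixture_log_likelihood(xs, weights, locs, scales, shapes, shifts):
    """Log-likelihood of one sample under a finite skew normal mixture.

    `locs` are locations shared across samples; `shifts` are this sample's
    component-specific offsets (an assumed, simplified stand-in for
    hierarchical cross-sample variation in cluster locations).
    """
    total = 0.0
    for x in xs:
        density = sum(
            w * skew_normal_pdf(x, m + d, s, a)
            for w, m, d, s, a in zip(weights, locs, shifts, scales, shapes)
        )
        total += math.log(density)
    return total


def coarsened_log_likelihood(log_likelihood, n, alpha):
    """Temper a log-likelihood by zeta = alpha / (alpha + n).

    Assumption: this power form is one common approximation to a coarsened
    posterior; smaller alpha discounts the data more strongly.
    """
    zeta = alpha / (alpha + n)
    return zeta * log_likelihood
```

In this sketch, tempering flattens the likelihood so that a few cells in a heavy tail have less influence on the fitted clusters, which is the kind of robustness to model departures that the abstract attributes to coarsening.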
