Paper title
TCA and TLRA: A comparison on contingency tables and compositional data
Paper authors
Paper abstract
There are two popular general approaches to the analysis and visualization of a contingency table or a compositional data set: correspondence analysis (CA) and log-ratio analysis (LRA). LRA comprises two independently well-developed methods: association models and compositional data analysis. Applying either CA or LRA to a contingency table or to a compositional data set involves a preprocessing centering step. In CA the centering step is multiplicative, while in LRA it is log bi-additive. The preprocessed matrix is double-centered, so it is a residual matrix, which implies that the choice of preprocessing affects the final results of the analysis. This paper introduces a novel index, named the intrinsic measure of the quality of the signs of the residuals (QSR), for choosing the preprocessing and, consequently, the method. The criterion is based on the taxicab singular value decomposition (TSVD), on which the R package TaxicabCA is built. We present a minimal R script that can be executed to obtain the numerical results and the maps in this paper. Three relatively small data sets freely available on the web are used as examples.
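To illustrate the two centering steps contrasted in the abstract, the following is a minimal sketch in Python rather than the authors' R script. It assumes the CA residuals are taken as p_ij - r_i c_j (departure from the multiplicative independence model) and the LRA residuals as the log-counts double-centered with uniform weights; the weighting scheme, the function names, and the toy table are all illustrative assumptions, not taken from the paper. The QSR index itself is not computed, since the abstract does not define it.

```python
import math

def ca_residuals(table):
    """CA-style centering (assumed form): p_ij - r_i * c_j."""
    total = sum(sum(row) for row in table)
    p = [[x / total for x in row] for row in table]
    r = [sum(row) for row in p]            # row margins
    c = [sum(col) for col in zip(*p)]      # column margins
    return [[p[i][j] - r[i] * c[j] for j in range(len(c))]
            for i in range(len(r))]

def lra_residuals(table):
    """LRA-style centering (assumed uniform weights):
    log x_ij - row mean - column mean + grand mean.
    Requires strictly positive entries."""
    logs = [[math.log(x) for x in row] for row in table]
    n_rows, n_cols = len(logs), len(logs[0])
    row_means = [sum(row) / n_cols for row in logs]
    col_means = [sum(col) / n_rows for col in zip(*logs)]
    grand = sum(row_means) / n_rows
    return [[logs[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]

def max_abs_margin(matrix):
    """Largest absolute row or column sum; near 0 if double-centered."""
    row_sums = [abs(sum(row)) for row in matrix]
    col_sums = [abs(sum(col)) for col in zip(*matrix)]
    return max(row_sums + col_sums)

def signs(matrix):
    """Sign pattern (+1, 0, -1) of a residual matrix."""
    return [[(x > 0) - (x < 0) for x in row] for row in matrix]

# Hypothetical toy contingency table (made-up counts, not from the paper).
table = [[10, 20, 30],
         [20, 10, 10],
         [5, 15, 30]]

ca = ca_residuals(table)
lra = lra_residuals(table)

print("CA margins ~ 0:", max_abs_margin(ca) < 1e-12)
print("LRA margins ~ 0:", max_abs_margin(lra) < 1e-12)
print("CA signs: ", signs(ca))
print("LRA signs:", signs(lra))
```

Both residual matrices have row and column sums equal to zero up to rounding, which is the double-centering property mentioned above. Their sign patterns need not agree, which is the kind of difference a sign-based criterion could examine.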