Paper title
Aitchison's Compositional Data Analysis 40 Years On: A Reappraisal
Paper authors
Paper abstract
The development of John Aitchison's approach to compositional data analysis is traced from his paper read to the Royal Statistical Society in 1982. Aitchison's logratio approach, which was proposed to solve the problems of working with data subject to a fixed-sum constraint, is summarized and reappraised. It is maintained that the properties on which this approach was originally built, the main one being subcompositional coherence, need not be satisfied exactly: quasi-coherence, that is, being near enough to coherent for all practical purposes, is sufficient. This opens up the field to simpler data transformations, such as power transformations, that permit zero values in the data. The additional property of exact isometry, which was introduced subsequently and was not part of Aitchison's original conception, imposed the use of isometric logratio transformations, but these are complicated and problematic to interpret because they involve ratios of geometric means. If this property is regarded as important in certain analytical contexts, for example unsupervised learning, it can be relaxed by showing that regular pairwise logratios, as well as the alternative quasi-coherent transformations, can also be quasi-isometric, meaning they are close enough to exact isometry for all practical purposes. It is concluded that the isometric and related logratio transformations, such as pivot logratios, are not a prerequisite for good practice, although many authors insist on their obligatory use. This conclusion is fully supported here by case studies in geochemistry and genomics, which demonstrate the good performance of pairwise logratios, as originally proposed by Aitchison, or of Box-Cox power transforms of the original compositions, for which no zero replacement is necessary.
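As an illustration of two ideas named in the abstract, the following minimal sketch shows (a) that pairwise logratios are subcompositionally coherent, meaning they are unchanged when a subset of parts is re-closed to sum to 1, and (b) that a plain Box-Cox power transform stays finite at zero and approaches the logarithm as the power tends to 0. The function names `close`, `pairwise_logratios`, and `box_cox`, and the example composition, are assumptions made for illustration only; they are not taken from the paper, and the paper's exact power-transform variant may differ.

```python
import numpy as np
from itertools import combinations


def close(x):
    """Rescale a vector of positive parts so that it sums to 1 (closure)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()


def pairwise_logratios(x):
    """Return {(j, k): log(x[j] / x[k])} for all pairs j < k of a 1-D composition."""
    x = np.asarray(x, dtype=float)
    return {(j, k): np.log(x[j] / x[k])
            for j, k in combinations(range(len(x)), 2)}


def box_cox(x, lam):
    """Plain Box-Cox power transform (x**lam - 1) / lam, for lam > 0.

    Unlike the logarithm, this is finite at x = 0.
    """
    x = np.asarray(x, dtype=float)
    return (x ** lam - 1.0) / lam


# Hypothetical 4-part composition and the subcomposition of its first 3 parts.
full = close([0.5, 0.3, 0.15, 0.05])
sub = close(full[:3])

lr_full = pairwise_logratios(full)
lr_sub = pairwise_logratios(sub)

# Subcompositional coherence: logratios among parts 0, 1, 2 do not depend
# on whether part 3 is present, because the closure constant cancels.
coherent = all(np.isclose(lr_sub[p], lr_full[p]) for p in lr_sub)

# The power transform tolerates a zero part, and for a small power it is
# numerically close to the log transform on the positive parts.
zero_value = box_cox(0.0, 0.5)
near_log = np.allclose(box_cox(full, 1e-6), np.log(full), atol=1e-5)
```

Here `coherent` and `near_log` are booleans that summarize the two checks, and `zero_value` is the transform of a zero part with power 0.5. A larger power moves the transform further from the logarithm, which is the trade-off that the notion of quasi-coherence is meant to quantify.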