Paper Title

Coarse-Grain Cluster Analysis of Tensors with Application to Climate Biome Identification

Authors

Derek DeSantis, Phillip J. Wolfram, Katrina Bennett, Boian Alexandrov

Abstract

A tensor provides a concise way to codify the interdependence of complex data. Treating a tensor as a d-way array, each entry records the interaction between the different indices. Clustering provides a way to parse the complexity of the data into more readily understandable information. Clustering methods depend heavily on the choice of algorithm, as well as on the hyperparameters chosen for that algorithm. However, their sensitivity to the scale of the data is largely unknown. In this work, we apply the discrete wavelet transform to analyze the effects of coarse-graining on clustering tensor data. We are particularly interested in understanding how scale affects clustering of the Earth's climate system. The discrete wavelet transform allows classification of the Earth's climate across a multitude of spatiotemporal scales, and it is used to produce an ensemble of classification estimates rather than a single classification. Information-theoretic approaches are used to identify important length scales in clustering the L15 climate dataset. We also discover a sub-collection of the ensemble that spans the majority of the variance observed, allowing for efficient consensus clustering techniques that can be used to identify climate biomes.
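The pipeline outlined in the abstract (coarse-grain with a wavelet transform, cluster at each scale, compare the resulting labelings with an information-theoretic measure) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the Haar wavelet, a plain k-means with farthest-first initialization, normalized mutual information as the comparison measure, the synthetic (latitude, longitude, time) tensor, and all function names. Axis lengths are assumed to be divisible by the required powers of two.

```python
import numpy as np

def haar_coarsen(x, axis, levels):
    """Keep only Haar approximation coefficients along one axis (assumes even lengths)."""
    for _ in range(levels):
        x = np.moveaxis(x, axis, 0)
        x = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # pairwise sum, orthonormal scaling
        x = np.moveaxis(x, 0, axis)
    return x

def kmeans_labels(features, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    chosen = [0]
    d = ((features - features[0]) ** 2).sum(axis=1)
    while len(chosen) < k:
        chosen.append(int(d.argmax()))
        d = np.minimum(d, ((features - features[chosen[-1]]) ** 2).sum(axis=1))
    centers = features[chosen].astype(float)
    for _ in range(n_iter):
        dists = ((features[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

def normalized_mutual_information(a, b):
    """NMI between two labelings; invariant to relabeling of clusters."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1.0)  # contingency table of co-occurring labels
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(pa, pb)[nz])).sum()
    ha = -(pa[pa > 0] * np.log(pa[pa > 0])).sum()
    hb = -(pb[pb > 0] * np.log(pb[pb > 0])).sum()
    if ha == 0 or hb == 0:
        return float(ha == hb)  # degenerate single-cluster labelings
    return mi / np.sqrt(ha * hb)

def clustering_ensemble(tensor, k, space_levels, time_levels):
    """Cluster grid cells at each (space, time) coarsening; labels are mapped back to the fine grid."""
    ensemble = {}
    for s in space_levels:
        for t in time_levels:
            coarse = haar_coarsen(haar_coarsen(haar_coarsen(tensor, 0, s), 1, s), 2, t)
            labels = kmeans_labels(coarse.reshape(-1, coarse.shape[2]), k)
            grid = labels.reshape(coarse.shape[:2])
            grid = np.repeat(np.repeat(grid, 2 ** s, axis=0), 2 ** s, axis=1)
            ensemble[(s, t)] = grid.ravel()
    return ensemble

# Synthetic stand-in for a climate tensor: two regions with different mean values.
rng = np.random.default_rng(0)
tensor = rng.normal(size=(16, 16, 64))
tensor[:, 8:, :] += 5.0

ensemble = clustering_ensemble(tensor, k=2, space_levels=(0, 1, 2), time_levels=(0, 2, 4))
keys = sorted(ensemble)
nmi = np.array([[normalized_mutual_information(ensemble[p], ensemble[q]) for q in keys]
                for p in keys])
print(np.round(nmi, 2))
```

Each entry of `nmi` compares the labelings obtained at two coarse-graining levels. In a sketch like this, pairs of scales with low agreement would mark the scales at which the clustering changes, and groups of mutually similar labelings suggest which ensemble members could be dropped before a consensus step.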
