Paper title
Goodness of Fit Metrics for Multi-class Predictor
Paper authors
Paper abstract
Multi-class prediction has gained popularity in recent years. Measuring goodness of fit has therefore become a cardinal question that researchers often have to deal with. Several metrics are commonly used for this task. However, when deciding on the right measure, one must consider that different use cases impose different constraints that govern this decision. A leading constraint, at least in \emph{real-world} multi-class problems, is imbalanced data: multi-categorical problems rarely provide symmetrical data. Hence, when we observe common KPIs (key performance indicators), e.g., Precision-Sensitivity or Accuracy, we can seldom interpret the obtained numbers in terms of the model's actual needs. We suggest generalizing the Matthews correlation coefficient to multiple dimensions. This generalization is based on a geometrical interpretation of the generalized confusion matrix.
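As a rough illustration of the imbalance problem described in the abstract, the following Python sketch compares accuracy with the commonly used multi-class extension of the Matthews correlation coefficient, computed from the row and column totals of a confusion matrix (the same quantity that scikit-learn's `matthews_corrcoef` reports for multi-class labels). This is not the geometric generalization proposed in the paper. The function names `accuracy` and `multiclass_mcc` and the two example confusion matrices are assumptions made purely for illustration.

```python
import math


def accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def multiclass_mcc(cm):
    """Standard multi-class MCC from a confusion matrix.

    Rows are true classes, columns are predicted classes.
    Illustrative only: this is the widely used extension, not the
    generalization proposed in the paper.
    """
    n = len(cm)
    s = sum(sum(row) for row in cm)                           # total samples
    c = sum(cm[k][k] for k in range(n))                       # correctly classified
    t = [sum(cm[k]) for k in range(n)]                        # true count per class
    p = [sum(cm[i][k] for i in range(n)) for k in range(n)]   # predicted count per class

    numerator = c * s - sum(tk * pk for tk, pk in zip(t, p))
    denominator = math.sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    # Convention (assumed here): return 0.0 when the denominator vanishes,
    # e.g. when every sample is assigned to a single class.
    return numerator / denominator if denominator else 0.0


# Hypothetical imbalanced 3-class data: 90 / 5 / 5 samples per class.
majority_only = [[90, 0, 0], [5, 0, 0], [5, 0, 0]]   # always predicts class 0
perfect = [[90, 0, 0], [0, 5, 0], [0, 0, 5]]         # every sample correct

print(accuracy(majority_only), multiclass_mcc(majority_only))
print(accuracy(perfect), multiclass_mcc(perfect))
```

With these made-up matrices, the classifier that always predicts the majority class reaches an accuracy of 0.9 but a coefficient of 0.0, whereas the perfect classifier scores 1.0 on both. This gap is the kind of misleading KPI reading on imbalanced data that the abstract refers to.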