Paper title
Comparing Shape-Constrained Regression Algorithms for Data Validation
Paper authors
Abstract
Industrial and scientific applications handle data volumes so large that manual validation by humans is infeasible. We therefore require automated data validation approaches that can take the prior knowledge of domain experts into account to produce dependable, trustworthy assessments of data quality. Prior knowledge is often available as rules that describe how inputs interact with the target; for example, the target must be monotonically decreasing and convex over increasing input values. Domain experts are able to validate multiple such interactions at a glance. However, existing rule-based data validation approaches are unable to consider these constraints. In this work, we compare different shape-constrained regression algorithms for the purpose of data validation based on their classification accuracy and runtime performance.
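To make the idea of validating data against a shape constraint concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes isotonic regression (pool-adjacent-violators) as a stand-in for a shape-constrained regressor, enforces only the "monotonically decreasing" rule (not convexity), and uses a hypothetical RMSE threshold `max_rmse` to classify a series as valid or invalid. The function names are illustrative.

```python
from typing import List, Sequence


def fit_monotone_decreasing(y: Sequence[float]) -> List[float]:
    """Least-squares non-increasing fit to y (values assumed ordered by
    increasing input), computed with the pool-adjacent-violators algorithm."""
    blocks: List[List[float]] = []  # each block stores [sum, count]
    for value in y:
        blocks.append([float(value), 1.0])
        # Merge while the previous block's mean is below the latest block's
        # mean, since that would violate the non-increasing constraint.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def is_valid(y: Sequence[float], max_rmse: float) -> bool:
    """Hypothetical validation rule: accept the data if the constrained fit
    reproduces it with a root-mean-square error of at most max_rmse."""
    if not y:
        raise ValueError("y must contain at least one value")
    fitted = fit_monotone_decreasing(y)
    mse = sum((a - b) ** 2 for a, b in zip(y, fitted)) / len(y)
    return mse ** 0.5 <= max_rmse


clean = [5.0, 4.0, 3.5, 3.0, 2.9]   # already monotonically decreasing
faulty = [5.0, 4.0, 9.0, 3.0, 2.9]  # one reading breaks the expected shape
print(is_valid(clean, max_rmse=0.5), is_valid(faulty, max_rmse=0.5))
```

Data that already satisfies the constraint is reproduced exactly by the fit, so its error is zero, whereas a reading that breaks the expected shape forces the constrained model away from the data and raises the error. The threshold is an assumption that would need tuning for a real data set.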