Paper Title

Predicting Regression Probability Distributions with Imperfect Data Through Optimal Transformations

Author

Friedman, Jerome H.

Abstract

The goal of regression analysis is to predict the value of a numeric outcome variable y given a vector x of joint values of other (predictor) variables. Usually a particular x-vector does not specify a repeatable value for y, but rather a probability distribution of possible y-values, p(y|x). This distribution has a location, scale and shape, all of which can depend on x, and all of which are needed to infer likely values for y given x. Regression methods usually assume that training data y-values are perfect numeric realizations from some well-behaved p(y|x). Often, actual training data y-values are discrete, truncated and/or arbitrarily censored. Regression procedures based on an optimal transformation strategy are presented for estimating the location, scale and shape of p(y|x) as general functions of x, in the possible presence of such imperfect training data. In addition, validation diagnostics are presented to ascertain the quality of the solutions.
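
To make the three kinds of imperfect y-values named in the abstract concrete, the following is a minimal illustrative sketch. It does not implement the paper's optimal transformation procedure. The Gaussian form of p(y|x), the coefficients, the bounds, and the function names `draw_y` and `make_imperfect` are all assumptions chosen only for illustration.

```python
import random

def draw_y(x, rng):
    """Draw y from an assumed p(y|x): Gaussian whose location and scale depend on x."""
    location = 2.0 * x          # assumed location function
    scale = 0.5 + abs(x)        # assumed scale function
    return rng.gauss(location, scale)

def make_imperfect(y, lower=-1.0, upper=3.0):
    """Return three degraded views of a clean value y (bounds are hypothetical)."""
    # Discrete: only a rounded value is recorded.
    discrete = round(y)
    # Censored: outside [lower, upper] only the containing interval is known.
    if y < lower:
        censored = (float("-inf"), lower)
    elif y > upper:
        censored = (upper, float("inf"))
    else:
        censored = (y, y)
    # Truncated: observations outside [lower, upper] never enter the data set.
    truncated = y if lower <= y <= upper else None
    return discrete, censored, truncated

rng = random.Random(0)
rows = []
for _ in range(5):
    x = rng.uniform(-1.0, 1.0)
    y = draw_y(x, rng)
    rows.append((x, y) + make_imperfect(y))
```

Each row holds x, the clean y, and its discrete, censored and truncated versions. The distinction that matters in practice is that a censored observation still tells us which interval y fell in, while a truncated one is missing from the training data altogether.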
