Paper Title


Transfer learning of regression models from a sequence of datasets by penalized estimation

Paper Authors

van Wieringen, Wessel N., Binder, Harald

Paper Abstract


Transfer learning refers to the promising idea of initializing model fits based on pre-training on other data. We particularly consider regression modeling settings where parameter estimates from previous data can be used as anchoring points, yet may not be available for all parameters, thus covariance information cannot be reused. A procedure that updates through targeted penalized estimation, which shrinks the estimator towards a nonzero value, is presented. The parameter estimate from the previous data serves as this nonzero value when an update is sought from novel data. This naturally extends to a sequence of data sets with the same response, but potentially only partial overlap in covariates. The iteratively updated regression parameter estimator is shown to be asymptotically unbiased and consistent. The penalty parameter is chosen through constrained cross-validated loglikelihood optimization. The constraint bounds the amount of shrinkage of the updated estimator toward the current one from below. The bound aims to preserve the (updated) estimator's goodness-of-fit on all-but-the-novel data. The proposed approach is compared to other regression modeling procedures. Finally, it is illustrated on an epidemiological study where the data arrive in batches with different covariate-availability and the model is re-fitted with the availability of a novel batch.
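The targeted penalized estimation described in the abstract shrinks the regression estimate toward a nonzero anchor (the estimate from previous data) rather than toward zero. A minimal sketch of this idea, assuming a ridge-type penalty with a hand-picked penalty parameter `lam` (the paper itself selects it via constrained cross-validated loglikelihood, which is not reproduced here):

```python
import numpy as np

def targeted_ridge(X, y, beta0, lam):
    """Ridge estimate shrunk toward a nonzero target beta0.

    Minimizes ||y - X b||^2 + lam * ||b - beta0||^2, whose
    closed-form solution is (X'X + lam*I)^{-1} (X'y + lam*beta0).
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * beta0)

# Toy illustration: a previous batch's estimate serves as the anchor.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=50)

beta_prev = np.zeros(3)  # stand-in for an estimate from earlier data
beta_hat = targeted_ridge(X, y, beta_prev, lam=1.0)

# As lam grows, the estimate collapses onto the anchor beta0.
beta_large_lam = targeted_ridge(X, y, beta_prev, lam=1e9)
```

With `lam = 0` this reduces to ordinary least squares, and as `lam` grows the estimate is pulled toward `beta0`; in the sequential-batch setting of the paper, each batch's fitted coefficients become the anchor for the next update.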
