Paper Title

Calibrated regression estimation using empirical likelihood under data fusion

Authors

Wei Li, Shanshan Luo, Wangli Xu

Abstract

Data analysis based on information from several sources is common in economic and biomedical studies. This setting is often referred to as the data fusion problem, which differs from traditional missing data problems since no complete data is observed for any subject. We consider a regression analysis when the outcome variable and some covariates are collected from two different sources. By leveraging the common variables observed in both data sets, doubly robust estimation procedures are proposed in the literature to protect against possible model misspecifications. However, they employ only a single propensity score model for the data fusion process and a single imputation model for the covariates available in one data set. It may be questionable to assume that either model is correctly specified in practice. We therefore propose an approach that calibrates multiple propensity score and imputation models to gain more protection based on empirical likelihood methods. The resulting estimator is consistent when any one of those models is correctly specified and is robust against extreme values of the fitted propensity scores. We also establish its asymptotic normality property and discuss the semiparametric estimation efficiency. Simulation studies show that the proposed estimator has substantial advantages over existing doubly robust estimators, and an assembled U.S. household expenditure data example is used for illustration.
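
The abstract does not spell out the calibration step, so the following is only a generic sketch of how empirical-likelihood weights are typically computed, not the authors' estimator. It assumes an `n x k` matrix `g` of constraint functions that should average to zero under the weights; in a calibration setting these would be built from the fitted propensity score and imputation models, but here `g` is filled with random placeholder values. The function name `el_weights` and all parameters are hypothetical.

```python
import numpy as np

def el_weights(g, n_iter=50, tol=1e-10):
    """Empirical-likelihood weights w_i = 1 / (n * (1 + lam' g_i)).

    Maximizes sum(log w_i) subject to sum(w_i) = 1 and sum(w_i * g_i) = 0
    by minimizing the convex dual f(lam) = -sum(log(1 + lam' g_i)) with a
    damped Newton iteration. Requires 0 to lie inside the convex hull of
    the rows of g.
    """
    g = np.asarray(g, dtype=float)
    n, k = g.shape
    lam = np.zeros(k)
    for _ in range(n_iter):
        denom = 1.0 + g @ lam
        scaled = g / denom[:, None]
        grad = -scaled.sum(axis=0)          # gradient of f at lam
        if np.linalg.norm(grad) < tol:
            break
        hess = scaled.T @ scaled            # Hessian of f at lam
        step = np.linalg.solve(hess, -grad)
        f_old = -np.log(denom).sum()
        t = 1.0
        for _ in range(60):                 # backtrack to stay feasible
            cand = lam + t * step
            d = 1.0 + g @ cand
            if np.all(d > 0) and -np.log(d).sum() <= f_old:
                lam = cand
                break
            t *= 0.5
    return 1.0 / (n * (1.0 + g @ lam))

# Placeholder constraints (assumption): random values with a small nonzero mean.
rng = np.random.default_rng(0)
g = rng.normal(size=(200, 2)) + 0.1
w = el_weights(g)
```

The resulting weights are positive, sum to one, and reweight the sample so that each column of `g` has weighted mean zero. Because the weights are not formed by directly inverting a fitted propensity score, schemes of this kind tend to be less sensitive to extreme fitted values, which is the robustness property the abstract mentions.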
