Paper title
Ensemble Machine Learning Methods for Modeling COVID19 Deaths
Paper authors
Paper abstract
Using a hybrid of machine learning and epidemiological approaches, we propose a novel data-driven approach to predicting US COVID-19 deaths at the county level. Rather than mean deaths, the model outputs quantile estimates, giving a more complete description of the daily death distribution; its objective is to minimize the pinball loss on deaths reported in the New York Times coronavirus county dataset. The resulting quantile estimates accurately forecast deaths at the individual-county level over a variable-length forecast period, and the approach generalizes well across different forecast period lengths. We won the Caltech-run modeling competition against 50+ teams, and our aggregate is competitive with the best COVID-19 modeling systems (on root mean squared error).
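The pinball loss named in the abstract can be illustrated with a short sketch. This follows the standard textbook definition of the loss for a single quantile level and is not the authors' implementation; the function name `pinball_loss` and the sample death counts are hypothetical, chosen only for illustration.

```python
import numpy as np


def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for one quantile level q in (0, 1).

    Under-prediction (y_true > y_pred) is weighted by q,
    over-prediction (y_true < y_pred) by (1 - q).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # max(q * diff, (q - 1) * diff) selects the right branch for each sign of diff
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


# Hypothetical daily death counts for one county (not real data)
observed = [10, 12, 9]
forecast_q90 = [14, 15, 13]  # an assumed 0.9-quantile forecast

print(pinball_loss(observed, forecast_q90, q=0.9))
```

At a high quantile level such as 0.9, forecasts that fall below the observed count are penalized nine times more heavily than forecasts that fall above it by the same amount, which is what pushes the estimate toward the upper tail of the distribution.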