Paper Title
A Machine Learning Model for Predicting, Diagnosing, and Mitigating Health Disparities in Hospital Readmission
Authors
Abstract
The management of hyperglycemia in hospitalized patients has a significant impact on both morbidity and mortality. It is therefore important to predict which diabetic patients will need hospitalization. However, using standard machine learning approaches to make these predictions may result in health disparities caused by biases in the data related to social determinants (such as race, age, and gender). These biases must be removed early in the data collection process, before they enter the system and are reinforced by model predictions, which would bias the model's decisions. In this paper, we propose a machine learning pipeline capable of making predictions as well as detecting and mitigating biases in the data and model predictions. This pipeline analyzes the clinical data and determines whether biases exist in the data; if so, it removes those biases before making predictions. We evaluate the performance of the proposed method on a clinical dataset using accuracy and fairness measures. The results show that when we mitigate biases early, during data ingestion, we get fairer predictions.
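The abstract does not name the fairness measure or the mitigation technique the pipeline uses. As an illustration only, the sketch below assumes one common pairing: statistical parity difference as the bias check, and sample reweighing (weights of the form P(group) * P(label) / P(group, label)) as the mitigation applied before training. The function names, group names, and toy data are hypothetical and do not come from the paper.

```python
from collections import Counter


def statistical_parity_difference(groups, labels, unprivileged, privileged, weights=None):
    """Return P(y=1 | unprivileged) - P(y=1 | privileged), optionally sample-weighted."""
    if weights is None:
        weights = [1.0] * len(labels)

    def positive_rate(group):
        total = sum(w for g, w in zip(groups, weights) if g == group)
        positive = sum(
            w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
        )
        return positive / total

    return positive_rate(unprivileged) - positive_rate(privileged)


def reweighing_weights(groups, labels):
    """Assign each sample the weight P(group) * P(label) / P(group, label).

    Under these weights, group membership and label are statistically
    independent in the weighted data.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Toy data (hypothetical): group "a" receives the positive label more often than "b".
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

# Bias check on the raw data, then again after reweighing.
before = statistical_parity_difference(groups, labels, "b", "a")
weights = reweighing_weights(groups, labels)
after = statistical_parity_difference(groups, labels, "b", "a", weights)
print(f"before: {before:.3f}, after: {after:.3f}")
```

In a pipeline of the kind described, weights like these could be passed as per-sample weights to a classifier's training routine, so that the bias is addressed at data ingestion rather than after prediction.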