Paper Title
Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
Paper Authors
Abstract
Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note with an average of 3,000+ tokens. This task is challenging due to the high-dimensional space of multi-label assignment (155,000+ ICD code candidates) and the long-tail challenge: many ICD codes are infrequently assigned, yet these infrequent codes are clinically important. This study addresses the long-tail challenge by transforming this multi-label classification task into an autoregressive generation task. Specifically, we first introduce a novel pretraining objective to generate free-text diagnoses and procedures using the SOAP structure, the medical logic physicians use for note documentation. Second, instead of directly predicting in the high-dimensional space of ICD codes, our model generates lower-dimensional text descriptions, from which ICD codes are then inferred. Third, we design a novel prompt template for multi-label classification. We evaluate our Generation with Prompt model on the benchmark for all-code assignment (MIMIC-III-full) and the few-shot ICD code assignment evaluation benchmark (MIMIC-III-few). Experiments on MIMIC-III-few show that our model achieves a macro F1 of 30.2, which substantially outperforms the previous MIMIC-III-full SOTA model (macro F1 4.3) and the model specifically designed for the few/zero-shot setting (macro F1 18.7). Finally, we design a novel ensemble learner, a cross-attention reranker with prompts, to integrate the previous SOTA predictions and our best few-shot coding predictions. Experiments on MIMIC-III-full show that our ensemble learner substantially improves both macro and micro F1, from 10.4 to 14.6 and from 58.2 to 59.1, respectively.
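The second step above, inferring ICD codes from generated free-text descriptions, can be sketched with a minimal matching routine. This is a hypothetical illustration, not the paper's implementation: the toy code dictionary, the `difflib`-based string similarity, and the threshold are all illustrative assumptions.

```python
# Hypothetical sketch: map generated free-text descriptions back to ICD
# codes by matching against official code descriptions. The dictionary,
# similarity measure, and threshold are illustrative assumptions only.
from difflib import SequenceMatcher

# Toy ICD-9 code -> description dictionary (illustrative subset).
ICD_DESCRIPTIONS = {
    "401.9": "unspecified essential hypertension",
    "250.00": "diabetes mellitus without mention of complication",
    "428.0": "congestive heart failure unspecified",
}

def infer_icd_codes(generated_descriptions, threshold=0.5):
    """Assign each generated description the closest ICD code by
    string similarity; drop matches below the threshold."""
    codes = []
    for text in generated_descriptions:
        best_code, best_score = None, 0.0
        for code, desc in ICD_DESCRIPTIONS.items():
            score = SequenceMatcher(None, text.lower(), desc).ratio()
            if score > best_score:
                best_code, best_score = code, score
        if best_code is not None and best_score >= threshold:
            codes.append(best_code)
    return codes

print(infer_icd_codes(["essential hypertension, unspecified",
                       "congestive heart failure"]))
```

In practice, a learned text encoder would replace the character-level similarity used here, but the idea is the same: the model only has to generate a short natural-language description, and the mapping into the large ICD label space is handled by this separate, cheap matching step.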