Paper title
Automatic Text Summarization of COVID-19 Medical Research Articles using BERT and GPT-2
Paper authors
Abstract
With the COVID-19 pandemic, there is a growing urgency for the medical community to keep up with the accelerating growth of new coronavirus-related literature. In response, the COVID-19 Open Research Dataset Challenge has released a corpus of scholarly articles and is calling for machine learning approaches to help bridge the gap between researchers and the rapidly growing body of publications. Here, we take advantage of recent advances in pre-trained NLP models, BERT and OpenAI GPT-2, to address this challenge by performing text summarization on this dataset. We evaluate the results using ROUGE scores and visual inspection. Our model provides abstractive and comprehensive information based on keywords extracted from the original articles. Our work can help the medical community by providing succinct summaries of articles for which abstracts are not already available.
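
The abstract names ROUGE scores as the evaluation metric but does not say which variant or tooling was used. As a rough illustration only, the sketch below computes ROUGE-N precision, recall, and F1 from clipped n-gram overlap between a generated summary and a reference abstract. The function names `ngrams` and `rouge_n` are hypothetical, and the lowercased whitespace tokenization is an assumption of this sketch, not the authors' evaluation code.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall, and F1 for two strings.

    Assumption: tokens are produced by lowercasing and splitting on
    whitespace. Standard ROUGE tooling may also stem or filter tokens.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram,
    # which gives the clipped number of matching n-grams.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
```

In this toy pair, five of the six unigrams match, so ROUGE-1 precision and recall are both 5/6. Higher-order variants such as ROUGE-2 penalize differences in word order more strongly, which is one reason a score of this kind is usually paired with a visual inspection of the output.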