Paper Title
A Multilingual Dataset of COVID-19 Vaccination Attitudes on Twitter
Paper Authors
Paper Abstract
Vaccine hesitancy is considered one of the main causes of the stagnant uptake ratio of COVID-19 vaccines in Europe and the US, where vaccines are sufficiently supplied. A fast and accurate grasp of public attitudes toward vaccination is critical to addressing vaccine hesitancy, and social media platforms have proved to be an effective source of public opinion. In this paper, we describe the collection and release of a dataset of tweets related to COVID-19 vaccines. The dataset consists of the IDs of 2,198,090 tweets collected from Western Europe, 17,934 of which are annotated with the originators' vaccination stances. Our annotation will facilitate the use and development of data-driven models to extract vaccination attitudes from social media posts, and thus further confirm the power of social media in public health surveillance. To lay the groundwork for future research, we not only perform statistical analysis and visualisation of our dataset, but also evaluate and compare the performance of established text-based benchmarks in vaccination stance extraction. We demonstrate one potential practical use of our data in tracking temporal changes in public COVID-19 vaccination attitudes.