Paper Title

A large-scale COVID-19 Twitter chatter dataset for open scientific research -- an international collaboration

Authors

Banda, Juan M., Tekumalla, Ramya, Wang, Guanyu, Yu, Jingyuan, Liu, Tuo, Ding, Yuning, Artemova, Katya, Tutubalina, Elena, Chowell, Gerardo

Abstract

As the COVID-19 pandemic continues its march around the world, an unprecedented amount of open data is being generated for genetics and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated in the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique world-wide event into biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 152 million tweets, growing daily, related to COVID-19 chatter generated from January 1st to April 4th at the time of writing. This open dataset will allow researchers to conduct a number of research projects relating to the emotional and mental responses to social distancing measures, the identification of sources of misinformation, and the stratified measurement of sentiment towards the pandemic in near real time.
