Paper Title
ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks
Paper Authors
Paper Abstract
In this paper, we present ArCOV-19, an Arabic COVID-19 Twitter dataset that spans one year, covering the period from 27 January 2020 to 31 January 2021. ArCOV-19 is the first publicly available Arabic Twitter dataset covering the COVID-19 pandemic that includes about 2.7M tweets alongside the propagation networks of the most popular subset of them (i.e., the most retweeted and most liked). The propagation networks include both retweets and conversational threads (i.e., threads of replies). ArCOV-19 is designed to enable research in several domains, including natural language processing, information retrieval, and social computing. Preliminary analysis shows that ArCOV-19 captures rising discussions associated with the first reported cases of the disease as they appeared in the Arab world. In addition to the source tweets and propagation networks, we also release the search queries and the language-independent crawler used to collect the tweets to encourage the curation of similar datasets.