Paper Title

Weakly-supervised Fine-grained Event Recognition on Social Media Texts for Disaster Management

Paper Authors

Wenlin Yao, Cheng Zhang, Shiva Saravanan, Ruihong Huang, Ali Mostafavi

Paper Abstract

People increasingly use social media to report emergencies, seek help or share information during disasters, which makes social networks an important tool for disaster management. To meet these time-critical needs, we present a weakly supervised approach for rapidly building high-quality classifiers that label each individual Twitter message with fine-grained event categories. Most importantly, we propose a novel method to create high-quality labeled data in a timely manner that automatically clusters tweets containing an event keyword and asks a domain expert to disambiguate event word senses and label clusters quickly. In addition, to process extremely noisy and often rather short user-generated messages, we enrich tweet representations using preceding context tweets and reply tweets in building event recognition classifiers. The evaluation on two hurricanes, Harvey and Florence, shows that using only 1-2 person-hours of human supervision, the rapidly trained weakly supervised classifiers outperform supervised classifiers trained using more than ten thousand annotated tweets created in over 50 person-hours.
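
The labeling idea in the abstract (cluster tweets that contain an event keyword, then have an expert label whole clusters rather than single tweets) can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy Jaccard clustering, the `threshold` value, the function names, and the `utility_failure` category are all assumptions chosen for illustration, and the abstract does not say which clustering algorithm or label set the paper uses.

```python
def tokenize(text):
    """Lowercase a tweet and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_keyword_tweets(tweets, keyword, threshold=0.3):
    """Greedy single-pass clustering of tweets that contain `keyword`.

    Assumption: word overlap of the surrounding context stands in for
    whatever similarity measure the paper actually uses.
    """
    clusters = []  # each cluster: {"tokens": set, "members": [tweet, ...]}
    for tweet in tweets:
        tokens = tokenize(tweet)
        if keyword not in tokens:
            continue  # only tweets mentioning the event keyword are clustered
        context = tokens - {keyword}
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(context, cluster["tokens"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(tweet)
            best["tokens"] |= context
        else:
            clusters.append({"tokens": set(context), "members": [tweet]})
    return [cluster["members"] for cluster in clusters]


def propagate_labels(clusters, expert_labels):
    """Give every tweet its cluster's expert label; drop clusters labeled None."""
    labeled = []
    for members, label in zip(clusters, expert_labels):
        if label is not None:
            labeled.extend((tweet, label) for tweet in members)
    return labeled


tweets = [
    "Power is out on our street after the storm",
    "Power is out on our block since the storm",
    "The power of prayer gets us through",
    "No water here today",
]
clusters = cluster_keyword_tweets(tweets, "power")
# The expert reviews one decision per cluster: the first cluster uses the
# event sense of "power", the second does not and is discarded.
labeled = propagate_labels(clusters, ["utility_failure", None])
```

In this toy run the two outage tweets share most of their context words and land in one cluster, while the "power of prayer" tweet forms its own cluster that the expert rejects, so a single judgment yields two labeled training tweets and filters out an off-sense use of the keyword.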
