Paper Title
YouTubers Not madeForKids: Detecting Channels Sharing Inappropriate Videos Targeting Children
Paper Authors
Paper Abstract
In recent years, hundreds of new YouTube channels have been creating and sharing videos targeting children, with themes related to animation, superhero movies, comics, etc. Unfortunately, many of these videos are inappropriate for their target audience because of disturbing, violent, or sexual scenes. In this paper, we study YouTube channels that were found in the past to post suitable or disturbing videos targeting kids. We identify a clear discrepancy between what YouTube assumes and flags as inappropriate content and channels, and what is found to be disturbing content that is still available on the platform and targets kids. In particular, we find that almost 60% of the videos that were manually annotated and classified as disturbing by an older study in 2019 (a collection bootstrapped with Elsa and other keywords related to children's videos) are still available on YouTube in mid-2021. Meanwhile, 44% of the channels that uploaded such disturbing videos have yet to be suspended and their videos removed. For the first time in the literature, we also study the "madeForKids" flag, a new feature that YouTube introduced at the end of 2019, and compare its application to the channels that shared disturbing videos, as flagged by the previous study. Apparently, these channels are less likely to be set as "madeForKids" than those sharing suitable content. In addition, channels posting disturbing videos use channel features such as keywords, description, topics, posts, etc., to appeal to kids (e.g., using game-related keywords). Finally, we use a collection of such channel and content features to train ML classifiers that can detect, at channel creation time, whether a channel will be related to disturbing content uploads. These classifiers can help YouTube moderators reduce such incidents by pointing to potentially suspicious accounts without analyzing actual videos.
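To make the notion of channel-level features concrete, the following is a minimal sketch of how a channel record might be flattened into classifier inputs. It is not the authors' pipeline: the feature set, the `GAME_WORDS` list, the `channel_features` helper, and the sample record are all assumptions made for illustration. The field names (`status.madeForKids`, `brandingSettings.channel.keywords`, `topicDetails.topicCategories`) follow the YouTube Data API v3 channel resource as we understand it and should be checked against the current API reference.

```python
import re

# Hypothetical word list for illustration only; the abstract mentions
# game-related keywords but does not enumerate them.
GAME_WORDS = {"game", "games", "gaming", "minecraft", "roblox"}


def channel_features(resource):
    """Flatten one channel record (a dict) into a small feature dict."""
    status = resource.get("status", {})
    branding = resource.get("brandingSettings", {}).get("channel", {})
    topics = resource.get("topicDetails", {}).get("topicCategories", [])

    # Assumption: keywords are one space-separated string in which
    # multi-word keywords are wrapped in double quotes.
    raw = branding.get("keywords", "")
    keywords = [(quoted or bare).lower()
                for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', raw)]
    description = branding.get("description", "")

    return {
        "made_for_kids": bool(status.get("madeForKids", False)),
        "n_keywords": len(keywords),
        # Count keywords that contain at least one game-related word.
        "n_game_keywords": sum(1 for k in keywords
                               if GAME_WORDS & set(k.split())),
        "description_length": len(description),
        "n_topics": len(topics),
    }


# Invented sample record, not real data.
sample = {
    "status": {"madeForKids": False},
    "brandingSettings": {"channel": {
        "keywords": 'elsa "superhero cartoon" minecraft "kids games"',
        "description": "Fun videos for kids!",
    }},
    "topicDetails": {
        "topicCategories": ["https://en.wikipedia.org/wiki/Entertainment"],
    },
}
features = channel_features(sample)
print(features)
```

Because every feature here comes from channel metadata rather than video content, vectors like this could in principle be fed to any standard classifier without downloading or analyzing the videos themselves, which is the property the abstract highlights.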