Paper title
What time is it? Temporal Analysis of Novels
Paper authors
Paper abstract
Recognizing the flow of time in a story is a crucial aspect of understanding it. Prior work on time has focused primarily on identifying temporal expressions or the relative sequencing of events. Here we instead propose computationally annotating each line of a book with a wall-clock time, even in the absence of explicit time-descriptive phrases. To do so, we construct a data set of hourly time phrases from 52,183 fictional books. We then construct a time-of-day classification model that achieves an average error of 2.27 hours. Furthermore, we show that by analyzing a book as a whole using dynamic programming over breakpoints, we can roughly partition the book into segments that each correspond to a particular time of day. This approach improves upon baselines by over two hours. Finally, we apply our model to a corpus of literature categorized by historical period to show trends in hourly activity over time. Among several observations, we find that the fraction of events taking place after 10 P.M. jumps after 1880, coincident with the advent of the electric light bulb and city lights.
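The abstract does not spell out the breakpoint formulation, so the following is only a minimal sketch of one common way to segment a sequence with dynamic programming. It assumes a hypothetical per-line cost table `costs[i][h]` (for example, the negative log-probability a classifier assigns to hour `h` for line `i`) and a hypothetical fixed `penalty` charged for each breakpoint. The function name `segment_book` and both parameters are illustrative and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def segment_book(
    costs: Sequence[Sequence[float]], penalty: float
) -> List[Tuple[int, int, int]]:
    """Partition lines into contiguous segments, one time label per segment.

    costs[i][h] is the (assumed) cost of giving line i the label h.
    penalty is the (assumed) cost paid every time the label changes.
    Returns (start, end_exclusive, label) triples covering all lines.
    """
    n = len(costs)
    if n == 0:
        return []
    hours = len(costs[0])

    # best[i][h]: cheapest total cost of labelling lines 0..i with line i = h.
    best = [list(costs[0])]
    back = [[-1] * hours]
    for i in range(1, n):
        prev = best[-1]
        cheapest = min(range(hours), key=prev.__getitem__)
        row, ptr = [], []
        for h in range(hours):
            stay = prev[h]                     # keep the same label
            switch = prev[cheapest] + penalty  # open a new segment here
            if stay <= switch:
                row.append(costs[i][h] + stay)
                ptr.append(h)
            else:
                row.append(costs[i][h] + switch)
                ptr.append(cheapest)
        best.append(row)
        back.append(ptr)

    # Backtrack from the cheapest final label to recover one label per line.
    h = min(range(hours), key=best[-1].__getitem__)
    labels = [0] * n
    for i in range(n - 1, -1, -1):
        labels[i] = h
        if i > 0:
            h = back[i][h]

    # Collapse runs of identical labels into segments.
    segments = []
    start = 0
    for i in range(1, n + 1):
        if i == n or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments
```

In this sketch a larger `penalty` yields fewer, longer segments, while a penalty of zero reduces to labelling each line independently. With 24 labels the recurrence runs in time proportional to the number of lines times 24.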