Paper title
Efficient Enumeration Algorithms for Annotated Grammars
Paper authors
Abstract
We introduce annotated grammars, an extension of context-free grammars which allows annotations on terminals. Our model extends the standard notion of regular spanners, and is more expressive than the extraction grammars recently introduced by Peterfreund. We study the enumeration problem for annotated grammars: fixing a grammar, and given a string as input, enumerate all annotations of the string that form a word derivable from the grammar. Our first result is an algorithm for unambiguous annotated grammars, which preprocesses the input string in cubic time and enumerates all annotations with output-linear delay. This improves over Peterfreund's result, which needs quintic time preprocessing to achieve this delay bound. We then study how we can reduce the preprocessing time while keeping the same delay bound, by making additional assumptions on the grammar. Specifically, we present a class of grammars which only have one derivation shape for all outputs, for which we can enumerate with quadratic time preprocessing. We also give classes that generalize regular spanners for which linear time preprocessing suffices.
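To make the enumeration problem concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper: it materializes every output in a CYK-style table, so it offers neither the cubic preprocessing bound nor output-linear delay. The grammar encoding (binary rules and terminal rules with an optional label), the 0-based positions, and the names `RULES` and `enumerate_annotations` are all assumptions made for this illustration.

```python
from functools import lru_cache

# Hypothetical toy grammar: mark exactly one 'a' of a string over {a, b}
# with the label "x". A right-hand side is either
#   ("T", char, label_or_None)  a terminal, optionally annotated, or
#   ("N", B, C)                 two nonterminals in sequence.
RULES = {
    "S": [("T", "a", "x"), ("N", "P", "M"), ("N", "A", "P")],
    "M": [("T", "a", "x"), ("N", "A", "P")],   # marked 'a', optional suffix
    "A": [("T", "a", "x")],                    # the marked 'a' itself
    "C": [("T", "a", None), ("T", "b", None)], # one unmarked character
    "P": [("T", "a", None), ("T", "b", None), ("N", "C", "P")],  # unmarked run
}


def enumerate_annotations(rules, start, s):
    """Return every annotation of s derivable from `start`.

    An annotation is represented as a frozenset of (label, position) pairs.
    """

    @lru_cache(maxsize=None)
    def outputs(nt, i, j):
        # All annotations of the substring s[i:j] derivable from nt.
        found = set()
        for rhs in rules[nt]:
            if rhs[0] == "T":
                _, char, label = rhs
                if j - i == 1 and s[i] == char:
                    if label is None:
                        found.add(frozenset())
                    else:
                        found.add(frozenset({(label, i)}))
            else:
                _, left, right = rhs
                # Try every split point with two nonempty halves.
                for k in range(i + 1, j):
                    for a in outputs(left, i, k):
                        for b in outputs(right, k, j):
                            found.add(a | b)
        return frozenset(found)

    return outputs(start, 0, len(s))


if __name__ == "__main__":
    for annotation in enumerate_annotations(RULES, "S", "aba"):
        print(sorted(annotation))
```

On the input `aba` this yields two annotations, one marking position 0 and one marking position 2. Because the sketch stores the full set of outputs for every nonterminal and substring, its cost can grow with the number of outputs. Avoiding that blow-up during preprocessing is the kind of cost the bounds in the abstract address.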