Paper Title
PatentTransformer-2: Controlling Patent Text Generation by Structural Metadata
Paper Authors
Paper Abstract
PatentTransformer is our codename for patent text generation based on Transformer models. Our goal is "Augmented Inventing." In this second version, we leverage more of the structural metadata in patents. The structural metadata includes the patent title, abstract, and dependent claims, in addition to the independent claims covered previously. The metadata controls what kind of patent text the model generates. We also leverage the relations between metadata fields to build a text-to-text generation flow, for example, from a few words to a title, from the title to an abstract, from the abstract to an independent claim, and from the independent claim to multiple dependent claims. Because these relations are trained bidirectionally, the text flow can also run backward. We release our GPT-2 models, trained from scratch, together with our inference code, so that readers can verify the results and generate patent text on their own. We measure generation quality with both ROUGE and the Google Universal Sentence Encoder.
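The text-to-text flow described in the abstract can be sketched as a chain of conditioned generation steps. The sketch below is illustrative only: the stage names and the `<|...|>` control-tag format are assumptions for demonstration, not the actual special tokens used by the released PatentTransformer-2 models.

```python
# Illustrative sketch of a metadata-controlled generation flow.
# Stage order is forward; because the relations are trained
# bidirectionally, the same flow can be run in reverse
# (e.g. independent claim -> abstract -> title).
STAGES = ["span", "title", "abstract", "independent-claim", "dependent-claim"]

def build_prompt(source_type: str, target_type: str, text: str) -> str:
    """Build a conditioning prompt that tells the model which kind of
    patent text to generate next (hypothetical tag format)."""
    if source_type not in STAGES or target_type not in STAGES:
        raise ValueError("unknown metadata type")
    return f"<|{source_type}|> {text} <|{target_type}|>"

def flow(start_type: str, end_type: str) -> list[tuple[str, str]]:
    """Return the chain of (source, target) generation steps between
    two metadata types, in either direction."""
    i, j = STAGES.index(start_type), STAGES.index(end_type)
    step = 1 if j >= i else -1
    return [(STAGES[k], STAGES[k + step]) for k in range(i, j, step)]
```

For instance, `flow("span", "independent-claim")` yields the forward chain `[("span", "title"), ("title", "abstract"), ("abstract", "independent-claim")]`, and swapping the arguments yields the backward chain; at each step, `build_prompt` would feed the previous stage's output to the model.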