Paper Title
Cobol2Vec: Learning Representations of Cobol code
Paper Authors
Abstract
There has been steadily growing interest in the development of novel methods that learn representations of input data and subsequently use them for several downstream tasks. The field of natural language processing has seen significant improvements across different tasks by incorporating pre-trained embeddings into its pipelines. Recently, these methods have been applied to programming languages with a view to improving developer productivity. In this paper, we present an unsupervised learning approach to encode old mainframe languages into a fixed-dimensional vector space. We use COBOL as our motivating example, create a corpus, and demonstrate the efficacy of our approach on a code-retrieval task over that corpus.