Paper Title
ArcText: A Unified Text Approach to Describing Convolutional Neural Network Architectures
Paper Authors
Paper Abstract
The superiority of Convolutional Neural Networks (CNNs) largely relies on their architectures, which are often manually crafted with extensive human expertise. Unfortunately, such domain knowledge is not necessarily held by every interested user. Data mining on existing CNNs can discover useful patterns and fundamental sub-components from their architectures, providing researchers who have no expertise in CNNs with strong prior knowledge for designing proper CNN architectures. Although various state-of-the-art data mining algorithms are at hand, little work has been done on such mining. One of the main reasons is the gap between CNN architectures and data mining algorithms. Specifically, current CNN architecture descriptions cannot be exactly vectorized as input to data mining algorithms. In this paper, we propose a unified approach, named ArcText, for describing CNN architectures based on text. In particular, four different units and an ordering method have been carefully designed in ArcText to uniquely describe the same architecture with sufficient information. In addition, the resulting description can be exactly converted back to the corresponding CNN architecture. ArcText bridges the gap between CNN architectures and data mining researchers, and has the potential to be used in a wider range of scenarios.
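The two properties the abstract emphasizes, a unique text description and an exact conversion back to the architecture, can be illustrated with a toy round trip. The sketch below is not ArcText: the abstract does not specify its four units or its ordering method, so the `Layer` type, the `make_layer`, `describe`, and `parse` helpers, and the `kind(key=value,...)` text format are all hypothetical, and the example only handles a linear chain of layers with integer parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical layer record; not one of ArcText's actual units."""
    kind: str      # e.g. "conv", "pool", "fc"
    params: tuple  # (name, value) pairs, sorted by name, integer values


def make_layer(kind, **params):
    # Sorting the parameters gives every layer a single canonical form.
    return Layer(kind, tuple(sorted(params.items())))


def describe(layers):
    """Serialize a linear chain of layers into one canonical text string."""
    parts = []
    for layer in layers:
        fields = ",".join(f"{name}={value}" for name, value in layer.params)
        parts.append(f"{layer.kind}({fields})")
    return ";".join(parts)


def parse(text):
    """Convert a description produced by describe() back into layers."""
    if not text:
        return []
    layers = []
    for part in text.split(";"):
        kind, _, rest = part.partition("(")
        body = rest.rstrip(")")
        pairs = []
        for field in body.split(","):
            if field:
                name, _, value = field.partition("=")
                pairs.append((name, int(value)))
        layers.append(Layer(kind, tuple(sorted(pairs))))
    return layers


net = [
    make_layer("conv", out=64, kernel=3, stride=1),
    make_layer("pool", kernel=2, stride=2),
    make_layer("fc", out=10),
]
text = describe(net)
restored = parse(text)
```

Because parameters are sorted before serialization, the same architecture yields the same string regardless of the order in which its parameters were written, and `parse(describe(net))` returns the original list. Architectures with branches or skip connections would need an explicit ordering rule over the graph, which this sketch does not attempt.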