Paper Title
Functional Rule Extraction Method for Artificial Neural Networks
Paper Authors
Paper Abstract
In this paper I propose a method, based on comprehensive functions, for directed and undirected rule extraction from the operations of artificial neural networks. First, I define comprehensive functions and then construct a comprehensive multilayer network (denoted $N$). Each activation function of $N$ is parametrized by a comprehensive function. After constructing $N$, I extract rules from the network by observing that the network output depends on the probabilities of composite functions that are themselves comprehensive functions. This functional rule extraction method applies to both the perceptron and the multilayer neural network. For any $N$ model trained to predict some outcome given some event, the model's behaviour can be expressed, using the functional rule extraction method, as a formal or informal rule that the network obeys to predict that outcome. As an example, figure 1 contains a comprehensive physics function that is the parameter of one of the network's hidden activation functions. Using the functional rule extraction method, I deduce that the prediction of the comprehensive multilayer network depends on the probability of that physics function and on the probabilities of the other composite comprehensive functions in $N$. Additionally, the functional rule extraction method can help, in applied settings, to generate equations of learned phenomena. This can be achieved by first training an $N$ model to predict the outcome of a phenomenon, then extracting the rules and assuming that the probability values of the network's comprehensive functions are constants. Finally, to simplify the generated equation, comprehensive functions with probability $p = 0$ can be omitted.
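The final simplification step can be illustrated with a minimal sketch. The abstract does not state the exact form of an extracted rule, so the code below assumes, purely for illustration, that a rule is a list of (function name, probability) terms combined as a probability-weighted sum, with the probabilities held constant after training. The names `simplify_rule`, `format_rule`, `f_physics`, `f_a`, and `f_b`, and all numeric values, are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: the rule format and the weighted-sum form are assumptions,
# not the paper's definition. Probabilities are treated as constants after training.

def simplify_rule(terms, tol=0.0):
    """Drop terms whose probability is zero (within tol); keep the rest in order."""
    return [(name, p) for name, p in terms if abs(p) > tol]


def format_rule(terms):
    """Render the remaining terms as a readable probability-weighted expression."""
    return " + ".join(f"{p:g}*{name}(x)" for name, p in terms)


# Illustrative values only; they do not come from the paper.
extracted = [("f_physics", 0.7), ("f_a", 0.0), ("f_b", 0.3)]
simplified = simplify_rule(extracted)
print(format_rule(simplified))
```

Here the term for `f_a` is omitted because its probability is 0, which leaves a shorter expression over the remaining comprehensive functions. A nonzero `tol` could be used to drop near-zero terms as well, although the abstract only mentions the exact case $p = 0$.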