Paper Title
Read Beyond the Lines: Understanding the Implied Textual Meaning via a Skim and Intensive Reading Model
Paper Authors
Paper Abstract
The nonliteral interpretation of a text is hard for machine models to understand due to its high context sensitivity and heavy use of figurative language. In this study, inspired by human reading comprehension, we propose a novel, simple, and effective deep neural framework, called the Skim and Intensive Reading Model (SIRM), for figuring out implied textual meaning. The proposed SIRM consists of two main components, namely a skim reading component and an intensive reading component. The skim reading component, a combination of several convolutional neural networks, quickly extracts n-gram features as skim (entire-text) information. The intensive reading component enables a hierarchical investigation of both local (sentence) and global (paragraph) representations, encapsulating the current embedding and the contextual information via a dense connection. More specifically, the contextual information includes the near-neighbor information and the skim information mentioned above. Finally, besides the normal training loss function, we employ an adversarial loss function as a penalty over the skim reading component to eliminate noisy information arising from special figurative words in the training data. To verify the effectiveness, robustness, and efficiency of the proposed architecture, we conduct extensive comparative experiments on several sarcasm benchmarks and an industrial spam dataset containing metaphors. Experimental results indicate that (1) the proposed model, which benefits from context modeling and consideration of figurative language, outperforms existing state-of-the-art solutions at a comparable parameter scale and training speed; (2) the SIRM yields superior robustness in terms of parameter-size sensitivity; and (3) compared with ablation and addition variants of the SIRM, the final framework is sufficiently efficient.
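
To make the skim reading component concrete, the sketch below shows parallel CNNs over different n-gram window sizes whose max-pooled outputs are concatenated into a single entire-text "skim" vector, as the abstract describes. This is a minimal illustrative PyTorch sketch based only on the abstract, not the authors' released implementation; the class name, dimensions, and kernel sizes are hypothetical assumptions.

    # Illustrative sketch of a skim reading component: several Conv1d
    # layers with different kernel sizes extract n-gram features, which
    # are max-pooled over time and concatenated into one skim vector.
    # All names and sizes are assumptions, not the paper's released code.
    import torch
    import torch.nn as nn


    class SkimReader(nn.Module):
        def __init__(self, embed_dim=300, num_filters=100, kernel_sizes=(2, 3, 4)):
            super().__init__()
            # One convolution per n-gram size; each filter spans k adjacent tokens.
            self.convs = nn.ModuleList(
                [nn.Conv1d(embed_dim, num_filters, kernel_size=k, padding=k - 1)
                 for k in kernel_sizes]
            )

        def forward(self, x):
            # x: (batch, seq_len, embed_dim) token embeddings.
            x = x.transpose(1, 2)  # Conv1d expects (batch, channels, seq_len).
            # Max-pool each n-gram feature map over time, then concatenate.
            feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
            return torch.cat(feats, dim=1)  # (batch, num_filters * len(kernel_sizes))


    skim = SkimReader()
    tokens = torch.randn(8, 50, 300)  # a batch of 8 texts, 50 tokens each
    skim_vector = skim(tokens)        # (8, 300) entire-text skim information

Per the abstract, this skim vector would then serve two roles: it is fed to the intensive reading component as part of the contextual information, and the adversarial loss is applied over it as a penalty to suppress noise arising from special figurative words.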