Paper title
Natural Language Descriptions of Deep Visual Features
Paper authors
Abstract
Some neurons in deep networks specialize in recognizing highly specific perceptual, structural, or semantic features of inputs. In computer vision, techniques exist for identifying neurons that respond to individual concept categories like colors, textures, and object classes. But these techniques are limited in scope, labeling only a small subset of neurons and behaviors in any network. Is a richer characterization of neuron-level computation possible? We introduce a procedure (called MILAN, for mutual-information-guided linguistic annotation of neurons) that automatically labels neurons with open-ended, compositional, natural language descriptions. Given a neuron, MILAN generates a description by searching for a natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active. MILAN produces fine-grained descriptions that capture categorical, relational, and logical structure in learned features. These descriptions obtain high agreement with human-generated feature descriptions across a diverse set of model architectures and tasks, and can aid in understanding and controlling learned models. We highlight three applications of natural language neuron descriptions. First, we use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models. Second, we use MILAN for auditing, surfacing neurons sensitive to human faces in datasets designed to obscure them. Finally, we use MILAN for editing, improving robustness in an image classifier by deleting neurons sensitive to text features spuriously correlated with class labels.
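The selection criterion described in the abstract can be illustrated with a toy sketch. The pointwise mutual information between a description d and a neuron's active image regions E is log p(d | E) - log p(d), so maximizing it favors descriptions that are specific to the neuron over descriptions that are merely common. Everything concrete below is an assumption made for illustration: the candidate strings, the probability values, and the function names `pmi` and `pick_description` are hypothetical, and the search runs over a small fixed list rather than over open-ended natural language strings scored by learned models.

```python
import math


def pmi(p_desc_given_regions: float, p_desc: float) -> float:
    """Pointwise mutual information: log p(d | E) - log p(d)."""
    return math.log(p_desc_given_regions) - math.log(p_desc)


def pick_description(candidates: dict[str, tuple[float, float]]) -> str:
    """Return the candidate description with the highest PMI.

    `candidates` maps a description d to the pair (p(d | E), p(d)).
    """
    return max(candidates, key=lambda d: pmi(*candidates[d]))


# Made-up probabilities for one neuron's top-activating image regions.
candidates = {
    "objects": (0.30, 0.20),      # likely given the regions, but generic
    "dog faces": (0.10, 0.01),    # less likely overall, far more specific
    "green grass": (0.02, 0.02),  # no more likely given the regions than a priori
}

best = pick_description(candidates)
print(best)  # prints "dog faces"
```

In this toy setting "objects" has the highest conditional probability, yet "dog faces" wins because it is ten times more likely given the regions than it is a priori. This is the kind of preference for specific over generic descriptions that a mutual-information criterion encodes.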