Paper Title
Unsupervised Industrial Anomaly Detection via Pattern Generative and Contrastive Networks
Paper Authors
Paper Abstract
It is hard to collect enough flaw images to train deep learning networks in industrial production. Therefore, existing industrial anomaly detection methods prefer CNN-based unsupervised detection and localization networks for this task. However, these methods often fail when variations appear in new signals, since traditional end-to-end networks struggle to fit nonlinear models in high-dimensional space. Moreover, they essentially build a memory bank by clustering the features of normal images, which makes them not robust to texture changes. To this end, we propose a Vision Transformer based (ViT-based) unsupervised anomaly detection network. It utilizes hierarchical task learning and human experience to enhance its interpretability. Our network consists of a pattern generation network and a comparison network. The pattern generation network uses two ViT-based encoder modules to extract the features of two consecutive image patches, then uses a ViT-based decoder module to learn the human-designed style of these features and predict the third image patch. After this, we use a Siamese-based network to compute the similarity between the generated image patch and the original image patch. Finally, we refine the anomaly localization with a bi-directional inference strategy. Comparison experiments on the public MVTec dataset show that our method achieves 99.8% AUC, surpassing previous state-of-the-art methods. In addition, we give qualitative illustrations on our own leather and cloth datasets. The accurate segmentation results strongly demonstrate the accuracy of our method in anomaly detection.
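The generate-then-compare idea in the abstract can be illustrated with a minimal toy sketch: a stand-in "decoder" predicts a third patch feature from two consecutive patch features, and a Siamese-style similarity between the generated and real patch yields an anomaly score. The linear extrapolation, cosine similarity, and all function names here are hypothetical simplifications for illustration, not the paper's actual ViT modules.

```python
import math

def cosine_similarity(a, b):
    """Siamese-style similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def predict_next_patch(feat_prev, feat_curr):
    # Hypothetical stand-in for the ViT-based decoder: linearly
    # extrapolates two consecutive patch features to "generate"
    # the expected third patch feature.
    return [2 * c - p for p, c in zip(feat_prev, feat_curr)]

def anomaly_scores(patch_feats):
    # Slide over consecutive patch triples; compare the generated
    # third patch against the real one. A low similarity (high
    # score) flags the patch as a candidate anomaly.
    scores = []
    for i in range(len(patch_feats) - 2):
        generated = predict_next_patch(patch_feats[i], patch_feats[i + 1])
        sim = cosine_similarity(generated, patch_feats[i + 2])
        scores.append(1.0 - sim)
    return scores
```

In this toy setup, patches that continue the learned pattern score near zero, while a patch that breaks the pattern receives a noticeably higher score; the paper's bi-directional inference strategy would correspond to also scanning the patch sequence in reverse and fusing the two score maps.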