CN116862894A - An industrial defect detection method based on image restoration - Google Patents
An industrial defect detection method based on image restoration
- Publication number
- CN116862894A (application number CN202310921316.3A)
- Authority
- CN
- China
- Prior art keywords
- feature
- image
- feature map
- abnormal
- defect
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Quality & Reliability (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an industrial defect detection method based on image restoration, belonging to the technical field of surface defect detection for industrial products. The method uses an FFM-SA module to effectively fuse the features output by the encoder with the multi-scale features from which anomalies have been filtered by a block pyramid memory module, enhancing the detail information of the restored image. In addition, the proposed segmentation sub-network extracts multi-scale features from the abnormal image and the restored image and concatenates the features of corresponding scales, providing more effective information for localizing abnormal regions. Finally, experiments on representative anomaly detection datasets demonstrate that the proposed method achieves better detection performance than other anomaly detection methods.
Description
Technical Field
The invention relates to an industrial defect detection method based on image restoration, and belongs to the technical field of surface defect detection for industrial products.
Background
In actual production, industrial products inevitably have surface defects caused by various factors, and these defects negatively affect product quality and performance. Surface defect detection is therefore crucial for ensuring product quality and enhancing product competitiveness.
Compared with general object detection tasks, surface defect detection faces additional difficulties, such as few defect samples, low defect visibility, irregular shapes, and unknown defect types. In recent years, various surface defect detection algorithms based on unsupervised learning have been proposed; they fall mainly into feature-based methods and reconstruction-based methods.
Feature-based methods detect anomalies by projecting the input image into a discriminative feature space through a pre-trained model. Such methods usually perform well but offer poor interpretability and adjustability.
Reconstruction-based methods rest on the assumption that normal samples can be reconstructed from the latent feature space better than abnormal samples. After being trained on normal images, the reconstruction network has fully learned the characteristics of normal images and can therefore reconstruct them well at inference time; for abnormal images, since no abnormal features were learned during training, the network reconstructs anomalies poorly. Anomaly detection is then performed by computing the reconstruction error between the input image and the reconstructed image.
Reconstruction-based methods make the difference between the original image and the reconstructed image directly observable and are thus easier to interpret. In practice, however, owing to the generalization ability of convolutional neural networks, reconstruction-based methods can sometimes reconstruct anomalies well, making anomalies hard to detect from the reconstruction error and lowering defect detection accuracy.
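The reconstruction-error principle described above can be sketched as follows; this is an illustrative per-pixel squared-error map, not the specific scoring function of the patent:

```python
import numpy as np

def reconstruction_error_map(image, reconstruction):
    """Per-pixel squared reconstruction error; high values flag anomalies."""
    return (image.astype(np.float64) - reconstruction.astype(np.float64)) ** 2

img = np.array([[0.1, 0.9], [0.5, 0.5]])
rec = np.array([[0.1, 0.2], [0.5, 0.5]])  # one pixel is poorly reconstructed
err = reconstruction_error_map(img, rec)
assert err.argmax() == 1  # the poorly reconstructed pixel stands out
```

As the Background notes, this simple criterion fails when the network generalizes well enough to also reconstruct anomalies, which motivates the restoration-based design below.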
Summary of the Invention
To improve the accuracy of reconstruction-based defect detection, the invention provides an industrial defect detection method based on image restoration, comprising:
Step 1: acquire the original image of the workpiece under inspection and feed it into an encoder to obtain a set of multi-scale feature maps;
Step 2: use block pyramid memory modules to suppress abnormal features in the deep feature maps, obtaining a set of anomaly-filtered feature maps; the block pyramid memory modules are trained on normal images;
Step 3: fuse the multi-scale feature maps from Step 1 with the anomaly-filtered feature maps from Step 2 through a feature fusion module based on spatial attention (FFM-SA), obtaining feature maps with enhanced detail information;
Step 4: feed the detail-enhanced feature maps from Step 3 into a decoder to restore the image;
Step 5: feed the original image and the restored image from Step 4 into a segmentation sub-network to obtain the defect segmentation result, from which defects in the workpiece under inspection are determined.
Optionally, the computation of the block pyramid memory module in Step 2 comprises:
For the original image x, the encoder E outputs a set of multi-scale feature maps {Z1, Z2, ..., Z5}; the deep feature maps {Z3, Z4, Z5} are fed into their corresponding block pyramid memory modules.
The third-layer feature map Z3 ∈ R^(H×W×C) is processed as follows:
(1) In the first branch, the feature map Z3 is evenly divided into r_H1 × r_W1 × r_C1 blocks, where r_H1, r_W1 and r_C1 are the division rates along the height, width and channel dimensions; each block has height H/r_H1, width W/r_W1 and channel length C/r_C1, where H is the height, W the width and C the channel dimension.
Each block is flattened into a vector q ∈ R^(P1) used as a query, and attention weights w are computed from the similarity between the query and the memory items in the memory M1 ∈ R^(N1×P1), where M1 contains N1 real-valued vectors of dimension P1 = h1 × w1 × c1.
The memory items of M1 are then linearly combined using the weights w, and the resulting feature vectors are reshaped into a feature map Ẑ3^1 with the same dimensions as Z3.
(2) In the second branch, the division rates are r_H2 = 2r_H1, r_W2 = 2r_W1, r_C2 = 2r_C1, and the memory is denoted M2 ∈ R^(N2×P2); the remaining computation is the same as in the first branch, and a feature map Ẑ3^2 is obtained after query and aggregation.
(3) The enhanced feature maps Ẑ3^1 and Ẑ3^2 are fused to obtain the final feature map Ẑ3.
The fourth-layer feature map Z4 and the fifth-layer feature map Z5 are processed in the same way as the third-layer feature map Z3, yielding a set of transformed feature maps {Ẑ3, Ẑ4, Ẑ5}.
Optionally, the computation of the FFM-SA module comprises:
The repaired feature maps Ẑ3^1 and Ẑ3^2 output by the block pyramid memory module, together with Z3 (denoted F_abnormal), are fed into the FFM-SA module.
First, Ẑ3^1, Ẑ3^2 and F_abnormal are fused by element-wise summation:
F_fuse = Ẑ3^1 + Ẑ3^2 + F_abnormal
Next, a K×K convolution kernel is applied to F_fuse to obtain a feature map with channel dimension 3, i.e. the attention feature maps A1, A2 and A3.
The attention feature maps are normalized along the channel dimension with the softmax function:
Âk^(i,j) = exp(Ak^(i,j)) / Σ_(k=1..3) exp(Ak^(i,j)), k = 1, 2, 3
where Ak^(i,j) and Âk^(i,j) denote the element in row i, column j of Ak and Âk respectively.
Finally, the three normalized attention feature maps are used to select information from Ẑ3^1, Ẑ3^2 and F_abnormal to obtain the final output feature map F_out, computed as:
F_out = Â1 ⊙ Ẑ3^1 + Â2 ⊙ Ẑ3^2 + Â3 ⊙ F_abnormal
where ⊙ denotes element-wise multiplication; this yields the enhanced feature map corresponding to the third-layer feature map Z3.
The fourth-layer feature map Z4 and the fifth-layer feature map Z5 are processed in the same way to obtain their corresponding enhanced feature maps.
Optionally, Step 5 comprises:
Step 51: use ResNet34 as the feature extraction network of the segmentation module; feed the restored image and the original image into this network separately, extract multi-scale feature maps from each, and concatenate the two sets of feature maps of corresponding scales along the channel dimension;
Step 52: perform multi-scale feature fusion on the concatenated feature maps;
the CA (coordinate attention) mechanism captures cross-channel, direction-aware and position-aware information, and 1×1 convolutions perform feature fusion and channel dimensionality reduction;
the resolutions of feature maps at different scales are aligned by upsampling, the channel dimensions are then aligned by convolution, and element-wise addition is finally performed to achieve multi-scale feature fusion;
after multi-scale feature fusion, a 3×3 convolution is applied to the feature map of each layer;
the feature maps of all layers except the first are upsampled to the resolution of the first-layer feature map, concatenated along the channel dimension, and fed into the segmentation head for defect segmentation to obtain the anomaly segmentation result.
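The cascade of Step 51 can be sketched as follows; the shapes and names are illustrative assumptions, not values from the patent:

```python
import numpy as np

def cascade_features(feats_orig, feats_restored):
    """Concatenate corresponding-scale feature maps along the channel axis.

    feats_orig / feats_restored: lists of arrays shaped (C, H, W), one per
    scale, as would come from a shared feature extractor (ResNet34 here).
    """
    assert len(feats_orig) == len(feats_restored)
    return [np.concatenate([a, b], axis=0)  # channel dimension doubles
            for a, b in zip(feats_orig, feats_restored)]

# Toy two-level pyramids: channels double as resolution halves.
orig = [np.random.rand(64, 32, 32), np.random.rand(128, 16, 16)]
rest = [np.random.rand(64, 32, 32), np.random.rand(128, 16, 16)]
fused = cascade_features(orig, rest)
print([f.shape for f in fused])  # [(128, 32, 32), (256, 16, 16)]
```

Concatenating rather than subtracting the two feature sets lets the subsequent fusion layers learn their own comparison between the original and restored images.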
Optionally, the encoder and decoder of UNet are used as the encoder and decoder of the reconstruction sub-network.
Optionally, abnormal samples are forged based on Perlin noise and Bezier curves during training of the overall network; the specific process comprises:
using the saliency detection method CPD to obtain the salient region of a normal image, generating a defect mask with Perlin noise and Bezier curves, combining the defect mask with an anomaly source image, and pasting the result into the salient region of the normal image to obtain a forged defect sample.
Optionally, the training process is divided into two stages:
the first stage trains the reconstruction network based on the block pyramid memory modules using only normal samples;
the second stage trains the anomaly detection and segmentation network based on image restoration using the forged defect samples, and the weights of the block pyramid memory modules are no longer updated.
Optionally, the reconstruction loss of the first stage is:
L_rec1 = L_2(x, x̂) + L_SSIM(x, x̂)
where L_2 is the L2 loss function and L_SSIM is the SSIM loss function, x is the input image and x̂ the reconstructed image.
Optionally, the loss of the second stage is:
Loss = L_rec2 + L_seg
where L_rec2 is the restoration loss and L_seg is the segmentation loss;
L_seg = Loss_focal(Im, AM)
where Im denotes the ground truth, AM the segmentation result, and Loss_focal the focal loss function.
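The focal loss used for the segmentation loss can be sketched as a minimal per-pixel binary version; the gamma and alpha values are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss Loss_focal on per-pixel anomaly probabilities.

    pred:   predicted anomaly probabilities in (0, 1), any shape
    target: ground-truth mask with values {0, 1}, same shape
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # p_t is the probability assigned to the true class at each pixel
    p_t = np.where(target == 1, pred, 1.0 - pred)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

mask = np.array([[1.0, 0.0], [0.0, 1.0]])
good = np.array([[0.9, 0.1], [0.1, 0.9]])   # confident, correct prediction
bad  = np.array([[0.1, 0.9], [0.9, 0.1]])   # confident, wrong prediction
assert focal_loss(good, mask) < focal_loss(bad, mask)
```

The (1 - p_t)^gamma factor down-weights easy pixels, which suits defect segmentation where anomalous pixels are rare relative to the background.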
The invention also provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement any of the methods described above.
The beneficial effects of the invention are:
The technical features described below overcome the drawback of conventional reconstruction methods, in which abnormal images may be reconstructed well so that anomalies cannot be detected from the reconstruction error, thereby achieving accurate detection of surface defects in industrial products.
1) A forged-anomaly method is proposed to generate defect samples. By improving previous anomaly forgery methods, a saliency-based anomaly forgery method using Perlin noise and Bezier curves (SAFM-PB) is proposed. Defects are generated in the salient foreground region, avoiding the influence of noise in the background region; Perlin noise and Bezier curves generate defect shapes of diverse forms, and the anomaly source image is set to an image with a distribution different from that of the dataset or to a randomly sampled normal image, increasing the diversity of defects and encouraging the model to learn the irregularity of abnormal samples.
2) To enhance the detail information of the restored image, a feature fusion module based on spatial attention (FFM-SA) is proposed, which effectively fuses the features extracted by the encoder with the multi-scale features from which anomalies have been filtered by the block pyramid memory modules, enhancing the detail information of the restored image and improving anomaly detection accuracy.
3) A segmentation sub-network is added to extract multi-scale features from the abnormal image and the restored image and to concatenate the features of corresponding scales, providing more effective information for localizing abnormal regions.
Description of the Drawings
To illustrate the technical solutions in the embodiments of the invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is the overall network structure diagram of the industrial defect detection method based on image restoration according to the invention.
Figure 2 is the structure diagram of the reconstruction network based on the block pyramid memory module.
Figure 3 shows examples of forged strip-shaped and block-shaped defect masks and their corresponding defect samples.
Figure 4 is the network structure diagram of the feature fusion module based on spatial attention (FFM-SA).
Figure 5 is the network structure diagram of the segmentation sub-network.
Detailed Description
To make the purpose, technical solutions and advantages of the invention clearer, the embodiments of the invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1:
This embodiment provides an industrial defect detection method based on image restoration, comprising:
Step 1: acquire the original image of the workpiece under inspection and feed it into an encoder to obtain a set of multi-scale feature maps;
Step 2: use block pyramid memory modules to suppress abnormal features in the deep feature maps, obtaining a set of anomaly-filtered feature maps; the block pyramid memory modules are trained on normal images;
Step 3: fuse the multi-scale feature maps from Step 1 with the anomaly-filtered feature maps from Step 2 through a feature fusion module based on spatial attention (FFM-SA), obtaining feature maps with enhanced detail information;
Step 4: feed the detail-enhanced feature maps from Step 3 into a decoder to restore the image;
Step 5: feed the original image and the restored image from Step 4 into a segmentation sub-network to obtain the defect segmentation result, from which defects in the workpiece under inspection are determined.
Embodiment 2:
This embodiment provides an industrial defect detection method based on image restoration. The overall network structure implementing the method is shown in Figure 1 and comprises a restoration sub-network and a segmentation sub-network. The restoration sub-network consists of an encoder E, a decoder D, a saliency-based anomaly forgery module using Perlin noise and Bezier curves (SAFM-PB), block pyramid memory modules (M), and a feature fusion module based on spatial attention (FFM-SA); the encoder and decoder of UNet serve as the encoder E and decoder D of the restoration sub-network.
Network training comprises two stages. The first stage uses normal samples to train the reconstruction network based on the block pyramid memory modules shown in Figure 2, which is intended to store the general patterns of normal samples and filter out abnormal features for the subsequent restoration task. The second stage trains the network shown in Figure 1, during which the weights of the block pyramid memory modules are no longer updated. The two training stages are described in detail below.
1. Training the reconstruction network based on the block pyramid memory modules
The first stage trains the reconstruction network based on the block pyramid memory modules using only normal samples. The network structure, shown in Figure 2, comprises the encoder E and the decoder D, with skip connections and block pyramid memory modules added between the encoder and the decoder to filter abnormal features.
For an input image x, the encoder E outputs a set of features, denoted {Z1, Z2, ..., Z5}. The deep feature maps {Z3, Z4, Z5} are used as inputs to their corresponding block pyramid memory modules.
Taking the third-layer feature map Z3 ∈ R^(H×W×C) output by the encoder E (H the height, W the width, C the channel dimension) as an example (the lower dashed box in Figure 2), the storage and reading process of the block pyramid memory module is as follows:
(1) In the first branch, the feature map Z3 is evenly divided into r_H1 × r_W1 × r_C1 blocks, where r_H1, r_W1 and r_C1 are the division rates along the height, width and channel dimensions. Each block has height H/r_H1, width W/r_W1 and channel length C/r_C1. Each block is flattened into a vector q ∈ R^(P1) used as a query, and attention weights w are computed from the similarity between the query and the memory items in the memory M1 ∈ R^(N1×P1) (meaning M1 contains N1 real-valued vectors of dimension P1 = h1 × w1 × c1). The memory items of M1 are then linearly combined using the weights w (the right dashed box in Figure 2), and the resulting feature vectors are reshaped into a feature map Ẑ3^1 with the same dimensions as Z3.
(2) The second branch is similar to the first, but with division rates r_H2 = 2r_H1, r_W2 = 2r_W1, r_C2 = 2r_C1 and memory M2 ∈ R^(N2×P2); a feature map Ẑ3^2 is obtained after query and aggregation.
(3) The enhanced feature maps Ẑ3^1 and Ẑ3^2 are fused (channel concatenation followed by 1×1 convolution for dimensionality reduction) to obtain the final feature map Ẑ3.
The same operations are performed for Z4 and Z5, yielding a set of transformed feature maps {Ẑ3, Ẑ4, Ẑ5}, which are then fed into the decoder D to reconstruct the input image x.
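The query-and-aggregate read of a single block can be sketched as follows. The patent does not state the exact similarity function, so a dot-product similarity with softmax normalization (the standard memory-network formulation) is assumed here:

```python
import numpy as np

def memory_read(query, memory):
    """Read from one memory: similarity -> attention weights w, then a
    linear combination of memory items.

    query:  (P,)   flattened block vector
    memory: (N, P) N memory items of dimension P
    """
    sims = memory @ query                 # (N,) dot-product similarities (assumed)
    w = np.exp(sims - sims.max())
    w /= w.sum()                          # attention weights, sum to 1
    return w @ memory                     # (P,) aggregated feature vector

rng = np.random.default_rng(0)
M1 = rng.standard_normal((10, 8))         # N1 = 10 items, P1 = 8
q = rng.standard_normal(8)
z_hat = memory_read(q, M1)
assert z_hat.shape == (8,)
```

Because the output is a combination of stored normal patterns only, an anomalous query block is mapped back toward its nearest normal counterparts, which is what filters abnormal features before decoding.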
2. Training the anomaly detection and segmentation network based on image restoration
The second stage trains the anomaly detection and segmentation network based on image restoration, as shown in Figure 1. An input normal image x is turned into an abnormal image x′ by the SAFM-PB module; feeding x′ into the encoder E yields a set of multi-scale features. The block pyramid memory modules suppress the abnormal feature responses, producing a set of anomaly-filtered feature maps, which are fused with the multi-scale features output by the encoder E through the FFM-SA module to obtain a set of detail-enhanced feature maps; these are fed into the decoder D for image restoration. Finally, the forged abnormal image and the restored image are fed into the segmentation sub-network for defect segmentation.
2.1 Saliency-based anomaly forgery method using Perlin noise and Bezier curves
The invention forges abnormal samples based on Perlin noise and Bezier curves. In previously proposed models, forged anomalies easily appear in the background. To make anomalies appear in the salient foreground region rather than the background, the invention obtains the salient region of a normal image with the saliency detection method CPD, generates a defect mask with Perlin noise and Bezier curves, combines the defect mask with an anomaly source image, and pastes the result into the salient region of the normal image to obtain a forged defect sample, making the generated abnormal samples more conducive to model learning.
The invention uses Perlin noise to generate defect masks, which are combined with anomaly source images and pasted into the salient regions of normal images to obtain defect samples. In addition, strip-shaped defects such as scratches, cracks and strip-shaped occlusions, and block-shaped defects such as color contamination, easily occur during production. The invention forges strip-shaped and block-shaped defect samples with Bezier curves: a strip-shaped defect mask is obtained from a randomly chosen second-, third- or fourth-order Bezier curve. A block-shaped defect mask consists of three Bezier curves, each a random second-, third- or fourth-order Bezier curve, with each curve starting at the end point of the previous one; the three curves are connected into a closed figure, which is then filled to obtain the block-shaped defect mask. Finally, the defect mask is combined with the anomaly source image and pasted into the salient region of the normal image to obtain a forged defect sample. Figure 3(a) shows a forged strip-shaped defect mask and the corresponding defect sample; Figure 3(b) shows a block-shaped defect mask and the corresponding defect sample.
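A strip-shaped mask of the kind described above can be sketched as follows: an arbitrary-order Bezier curve evaluated with De Casteljau's algorithm and rasterized with a fixed stroke width. The mask size and stroke width are illustrative choices, not parameters from the patent:

```python
import numpy as np

def bezier_points(ctrl, n=200):
    """Evaluate an arbitrary-order Bezier curve via De Casteljau's algorithm.

    ctrl: (k, 2) control points (curve order k-1); n: samples along the curve.
    """
    t = np.linspace(0.0, 1.0, n)[:, None, None]            # (n, 1, 1)
    pts = np.broadcast_to(ctrl, (n,) + ctrl.shape).copy()  # (n, k, 2)
    while pts.shape[1] > 1:                                # repeated lerp
        pts = (1 - t) * pts[:, :-1] + t * pts[:, 1:]
    return pts[:, 0]                                       # (n, 2)

def strip_mask(ctrl, size=64, width=2):
    """Rasterize a Bezier curve into a binary strip-defect mask."""
    mask = np.zeros((size, size), dtype=np.uint8)
    for x, y in bezier_points(ctrl):
        r, c = int(round(y)), int(round(x))
        mask[max(r - width, 0):r + width, max(c - width, 0):c + width] = 1
    return mask

ctrl = np.array([[5.0, 5.0], [30.0, 60.0], [58.0, 10.0]])  # quadratic curve
m = strip_mask(ctrl)
assert m.sum() > 0 and m.shape == (64, 64)
```

A block-shaped mask would chain three such curves end to end and fill the enclosed region, as the text describes.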
2.基于空间的注意力机制特征融合模块2. Space-based attention mechanism feature fusion module
本发明提出了FFM-SA模块的结构,其网络结构如图4所示。本发明以增强编码器E输出的第三层特征图为例详细地描述该模块。The present invention proposes the structure of the FFM-SA module, and its network structure is shown in Figure 4. This invention enhances the third layer feature map output by the encoder E Take an example to describe this module in detail.
The inputs to this module are the two inpainted feature maps output by the block pyramid memory module (denoted here F_rec1 and F_rec2), obtained by storing and reading memory blocks of sizes h1×w1×c1 and h2×w2×c2 respectively, together with the encoder feature map (i.e., F_abnormal).
Feature fusion is first realized by element-wise summation of the two inpainted feature maps and F_abnormal:

F_fuse = F_rec1 + F_rec2 + F_abnormal
Next, a K×K convolution is applied to F_fuse, where K should be larger than h2 or w2 (for example, if the two memory-block sizes at the third layer are 2×2×c1 and 4×4×c2, then K=5), yielding a feature map with channel dimension 3, i.e., three attention maps (denoted here A1, A2, and A3). Compared with the original FFM module, the present invention obtains all three attention maps from a single K×K convolution over F_fuse, and each attention map has a channel dimension of only 1. This improvement greatly reduces the computational cost and complexity of the module, making it considerably more lightweight.
The attention maps are normalized along the channel dimension with the softmax function:

Âk(i,j) = exp(Ak(i,j)) / (exp(A1(i,j)) + exp(A2(i,j)) + exp(A3(i,j))),  k = 1, 2, 3

where Ak(i,j) denotes the element in row i, column j of the k-th attention map, and Âk is its normalized counterpart (notation introduced here for clarity).
Finally, the three normalized attention maps select information from F_rec1, F_rec2, and F_abnormal to produce the final output feature map F_out:

F_out = Â1 ⊙ F_rec1 + Â2 ⊙ F_rec2 + Â3 ⊙ F_abnormal

where ⊙ denotes the element-wise (Hadamard) product.
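The softmax normalization and weighted selection amount to a per-pixel convex combination of the three sources. Below is a minimal NumPy sketch; the attention logits are taken as given here, whereas in the patent they come from the single K×K convolution over F_fuse:

```python
import numpy as np

def ffm_sa_fuse(f_rec1, f_rec2, f_abnormal, attn_logits):
    """Per-pixel selection among three (C, H, W) feature maps using three
    single-channel attention maps stacked as attn_logits of shape (3, H, W)."""
    a = attn_logits - attn_logits.max(axis=0, keepdims=True)  # numerically stable
    w = np.exp(a)
    w /= w.sum(axis=0, keepdims=True)      # softmax across the three maps
    # each (H, W) weight broadcasts over the channel axis of its feature map
    return w[0] * f_rec1 + w[1] * f_rec2 + w[2] * f_abnormal
```

With equal logits all three weights are 1/3 everywhere, so the output is the plain average of the three inputs; strongly peaked logits select a single source per pixel.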
A set of enhanced feature maps is thus obtained and fed into the decoder D to restore the input image x′.
3. Segmentation sub-network
The pixel-wise difference between the input image of the restoration sub-network and the restored image is commonly used for defect detection and segmentation, but a single pixel carries no semantic information, and distinguishing normal from abnormal features requires rich contextual information. Moreover, industrial defects in real production vary widely in scale. Accurate defect detection and segmentation therefore requires both contextual and multi-scale information, for which the present invention proposes a multi-scale feature fusion module; its network structure is shown in Figure 5.
The present invention uses ResNet34 as the feature extraction network of the segmentation module. The restored image and the forged anomalous image are fed into this network separately to extract multi-scale feature maps, yielding four layers of feature maps for the forged anomalous image and four for the restored image. The two sets of feature maps are concatenated layer-wise along the channel dimension, providing more effective information for localizing anomalous regions.
Finally, multi-scale feature fusion is performed on the concatenated feature maps. Cross-channel, direction-aware, and position information is first captured with the CA attention mechanism, helping the model localize and identify targets of interest more precisely. A 1×1 convolution then performs feature fusion and channel dimensionality reduction, followed by multi-scale information fusion: the feature maps of different scales are first aligned in resolution by upsampling, then aligned in channel dimension by convolution, and finally added element-wise. After multi-scale fusion, a 3×3 convolution is applied to each layer's feature map. The second-, third-, and fourth-layer feature maps are upsampled to the resolution of the first-layer feature map, concatenated along the channel dimension, and fed into the segmentation head to obtain the anomaly segmentation result, denoted AM, against which the loss is computed with the pixel-level labels.
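The resolution-alignment and element-wise-sum step can be illustrated as below. Nearest-neighbour upsampling stands in for whatever interpolation the network uses, and channel alignment (done with 1×1 convolutions in the module) is assumed to have been applied already:

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def multiscale_sum(feats):
    """Fuse a list of (C, Hi, Wi) maps, finest resolution first, by
    upsampling every map to the finest resolution and adding element-wise."""
    out = np.zeros(feats[0].shape, dtype=float)
    for f in feats:
        out += upsample_nearest(f, feats[0].shape[1] // f.shape[1])
    return out
```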
4. Loss function
In the first stage, the model is trained with a reconstruction loss that combines the L2 loss and a patch-based SSIM loss.
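As an illustration of combining the two terms, the sketch below uses a single-window SSIM rather than the patch-based SSIM of the patent, and an equal weighting `lam=1.0` that is an assumption rather than a value from the patent:

```python
import numpy as np

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global (single-window) SSIM between two images scaled to [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def rec_loss(x, x_hat, lam=1.0):
    """Reconstruction loss sketch: L2 term plus a (1 - SSIM) term."""
    l2 = float(np.mean((x - x_hat) ** 2))
    return l2 + lam * (1.0 - float(ssim(x, x_hat)))
```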
The second stage includes a restoration loss and a segmentation loss. Normal images and the forged anomalous samples obtained with the SAFM-PB method are fed into the model for training, which faces a severe class imbalance. The Focal Loss is therefore applied as the loss function of the segmentation sub-network to alleviate this imbalance: it down-weights the dominant class and focuses training on hard samples, improving their segmentation accuracy and thereby enhancing the robustness of the model. The segmentation loss is:
L_seg = Loss_focal(I_m, AM)
where I_m denotes the ground truth and AM the segmentation result.
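A minimal per-pixel binary focal loss consistent with the description above; the `alpha` and `gamma` values are the common defaults, not values taken from the patent:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss, averaged over pixels: easy, well-classified pixels
    are down-weighted by (1 - p_t)**gamma so training focuses on hard ones.
    p: predicted anomaly probabilities; y: ground-truth labels in {0, 1}."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)             # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```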
The restoration loss of the second stage is denoted L_rec2.
The total loss function of the second stage is then:

Loss = L_rec2 + L_seg
5. Anomaly discrimination
The present invention uses the segmentation anomaly map to directly evaluate an image-level anomaly score, i.e., to judge whether an anomaly is present in the image. AM is first smoothed by convolution with a mean filter to aggregate local anomaly information, and the maximum value of the smoothed anomaly map is then taken as the final image-level anomaly score:
score = max(AM * f_sf×sf)
where f_sf×sf is a mean filter of size sf×sf, * denotes the convolution operation, and AM denotes the segmentation result.
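The scoring rule can be sketched directly. The filter size `sf` below is illustrative (the patent leaves it as a parameter), and a naive sliding-window mean with edge padding stands in for the mean-filter convolution:

```python
import numpy as np

def image_score(am, sf=3):
    """Image-level anomaly score: smooth the segmentation map AM with an
    sf x sf mean filter (edge padding), then take the maximum value."""
    pad = sf // 2
    padded = np.pad(am, pad, mode="edge")
    h, w = am.shape
    smoothed = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            smoothed[i, j] = padded[i:i + sf, j:j + sf].mean()
    return float(smoothed.max())
```

Smoothing before taking the maximum rewards spatially coherent anomalous regions over isolated noisy pixels, which is the stated purpose of the mean-filter step.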
After the two training stages described above, the trained network can be used for industrial defect detection. The detection process is as follows:
Step 1: Acquire the original image of the workpiece under test and feed it into the encoder to obtain a set of multi-scale feature maps.
Step 2: Use the block pyramid memory module, which is trained on normal images, to suppress anomalous features in the deep feature maps, obtaining a set of anomaly-filtered feature maps.
Step 3: Use the feature fusion module FFM-SA, based on the spatial attention mechanism, to fuse the multi-scale feature maps from Step 1 with the anomaly-filtered feature maps from Step 2, obtaining feature maps with enhanced detail information.
Step 4: Feed the detail-enhanced feature maps from Step 3 into the decoder to restore the image.
Step 5: Feed the original image and the restored image from Step 4 into the segmentation sub-network to obtain the defect segmentation result, from which the defects of the workpiece under test are determined.

To further illustrate the beneficial effects of the present invention, the following experiments and analyses were conducted:
1. Dataset and experimental settings
The present invention uses the industrial anomaly detection dataset MVTec AD, which was proposed specifically for unsupervised industrial visual anomaly detection and is currently an important benchmark in this field. The dataset comprises 3,629 normal images for training and 1,725 images (both normal and anomalous) for testing, covering 15 categories: 10 object classes and 5 texture classes, each with its own training and test sets. Image resolutions range from 700 to 1024 pixels. The challenge of this dataset is that some anomalous images differ only slightly from normal images, and the anomalous areas vary greatly in size. In the experiments, images were resized to 256×256.
The method of the present invention is implemented with PyTorch 1.8.1 and CUDA 11.1, and all experiments were run on an RTX 3090 GPU. The Adam optimizer is used with an initial learning rate lr=0.0001, which is multiplied by gamma=0.2 after 0.8×num_Epoch and 0.9×num_Epoch epochs; the first stage is trained for 600 epochs and the second for 400, with batch_size=4.
In the experiments, one memory-block size setting is used for the bottle, cable, metal_nut and transistor categories, and a different setting for the remaining categories.
2. Performance evaluation metrics
The experiments use AUROC (Area Under the ROC Curve) to evaluate model performance; the higher the AUROC score, the better the model.
The anomaly detection network assigns each test sample an anomaly score; a sample is judged anomalous when its score exceeds a given threshold. Normal samples are treated as negatives and anomalous samples as positives: a correctly detected anomalous sample is a true positive (TP), a correctly detected normal sample is a true negative (TN), a normal sample mistakenly detected as anomalous is a false positive (FP), and an anomalous sample mistakenly detected as normal is a false negative (FN). Different thresholds yield a series of true positive rates (TPR) and false positive rates (FPR), computed as:

TPR = TP / (TP + FN),  FPR = FP / (FP + TN)
The test samples are sorted by predicted anomaly score and the threshold is swept over them one sample at a time; FPR and TPR are computed at each threshold, the ROC curve is drawn with FPR on the horizontal axis and TPR on the vertical axis, and the method of the present invention is evaluated by the area under this curve, AUROC.
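The threshold-sweeping procedure can be condensed into a small AUROC routine. Ties between scores are ignored here for simplicity, and both classes are assumed present:

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve: sort by descending anomaly score, sweep the
    threshold one sample at a time, and integrate TPR over FPR (trapezoidal)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores)            # most anomalous first
    labels = labels[order]
    p = labels.sum()
    n = len(labels) - p
    tpr = np.concatenate(([0.0], np.cumsum(labels) / p))
    fpr = np.concatenate(([0.0], np.cumsum(1 - labels) / n))
    # trapezoidal integration of the (FPR, TPR) polyline
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```

Perfect separation of positives from negatives gives 1.0, a reversed ranking gives 0.0, and chance-level ranking gives about 0.5.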
3. Experimental comparison results and analysis
The method of the present invention is compared with currently classic anomaly detection algorithms. Table 1 shows the image-level anomaly detection results (the first 5 rows are texture classes and the next 10 rows are object classes; Avg.tex, Avg.obj, and Avg.all denote the average AUROC scores of the texture classes, the object classes, and all classes, respectively).
Table 1. Comparison of AUROC scores between the present invention and other existing methods on the image-level anomaly detection task
As can be seen from Table 1, the defect detection performance of the proposed method surpasses many classic anomaly detection algorithms, exceeding the highest average AUROC score among them by 0.8 percentage points; it achieves the highest AUROC score in 8 of the 15 categories and respectable AUROC scores in the remaining ones. On the object classes, the proposed method outperforms all other methods. On the texture classes, its performance is slightly below CutPaste but above the other methods. Compared with CutPaste, the advantage of the proposed method is that its average AUROC on both the object classes and all classes is significantly higher.
The method of the present invention is also compared with several mainstream anomaly detection algorithms on the pixel-level anomaly detection task, as shown in Table 2:
Table 2. Comparison of AUROC scores between the method of the present invention and existing methods on the pixel-level anomaly detection task
As can be seen from Table 2, the proposed method improves localization accuracy over the other methods and delivers better anomaly detection performance. It obtains the highest pixel-level average AUROC and mAP scores, and the highest AUROC score in 7 categories. Moreover, the proposed method achieves the best detection performance on both the texture classes and the object classes.
In summary, to address the problems of blurry reconstruction and reconstructed anomalies in reconstruction-based methods, the present invention proposes an industrial anomaly detection and segmentation method based on image restoration. Anomalous samples are forged with a saliency-aware anomaly forgery method based on Perlin noise and Bezier curves, increasing the diversity of defect samples and encouraging the model to learn the irregularity of anomalies. The FFM-SA module effectively fuses the features output by the encoder with the anomaly-filtered multi-scale features from the block pyramid memory module, enhancing the detail information of the reconstructed image. The proposed segmentation sub-network extracts multi-scale features from the anomalous and restored images and cascades the features of corresponding scales, providing more effective information for localizing anomalous regions. Finally, experiments on a representative anomaly detection dataset demonstrate that the proposed method achieves better detection results than other anomaly detection algorithms.
Some steps in the embodiments of the present invention may be implemented in software, and the corresponding software programs may be stored in a readable storage medium such as an optical disc or hard disk.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310921316.3A CN116862894B (en) | 2023-07-26 | 2023-07-26 | An industrial defect detection method based on image restoration |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN116862894A true CN116862894A (en) | 2023-10-10 |
| CN116862894B CN116862894B (en) | 2025-09-23 |
Family
ID=88230449
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310921316.3A Active CN116862894B (en) | 2023-07-26 | 2023-07-26 | An industrial defect detection method based on image restoration |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116862894B (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119130992A (en) * | 2024-09-12 | 2024-12-13 | 广州大学 | Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion |
| CN119599998A (en) * | 2024-11-27 | 2025-03-11 | 太原理工大学 | An industrial parts anomaly detection method based on SCFlow |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114677346A (en) * | 2022-03-21 | 2022-06-28 | 西安电子科技大学广州研究院 | End-to-end semi-supervised image surface defect detection method based on memory information |
| CN115439442A (en) * | 2022-08-31 | 2022-12-06 | 湖北星盛电气装备研究院有限公司 | Industrial product surface defect detection and positioning method and system based on commonality and difference |
| CN115661097A (en) * | 2022-11-02 | 2023-01-31 | 北京大学深圳研究生院 | Object surface defect detection method and system |
| CN115661543A (en) * | 2022-11-08 | 2023-01-31 | 浙江科技学院 | A multi-scale defect detection method for industrial parts based on generative adversarial networks |
| US20230154177A1 (en) * | 2020-11-04 | 2023-05-18 | Chengdu Koala Uran Technology Co., Ltd. | Autoregression Image Abnormity Detection Method of Enhancing Latent Space Based on Memory |
| CN116205876A (en) * | 2023-02-20 | 2023-06-02 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Unsupervised notebook appearance defect detection method based on multi-scale standardized flow |
Family application event: 2023-07-26, CN application CN202310921316.3A filed; granted as CN116862894B (status: Active).
Non-Patent Citations (2)
| Title |
|---|
| Chung H., Park J., Keum J., et al., "Unsupervised Anomaly Detection Using Style Distillation", arXiv preprint, 31 December 2020 (2020-12-31) * |
| Luo Dongliang et al., "A Survey of Deep Learning Methods for Industrial Defect Detection", SCIENTIA SINICA Informationis, 13 June 2022 (2022-06-13), pages 1002-1039 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN116862894B (en) | 2025-09-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN114841972B (en) | Transmission line defect recognition method based on saliency map and semantic embedding feature pyramid | |
| Damacharla et al. | TLU-net: a deep learning approach for automatic steel surface defect detection | |
| CN110084292B (en) | Target detection method based on DenseNet and multi-scale feature fusion | |
| CN113920107A (en) | A method of insulator damage detection based on improved yolov5 algorithm | |
| CN111524135A (en) | Image enhancement-based method and system for detecting defects of small hardware fittings of power transmission line | |
| CN110827213A (en) | Super-resolution image restoration method based on generation type countermeasure network | |
| CN116862894A (en) | An industrial defect detection method based on image restoration | |
| CN113569756B (en) | Abnormal behavior detection and location method, system, terminal equipment and readable storage medium | |
| CN111401358B (en) | Instrument dial correction method based on neural network | |
| CN111445484B (en) | A pixel-level segmentation method for abnormal regions of industrial images based on image-level annotation | |
| CN112419317B (en) | Visual loop detection method based on self-coding network | |
| CN111881743B (en) | Facial feature point positioning method based on semantic segmentation | |
| CN111242144B (en) | Method and device for detecting abnormality of power grid equipment | |
| CN118279272A (en) | Steel plate surface defect detection method based on improvement YOLOv8 | |
| CN116580195A (en) | Semantic Segmentation Method and System of Remote Sensing Image Based on ConvNeXt Convolution | |
| CN115035097B (en) | Cross-scene strip steel surface defect detection method based on domain adaptation | |
| CN113506239B (en) | Strip steel surface defect detection method based on cross-stage local network | |
| Shit et al. | An encoder‐decoder based CNN architecture using end to end dehaze and detection network for proper image visualization and detection | |
| Li et al. | Fabric defect segmentation system based on a lightweight GAN for industrial Internet of Things | |
| CN114820541A (en) | Defect detection method based on reconstructed network | |
| CN117876842A (en) | A method and system for detecting anomalies of industrial products based on generative adversarial networks | |
| CN114332473A (en) | Object detection method, object detection device, computer equipment, storage medium and program product | |
| CN115546223A (en) | Method and system for detecting loss of fastening bolt of equipment under train | |
| CN111462140A (en) | A real-time image instance segmentation method based on block stitching | |
| CN114841992A (en) | Defect detection method based on recurrent generative adversarial network and structural similarity |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |
| GR01 | Patent grant | | |