
CN119130992A - Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion - Google Patents

Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion Download PDF

Info

Publication number
CN119130992A
CN119130992A
Authority
CN
China
Legal status
Granted
Application number
CN202411276764.3A
Other languages
Chinese (zh)
Other versions
CN119130992B (en)
Inventor
彭绍湖
钟天葵
彭凌西
黄伟彬
黄靖波
Current Assignee
Guangzhou University
Original Assignee
Guangzhou University
Priority date
Filing date
Publication date
Application filed by Guangzhou University
Priority: CN202411276764.3A
Publication of CN119130992A
Application granted
Publication of CN119130992B
Legal status: Active

Classifications

    • G06T7/0004 Industrial image inspection
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G06N3/096 Transfer learning
    • G06V10/26 Segmentation of patterns in the image field
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G06V10/806 Fusion of extracted features
    • G06V10/82 Recognition or understanding using neural networks
    • G06V20/70 Labelling scene content
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30108 Industrial image inspection
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an industrial defect detection method based on feature information reconstruction with multi-dimensional feature fusion, and relates to the technical field of defect detection. Compared with existing defect detection methods, it addresses the problems that traditional methods are sensitive to environmental changes, generalize poorly, cannot meet real-time requirements, and deliver sub-optimal reconstruction performance. The method introduces a cropped auto-encoder AE to reconstruct the normal feature information of a defect sample, reducing the parameter count and computation of the network. An embedded memory module replaces the cropped part, maintaining system performance while enriching the feature information of normal images and completing the reconstruction function of the reconstruction module. An MFF module is introduced to reuse the encoder's feature information in the AE reconstruction module, and an information-correction mechanism is added to the MFF module to prevent the reconstruction degradation caused by leakage of abnormal information; the normal feature information of a teacher network guides the correct use of the multi-dimensional feature information.

Description

Feature information reconstruction industrial defect detection method based on multidimensional feature fusion
Technical Field
The invention relates to the technical field of defect detection, in particular to a method for detecting industrial defects by reconstructing characteristic information based on multidimensional characteristic fusion.
Background
In industrial inspection, defect detection is a critical step. By detecting product defects, problems can be found and repaired in time, ensuring product quality and safety. However, traditional manual inspection is time-consuming and labor-intensive, and is prone to missed and false detections. Therefore, with the development of technology, machine vision has been widely introduced into the field of industrial defect detection and has achieved excellent results.
Conventional industrial defect detection techniques are based on texture features, shape features, or deep learning methods.
Defect detection methods based on texture features can be further divided into two categories: statistical methods and signal-processing methods. Statistical methods treat the gray-value distribution on an object's surface as a random distribution, analyze that distribution from a statistical angle, and describe the spatial distribution of gray values through features such as histogram statistics, the gray-level co-occurrence matrix, local binary patterns, autocorrelation functions, and mathematical morphology. Signal-processing methods treat the image as a two-dimensional signal and analyze it from the viewpoint of signal-filter design, so they are also called spectral methods.
Methods based on shape features can effectively exploit the target of interest in an image for search. Among them, contour-based methods are the main type: they obtain the shape parameters of an image by describing the outer boundary of an object, with the Hough transform and Fourier shape descriptors as representative examples. The Hough transform uses the global features of the image to connect edge pixels into a closed region boundary; its theoretical basis is the point-line duality. This method is mainly used for detecting defects on bottle surfaces, where in the ROI-extraction stage the boundary line of the light source is detected with a fast Hough transform.
However, traditional detection methods based on texture and shape features often rely on simple feature description and classification, struggle with complex texture and shape variations, and react sensitively to changes in environmental brightness, so over-bright or over-dark conditions distort the histogram statistics.
Deep-learning-based methods can likewise be divided into two categories: reconstruction-based and embedding-based industrial defect detection. Reconstruction-based techniques model image feature information in an embedding space with an auto-encoder AE and reconstruct from that space; because abnormal information is never seen during training, it cannot be reconstructed, so the difference between a detected image and its reconstruction indicates the anomaly. The core idea of embedding-based methods is to store target feature information with a pre-trained model and use it either directly or indirectly: the former compares strategically compressed feature information with the sample's features to localize anomalies, while the latter builds a memory bank of feature information by various means to assist localization.
To improve reconstruction performance, the reconstruction modules of reconstruction-based detection methods use neural network models with excessive parameters and computation, so the whole system becomes too heavy to meet real-time requirements; and because the utilization of feature information is neglected, reconstruction performance remains sub-optimal. Embedding-based detection methods, meanwhile, suffer from low system generalization due to the extra computation and embedding modules they introduce.
To solve these problems, the invention provides an industrial defect detection method based on feature information reconstruction with multi-dimensional feature fusion.
Disclosure of Invention
The invention aims to provide a feature-information-reconstruction industrial defect detection method based on multi-dimensional feature fusion, so as to solve the problems identified in the background:
traditional defect detection methods are sensitive to environmental changes, generalize poorly, cannot meet real-time requirements, deliver poor reconstruction performance, and yield systems with low generalization.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
The industrial defect detection method based on feature information reconstruction with multi-dimensional feature fusion comprises the following steps:
S1, building a cropped auto-encoder AE;
S2, introducing a teacher network to guide the cropped auto-encoder AE to complete the reconstruction of normal feature information;
S3, building an embedded memory module: defining a group of memories, addressing the memory group by similarity with the input features, and obtaining the reconstructed features from the similarities;
S4, constructing an MFF module so that the encoder's multi-scale features guide the decoder's feature reconstruction;
S5, constructing a lightweight segmentation network to complete the segmentation and localization of defects.
Preferably, in the cropped auto-encoder AE of S1, the encoder and the decoder are both composed of the first three blocks of an untrained ResNet18. The encoder is denoted E_k, k ∈ {1, …, N}, and the decoder D_k, k ∈ {1, …, N}; they are linked in the middle by an embedded memory module M. The image feature information F_k ∈ R^(C_k×H_k×W_k) represents the projection of the original data I into the embedding space, where C_k, H_k and W_k denote the number of channels, the height and the width of the k-th layer activation tensor, respectively.
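As an illustration of the cropping, the activation shapes (C_k, H_k, W_k) of the first three ResNet18 stages and the number of convolution weights removed with the deepest stage can be tallied as follows. This is a hedged sketch assuming the standard torchvision ResNet18 layout (stride-4 stem, stage channels 64/128/256/512) and a 256×256 input; the patent's exact backbone configuration may differ.

```python
# Hedged sketch: feature-map shapes for a ResNet18-based "cropped" auto-encoder.
# Assumes the standard torchvision ResNet18 stage layout; the patent may differ.

def resnet18_first3_shapes(h=256, w=256):
    """Return (C_k, H_k, W_k) for the first three residual stages."""
    h, w = h // 4, w // 4          # 7x7 stride-2 stem + 3x3 stride-2 max-pool
    shapes = []
    for c, stride in [(64, 1), (128, 2), (256, 2)]:  # layer1..layer3
        h, w = h // stride, w // stride
        shapes.append((c, h, w))
    return shapes

def layer4_conv_params():
    """Conv weights removed by cropping ResNet18's deepest stage (no BN/bias)."""
    b1 = 256 * 512 * 9 + 512 * 512 * 9 + 256 * 512 * 1  # block 1 (+1x1 downsample)
    b2 = 512 * 512 * 9 * 2                               # block 2
    return b1 + b2

print(resnet18_first3_shapes())  # [(64, 64, 64), (128, 32, 32), (256, 16, 16)]
print(layer4_conv_params())      # 8388608 weights avoided by the crop
```

The deepest stage alone accounts for most of ResNet18's convolution weights, which is why cropping it sharply reduces the parameter count.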
Preferably, the teacher network of S2 uses a ResNet18 model pre-trained on the ImageNet dataset and uses the feature information tensors of its first three block layers for information guidance, with cosine similarity as the KD loss. The decoder feature tensors F_D^k, k ∈ {1, …, N}, are obtained through the auto-encoder AE, and the normal feature tensors F_T^k, k ∈ {1, …, N}, are obtained by passing the normal image I_n through the pre-trained teacher network. Taking the channel dimension C_k as the axis, the vector cosine similarity loss between F_D^k and F_T^k is used as the difference between normal and abnormal feature information; by reducing this similarity loss, the auto-encoder AE acquires the ability to reconstruct normal feature information from an abnormal image. The similarity loss L_rec(I_n, I_ca) is calculated as follows:

d_k(h, w) = 1 − [F_D^k(h, w) · F_T^k(h, w)] / (‖F_D^k(h, w)‖ ‖F_T^k(h, w)‖)

L_rec(I_n, I_ca) = Σ_{k=1}^{N} (1 / (H_k · W_k)) Σ_{h=1}^{H_k} Σ_{w=1}^{W_k} d_k(h, w)

wherein d_k(h, w) is the vector cosine similarity loss at position (h, w), and I_ca is the abnormal image.
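The cosine-similarity loss can be sketched numerically. The following numpy illustration (the shapes and the eps stabilizer are assumptions, not taken from the patent) computes the per-position cosine distance between a decoder feature map and a teacher feature map, then averages over space and sums over feature levels:

```python
# Hedged numpy sketch of the cosine-similarity reconstruction loss:
# per-position cosine distance between decoder and teacher feature maps,
# averaged over space and summed over the feature levels.
import numpy as np

def cosine_distance_map(fd, ft, eps=1e-8):
    """d_k(h, w) = 1 - cos(fd[:, h, w], ft[:, h, w]); inputs are (C, H, W)."""
    dot = (fd * ft).sum(axis=0)
    norm = np.linalg.norm(fd, axis=0) * np.linalg.norm(ft, axis=0)
    return 1.0 - dot / (norm + eps)

def rec_loss(decoder_feats, teacher_feats):
    """Sum over levels k of the spatial mean of d_k(h, w)."""
    return sum(cosine_distance_map(fd, ft).mean()
               for fd, ft in zip(decoder_feats, teacher_feats))

f = np.random.rand(8, 4, 4)
print(rec_loss([f], [f]))         # identical features -> ~0.0
```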
Preferably, the input of the embedded module in S3 is the query feature z, and the embedded module defines a group of memories M = {m_i} used to produce the reconstructed feature ẑ. Similarity addressing is performed on the memory group with the input feature:

ẑ = Σ_i ω_i · m_i

wherein m_i is an item of the memory group, i indexes the memories, w = {ω_i} is the set of similarities between the input feature and the memory items, and ω_i denotes the similarity between the input feature and the i-th memory item.

The similarity ω_i is calculated from the cosine distance:

ω_i = exp( z m_i^T / (‖z‖ ‖m_i‖) ) / Σ_j exp( z m_j^T / (‖z‖ ‖m_j‖) )

wherein z is the input feature and m_i^T is the transpose of the memory vector.

The memory items are thus recombined in proportion to their similarities, finally giving a processing procedure that takes the query feature as input and the recombined feature as output, the output being defined as ẑ.

The memory group M is obtained through a first round of model-distillation training. During this first round, normal sample images I_n are input to the auto-encoder AE; normal feature information z_n is obtained after the Encoder; the memory group is updated internally as model parameters, storing dimensionality-reduced normal feature information as memory data; similarity addressing over the memory group yields the reconstructed feature ẑ_n; and fitting ẑ_n to z_n finally yields the memory group M of the embedded module.
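The similarity addressing can be sketched as follows, in the style of MemAE-like memory modules, which this description resembles; the softmax normalisation of the cosine similarities and all shapes are assumptions for illustration:

```python
# Hedged sketch of memory-group similarity addressing: softmax over cosine
# similarities between the query feature and the memory items, then a
# weighted recombination of the items into the reconstructed feature.
import numpy as np

def address_memory(z, memory, eps=1e-8):
    """z: (C,) query feature; memory: (I, C) items m_i. Returns (z_hat, w)."""
    sims = memory @ z / (np.linalg.norm(memory, axis=1) * np.linalg.norm(z) + eps)
    w = np.exp(sims) / np.exp(sims).sum()        # omega_i, normalised to sum to 1
    return w @ memory, w                         # z_hat = sum_i omega_i * m_i

rng = np.random.default_rng(0)
mem = rng.normal(size=(16, 8))                   # 16 memory items, 8 channels
z_hat, w = address_memory(mem[3], mem)
print(w.argmax())                                # index of the most similar item
```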
Preferably, the MFF module input in S4 consists of the feature outputs {E_1, E_2, E_3} of the different blocks of the encoder of the auto-encoder AE. E_1, E_2 and E_3 are pooled so that their feature dimensions are aligned after processing; the pooled E_1, E_2, E_3 form a feature information block containing semantic information at multiple scales. The block features are fused along the channel direction by 1×1 convolution, and the feature information block is channel-adjusted in turn to the channel and feature sizes of the corresponding decoder layers, giving M_3, M_2, M_1 matched to {D_k, k ∈ {3, 2, 1}}.

The MFF module also introduces a difference-information suppression function, which measures, based on the cosine formula, the difference between the multi-scale feature information M_3, M_2, M_1 and the teacher network's feature tensors T_k:

d_k(h, w) = 1 − [M_k(h, w) · T_k(h, w)] / (‖M_k(h, w)‖ ‖T_k(h, w)‖)

A threshold is generated with the activation function ReLU(·) and used to suppress the transfer of abnormal information:

M̂_k(h, w) = ReLU(1 − d_k(h, w)) · M_k(h, w)

The resulting multi-scale fusion information with abnormal information suppressed acts directly on the corresponding decoder feature layer of the auto-encoder AE.
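One plausible reading of the suppression step can be sketched as follows: positions where the fused multi-scale feature disagrees with the teacher's normal feature (low cosine similarity) are gated toward zero with ReLU. The exact gating form is an assumption, not taken verbatim from the patent.

```python
# Hedged sketch of the MFF anomaly-suppression step: a per-position ReLU gate
# on the cosine similarity between fused features and teacher features.
import numpy as np

def suppress_anomalous(mk, tk, eps=1e-8):
    """mk, tk: (C, H, W). Gate = ReLU(cos(mk, tk)) per position, broadcast over C."""
    cos = (mk * tk).sum(0) / (np.linalg.norm(mk, axis=0) * np.linalg.norm(tk, axis=0) + eps)
    gate = np.maximum(cos, 0.0)                  # ReLU threshold on the similarity
    return mk * gate[None]                       # suppress dissimilar positions

t = np.ones((4, 2, 2))
m = t.copy(); m[:, 0, 0] = -1.0                  # one "anomalous" position
out = suppress_anomalous(m, t)
print(out[:, 0, 0])                              # gated to zero at that position
```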
Preferably, in S5 the segmentation network takes the feature information {T_1, T_2, T_3} obtained by passing the abnormal image through the teacher network and the feature information {S_D1, S_D2, S_D3} of the Decoder of the auto-encoder AE, and uses the difference of the feature maps between them as the input of the segmentation network.

In the segmentation network, the deeper information of the high-level semantic features is progressively transferred to the shallow features by an IFC module. The outputs F_k of the segmentation network's Encoder serve as the inputs of the IFC module. The output of each layer is convolved and then passed through a pooling layer; a CA attention block is introduced for the feature blocks of different scales, whose dimensions and channels remain unchanged through the CA attention module; and after the CA block the channel number is convolved and adjusted again. The output of the IFC module is obtained by combining these processed features.

Then, following the basic U-Lite structure, the processed multi-size feature information is fused with the corresponding Decoder feature information block to obtain the fused feature information.

Through the above process, the IFC transmits the deepest feature information upward layer by layer, so that each shallow feature contains feature information from the deeper layers; the multi-scale feature information output by the corresponding IFC is finally passed to the Decoder module of the U-Lite to complete the segmentation function, and the segmentation module outputs a segmentation mask Mask_pre of a specified size.
Preferably, the segmentation network is trained with a focal loss and an L1 loss.

Mask_pre and the anomaly mask Mask_GT are linearly resampled to the same size, where (i, j) denotes the pixel coordinates of the image:

L_focal = −Σ_{i,j} (1 − p_ij)^γ · log(p_ij)

L_l1 = Σ_{i,j} |Mask_pre(i, j) − Mask_GT(i, j)|

wherein γ is a hyper-parameter, p_ij is the linearly resampled prediction for the true class of pixel (i, j), L_focal is the focal loss, and L_l1 is the L1 loss.

Training of the segmentation network is completed with these two loss functions, and the trained model finally outputs a mask showing the per-pixel classification of the detected sample.
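The two training losses can be sketched as follows: the focal term down-weights easy pixels by the factor (1 − p)^γ, and the L1 term compares the predicted and ground-truth masks. Reducing by the mean and the eps stabilizer are assumptions for illustration:

```python
# Hedged numpy sketch of the segmentation losses: focal loss + L1 loss
# between a predicted per-pixel anomaly probability map and a binary mask.
import numpy as np

def focal_loss(p, mask_gt, gamma=2.0, eps=1e-8):
    """p: predicted anomaly probability per pixel; mask_gt in {0, 1}."""
    pt = np.where(mask_gt == 1, p, 1.0 - p)      # prob assigned to the true class
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt + eps)))

def l1_loss(p, mask_gt):
    return float(np.mean(np.abs(p - mask_gt)))

gt = np.array([[1.0, 0.0], [0.0, 1.0]])
good = np.array([[0.9, 0.1], [0.1, 0.9]])
bad = 1.0 - good
print(focal_loss(good, gt) < focal_loss(bad, gt))   # confident correct pixels cost less
```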
Compared with the prior art, the invention provides a feature-information-reconstruction industrial defect detection method based on multi-dimensional feature fusion, with the following beneficial effects:
The invention introduces a cropped auto-encoder AE to complete the reconstruction of the normal feature information of a defect sample; cropping the deepest convolutional layers greatly reduces the parameter count and computation of the network. An embedded memory module replaces the cropped convolutional layers, preventing performance degradation while enriching the feature information of normal industrial sample images and completing the reconstruction function of the reconstruction module. An information-correction mechanism is additionally introduced into the MFF module to prevent the reconstruction degradation caused by leakage of abnormal information during multi-dimensional feature reconstruction, and the normal feature information of the teacher network is used to guide the correct use of the multi-dimensional feature information.
Drawings
FIG. 1 is a flow chart of the detection method in embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of the detection network mentioned in embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of the IFC module of embodiment 1 of the present invention;
fig. 4 is a schematic diagram of a teacher network for feature fusion students according to embodiment 1 of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings; it is apparent that the described embodiments are only some, rather than all, of the embodiments of the invention.
The invention introduces a cropped auto-encoder AE to complete the reconstruction of the normal feature information of a defect sample; cropping the deepest convolutional layers greatly reduces the parameter count and computation of the network. An embedded memory module replaces the cropped convolutional layers, preventing performance degradation while enriching the feature information of normal industrial sample images and completing the reconstruction function of the reconstruction module. An information-correction mechanism is additionally introduced into the MFF module to prevent the reconstruction degradation caused by leakage of abnormal information during multi-dimensional feature reconstruction, and the normal feature information of the teacher network is used to guide the correct use of the multi-dimensional feature information. The details are as follows.
Example 1:
Referring to FIGS. 1-4, the feature-information-reconstruction industrial defect detection method based on multi-dimensional feature fusion of the present invention is applied to defect detection on the MVTec AD dataset as follows.
In the preparation phase, the images of the MVTec AD dataset are normalized with the dataset's mean and variance, and resized to 256×256 using bilinear interpolation.
Abnormal-image simulation: random two-dimensional Perlin noise is generated and binarized with a preset threshold to obtain an anomaly mask M. The anomaly image I_ca is generated by replacing the mask region with a linear combination of the anomaly-free image I_n and an arbitrary image A from an external data source.
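The anomaly-simulation step can be sketched as follows. Plain uniform noise stands in for the two-dimensional Perlin noise used by the method, and the blend factor beta is an assumption; the structure (binarize a noise field into a mask M, then blend an external image A into the masked region of the normal image I_n) follows the description above:

```python
# Hedged sketch of anomaly simulation: binarise a noise field into mask M,
# then blend an external image A into the masked region of a normal image.
# Uniform noise stands in for Perlin noise; thresh and beta are assumptions.
import numpy as np

def simulate_anomaly(i_n, a, noise, thresh=0.5, beta=0.7):
    """i_n, a, noise: (H, W) arrays in [0, 1]. Returns (i_ca, mask)."""
    m = (noise > thresh).astype(float)           # binary anomaly mask M
    blended = beta * a + (1.0 - beta) * i_n      # linear combination inside M
    return (1.0 - m) * i_n + m * blended, m

rng = np.random.default_rng(1)
i_n, a, noise = rng.random((3, 32, 32))
i_ca, mask = simulate_anomaly(i_n, a, noise)
print(np.array_equal(i_ca[mask == 0], i_n[mask == 0]))  # outside M unchanged
```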
The defect detection model is built, and the method comprises the following steps:
S1, building the cropped auto-encoder AE, namely the student network, specifically comprises the following steps:
With reference to the skip-connection (link) layer of the U-Net network, whose connection structure transfers feature information and thereby improves model performance, the proposed multi-scale-feature-fusion student-teacher network (see FIG. 4) uses the same connection structure to help the model transfer feature information once multi-scale feature fusion is completed.
In the auto-encoder AE, the encoder and the decoder are both composed of the first three blocks of an untrained ResNet18. The encoder can be denoted E_k, k ∈ {1, …, N}, the decoder D_k, k ∈ {1, …, N}; they are linked in the middle by an embedded memory module M. The image feature information F_k ∈ R^(C_k×H_k×W_k) represents the projection of the original data I into the embedding space, where C_k, H_k and W_k denote the number of channels, the height and the width of the k-th layer activation tensor, respectively.
S2, introducing a teacher network to guide the cropped auto-encoder AE to complete the reconstruction of normal feature information comprises the following steps:
The teacher network uses a ResNet18 model pre-trained on the ImageNet dataset and, for data alignment, uses the feature information tensors of its first three block layers for information guidance; cosine similarity is used as the KD loss for knowledge transfer in the T-S model, so as to accurately capture the relations in high- and low-dimensional information. The decoder feature tensors F_D^k are obtained through the auto-encoder AE. Referring to the student-network denoising mechanism of the DeSTSeg model, the normal image I_n yields the normal feature tensors F_T^k through the pre-trained teacher network ResNet18. Taking the channel C_k as the axis, the vector cosine similarity loss between F_D^k and F_T^k is used as the difference between normal and abnormal feature information, and reducing this similarity loss gives the auto-encoder AE the ability to reconstruct normal feature information from an abnormal image. The loss formula is:

d_k(h, w) = 1 − [F_D^k(h, w) · F_T^k(h, w)] / (‖F_D^k(h, w)‖ ‖F_T^k(h, w)‖)

L_rec(I_n, I_ca) = Σ_{k=1}^{N} (1 / (H_k · W_k)) Σ_{h=1}^{H_k} Σ_{w=1}^{W_k} d_k(h, w)

wherein d_k(h, w) is the vector cosine similarity loss, I_ca is the abnormal image, and L_rec(I_n, I_ca) is the similarity loss.
However, reconstructing normal feature information from abnormal feature information is a challenging task. Referring to TrustMAE, introducing additional memory information can help the decoder part of the student network better complete the feature-reconstruction task. The introduced embedded module therefore replaces the original deepest block of ResNet18, i.e. the 2× conv3×3 convolutions with C = 512.
S3, building the embedded memory module (defining a group of memories, addressing the memory group by similarity with the input features, and obtaining the reconstructed features from the similarities) comprises the following steps:
The input of the embedded module is the query feature z, and the embedded module defines a group of memories M = {m_i} used to produce the reconstructed feature ẑ. Similarity addressing is performed on the memory group with the input feature z:

ẑ = Σ_i ω_i · m_i

wherein m_i is an item of the memory group, i indexes the memories, w = {ω_i} is the set of similarities between the input feature and the memory items, and ω_i denotes the similarity between the input feature and the i-th memory item.

The similarity ω_i is calculated from the cosine distance:

ω_i = exp( z m_i^T / (‖z‖ ‖m_i‖) ) / Σ_j exp( z m_j^T / (‖z‖ ‖m_j‖) )

wherein z is the input feature and m_i^T is the transpose of the memory vector.

The memory items are thus recombined in proportion to their similarities, finally giving a processing procedure that takes the query feature as input and the recombined feature as output, the output being defined as ẑ.

The memory group M is obtained through a first round of model-distillation training. During this first round, the student network takes normal sample images I_n as input; the data passes through the student network Encoder to obtain normal feature information z_n; the memory group is updated internally as model parameters, storing dimensionality-reduced normal feature information as memory data; similarity addressing over the memory group yields the reconstructed feature ẑ_n; and fitting ẑ_n to z_n finally yields the memory group M of the embedded module.
S4, constructing the MFF module so that the encoder's multi-scale features guide the decoder's feature reconstruction comprises the following steps:
The MFF (Multi-scale feature fusion) module inputs are characteristic outputs from different blocks of the encoder AE In order to reduce the burden of an MFF module on the operand, notifying the patch information of the extracted features, pooling E 1,E2,E3, finishing the alignment of feature dimensions after processing, fusing the multi-scale feature information, taking E 1,E2,E3 after pooling as a feature information block, wherein the feature information block contains semantic information of a plurality of scales, fusing the information of the block features in the channel direction through 1X 1, and in order to check in, i.e. guiding features of the same level by the same level of semantic, avoiding the error of the semantic guidance, and carrying out channel adjustment on the feature information block in sequence, wherein the method comprises the steps ofCorresponding channel and feature size, M 3,M2,M1={pool(Dk, k.epsilon.3, 2, 1) } can be obtained due to the feature information of the self-encoder AEIf the information is directly guided, abnormal characteristic information leakage problem during characteristic information reconstruction can be generated, so that the MFF module introduces a difference information suppression function, wherein the difference information suppression function requires two input data, namely multi-scale characteristic information M 3,M2,M1 and characteristic information tensor of a teacher networkThe former has abnormal characteristic information, the latter only has characteristic information of normal images, the difference between the characteristic information represents abnormal information points, and the difference between the two is measured based on a cosine formula:
A threshold is generated with the activation function ReLU(·), and the transfer of anomalous information is suppressed by this threshold:
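A minimal sketch of the suppression step, per spatial location: the cosine distance between the fused (possibly anomalous) feature and the normal teacher feature flags disagreement, and a ReLU-style gate attenuates those locations. The exact gate form max(0, 1 − d) is an illustrative assumption; the extracted text does not show the precise formula:

```python
import math

def cosine_dist(a, b):
    # Cosine distance: 0 for identical directions, up to 2 for opposite.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1e-12
    nb = math.sqrt(sum(x * x for x in b)) or 1e-12
    return 1.0 - dot / (na * nb)

def suppress_anomalies(fused, teacher):
    # fused / teacher: lists of per-location channel vectors. Locations
    # where the fused features disagree with the normal teacher features
    # get a small gate value and are attenuated before reaching the decoder.
    out = []
    for f, t in zip(fused, teacher):
        d = cosine_dist(f, t)        # large where features differ
        gate = max(0.0, 1.0 - d)     # ReLU-style gate in [0, 1]
        out.append([gate * x for x in f])
    return out
```

In this sketch a location whose fused feature points the same way as the teacher feature passes through almost unchanged, while an orthogonal (anomalous) one is zeroed out.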
The resulting anomaly-suppressed multi-scale fusion information acts directly on the corresponding feature layers of the student decoder of the AE.
S5, constructing a lightweight segmentation network to complete the segmentation and localization of defects, wherein the method comprises the following steps:
An anomalous image is fed to the teacher network to obtain feature information {T_1, T_2, T_3} and to the decoder of the self-encoder AE to obtain feature information {S_D1, S_D2, S_D3}; the difference of the feature maps between them serves as the input of the segmentation network. Considering the cost in computational performance, before segmentation the data to be segmented is reduced in dimension by a ResNet18 residual block, lowering the channel dimension to 3.
Referring to fig. 3, in the segmentation network the deeper information of high-level semantic features is further transferred, step by step, to the shallow features through the IFC module, which completes a multi-path refinement function and resolves the information discrepancy introduced by the reconstructed multi-scale feature information. The outputs of the segmentation-network encoder serve as the inputs of the IFC module. Each layer's output is first convolved to compress the channels, and the feature size is then reduced by a pooling layer. A CA attention processing block is introduced for the feature blocks at different scales; the dimensions and channels of a feature block are unchanged before and after the CA attention module. To allow the subsequent fusion of multi-scale feature information, the channel number is adjusted by a further convolution after the CA block. The output of the IFC module is then obtained through the following formula:
Then, following the basic U-Lite structure, the processed multi-size feature information is fused with the corresponding Decoder feature information block to obtain the fused feature information
Through the above process, the IFC passes the deepest feature information upward layer by layer, so that each shallow feature contains feature information from the deeper layers below it. The multi-scale feature information output by the corresponding IFC is finally passed to the Decoder module of U-Lite, which completes the segmentation function; the segmentation module outputs a segmentation mask Mask_pre of a specified size.
The function of the segmentation network in the system is to segment the positions of the reconstructed features and the defect features in the image; that is, segmentation classifies the pixels of the image into normal pixels and abnormal pixels, finally completing defect localization for the defect sample.
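The per-pixel normal/abnormal classification above amounts to thresholding the segmentation network's score map; the threshold value 0.5 below is an illustrative assumption, not specified by the text:

```python
def binarize(anomaly_map, thresh=0.5):
    # Classify each pixel of the segmentation output as normal (0) or
    # defective (1) by thresholding its per-pixel score.
    return [[1 if v >= thresh else 0 for v in row] for row in anomaly_map]

mask = binarize([[0.9, 0.1], [0.4, 0.7]])
```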
The segmentation network optimizes segmentation training with a focal loss and an L1 loss. The focal loss helps the model focus on minority classes and difficult samples, while the L1 loss improves the sparsity of the output so that the boundary of the segmentation mask becomes clearer. Mask_pre and the anomaly mask Mask_GT are linearly sampled to the same size, where i, j denote the xy coordinates of the image:
wherein γ is a hyperparameter, p_ij is the linear sampling result, L_focal is the focal loss, and L_l1 is the L1 loss;
Training of the segmentation network is completed through these two loss functions, and the segmentation network model finally outputs a Mask showing the pixel classification of the detected sample.
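The two losses can be sketched directly from the definition of p_ij given in claim 7 (p_ij = Mask_GT·Mask_pre + (1 − Mask_GT)·(1 − Mask_pre), i.e. the probability assigned to the correct class). The focal form −(1 − p)^γ·log(p), γ = 2, and the mean reduction are standard choices assumed here for illustration:

```python
import math

def seg_losses(mask_pre, mask_gt, gamma=2.0, eps=1e-8):
    # mask_pre, mask_gt: flat lists of per-pixel values in [0, 1].
    # p = gt*pre + (1-gt)*(1-pre) is the probability assigned to the
    # correct class; the focal term down-weights easy pixels (p near 1),
    # and the L1 term encourages sparse, sharp-edged masks.
    n = len(mask_pre)
    focal = l1 = 0.0
    for pre, gt in zip(mask_pre, mask_gt):
        p = gt * pre + (1.0 - gt) * (1.0 - pre)
        focal += -((1.0 - p) ** gamma) * math.log(p + eps)
        l1 += abs(pre - gt)
    return focal / n, l1 / n
```

A perfect prediction drives both terms to zero, while uncertain pixels (p near 0.5) dominate the focal term, which is how the loss concentrates training on hard samples.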
Training phase:
adapting to the data distribution and obtaining the storage module:
traversing the normal images of the industrial samples to obtain the feature outputs of the student network and the teacher network; the teacher network guides the student network to reconstruct the feature information:
initializing the embedding module, and using the loss function L_m to make the output of the student network approach that of the teacher network, completing the construction of the embedding module:
The model is trained using the Adam optimizer: Adam(L_rec + L_m).
Reconstructing abnormal characteristic information:
traversing the normal images I_n and the anomalous images I_ca;
I_n and I_ca are used as the inputs of the teacher network and the student network, respectively, to obtain the image feature information;
taking the student encoder output as the MFF module input to obtain the multi-dimensional feature fusion information;
injecting the multi-dimensional feature fusion information into the student decoder module;
calculating the loss function:
The model is trained using the SGD optimizer: SGD(L_rec).
Training a segmentation network:
reconstructing the output feature information with the student network, taking the teacher output feature information, and combining the two:
The channels are adjusted using ResNet18 residual modules:
inputting the adjusted feature information into the segmentation network to obtain the output of the segmentation network:
Calculating the loss function and training the model, where Mask_GT represents the correct classification mask of the image:
L_focal = L_focal(Mask_pre, Mask_GT)
Adam(L_focal)
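The three training phases above can be summarized as a schedule. The losses and optimizer steps are stubbed as plain callables here, purely to illustrate the order and the loss each phase optimizes; the stub values are not from the source:

```python
def run_training(loss_rec, loss_m, loss_focal, step):
    # Phase 1: adapt to the data distribution and build the memory group
    # (optimize L_rec + L_m; the text uses Adam here).
    step("phase1/adam", loss_rec() + loss_m())
    # Phase 2: anomalous-feature reconstruction (optimize L_rec with SGD).
    step("phase2/sgd", loss_rec())
    # Phase 3: train the segmentation network (optimize L_focal with Adam).
    step("phase3/adam", loss_focal())

# Illustrative run with constant stub losses and a recording "optimizer".
history = []
run_training(lambda: 0.5, lambda: 0.1, lambda: 0.3,
             lambda name, loss: history.append((name, loss)))
```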
Reasoning:
The defect detection flow of the system comprises: inputting the image to be detected into the student-network-based AE {E_k, D_k}, the teacher network, and the segmentation network; traversing the images to be detected; and outputting the segmented image of each image to be detected. The flow specifically comprises the following steps:
preprocessing is completed by adjusting the size and the mean/variance of the image to be detected;
the image to be detected is passed as input to the teacher and student networks, and the student network completes the feature information reconstruction:
adjusting the channel number of the feature information using ResNet18 residual modules:
obtaining the defect segmentation image output of the image to be detected using the segmentation network Seg:
Finally, the system outputs the defect segmentation image of the image to be detected.
On the MVTec AD dataset, the defect detection method is compared with the conventional PatchCore, EfficientAD, RD, RD++, DRAEM and FastFlow methods; reference may be made to Table 1:
Table 1 Computational efficiency performance
As can be seen from the table, in the method of this embodiment the image-level AUROC (Area Under the Receiver Operating Characteristic Curve) averages 99.40% on the MVTec AD public industrial defect data, the pixel-level AUROC averages 98.25%, and the PRO (Per-Region Overlap) metric averages 95.44%. Moreover, the image-level defect detection performance reaches the SOTA level on the MPDD public industrial defect data.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art according to the technical scheme of the present invention and its inventive concept, within the scope disclosed by the present invention, shall be covered by the scope of protection of the present invention.

Claims (7)

1. A method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion, characterized by comprising the following steps:
S1, building a clipping-type self-encoder AE;
S2, introducing a teacher network to guide the clipping-type self-encoder AE to complete the reconstruction target of normal feature information;
S3, building an embedding module, defining a group of memories, performing similarity addressing on the memory group using the input features, and obtaining reconstructed features according to the similarity;
S4, constructing an MFF module so that the encoder's multi-scale features guide the decoder's feature reconstruction;
S5, constructing a lightweight segmentation network to complete the segmentation and localization of industrial sample defects.
2. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein in the clipping-type self-encoder AE of S1, the encoder and the decoder are both composed of the first three blocks of a ResNet18 without pre-training; the encoder is represented as E_k, k ∈ {1, …, N}, and the decoder as D_k, k ∈ {1, …, N}, linked in the middle by an embedded memory module M; the image feature information of each block represents a projection of the original data I into the embedding space, wherein C_k, H_k, W_k respectively represent the number of channels, the height and the width of the k-th layer activation tensor.
3. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein the teacher network of S2 uses a ResNet model pre-trained on the ImageNet dataset, conducting information guidance with the feature information tensors of the first three block layers and using cosine similarity as the KD loss; the feature information tensor of the decoder is obtained through the self-encoder AE, and the normal feature information tensor is obtained by passing the normal image I_n through the pre-trained teacher network; taking the channel C_k as the axis, the similarity loss between the two is computed as the difference between normal and anomalous feature information; by reducing this similarity loss, the self-encoder AE acquires the ability to reconstruct the normal feature information of an anomalous image; the similarity loss L_rec(I_n, I_ca) is calculated as follows:
wherein d_k(h, w) is the vector cosine similarity loss, and I_ca is the anomalous image.
4. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein the input of the embedding module in S3 is the feature information output by the encoder; the embedding module defines a group of internal memories for reconstructing the features, and similarity addressing is performed on the memory group using the input features:
wherein m_i is an item of the memory group, i is the number of memories, w is the set of similarities between the input feature and the memory group features, and ω_i represents the similarity between the input feature and the i-th memory feature;
the similarity ω_i is calculated by the cosine distance:
wherein the first term is the input feature, and the superscript T denotes the transpose of the vector;
the memory group is recombined from the features in different size proportions according to the magnitude of the similarity, finally giving a processing procedure that takes the query feature as input and the recombined feature as output, the output being defined as
the memory group M is obtained through a first round of model distillation training: during this first round, a normal sample image I_n is input to the self-encoder AE, and the data passes through the encoder to produce normal feature information; the memory group is updated internally as model parameters, dimension-reduced normal feature information is stored as memory data, and similarity addressing over the memory group yields reconstructed features; by fitting the reconstructed features to the encoder output, the memory group M embedded in the module is finally obtained.
5. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein the inputs of the MFF module in S4 are the feature outputs E_1, E_2, E_3 from different blocks of the encoder of the self-encoder AE; E_1, E_2, E_3 are pooled so that their feature dimensions are aligned, the pooled E_1, E_2, E_3 form a feature information block containing semantic information at multiple scales, information fusion along the channel direction is performed on the block features by a 1×1 convolution, and the feature information block is adjusted, in sequence, to the channel number and feature size corresponding to the decoder blocks, resulting in M_3, M_2, M_1 = {pool(D_k), k ∈ 3, 2, 1};
the MFF module further introduces a difference-information suppression function, measuring, based on a cosine formula, the difference between the multi-scale feature information M_3, M_2, M_1 and the feature information tensor of the teacher network:
a threshold is generated with the activation function ReLU(·), and the transfer of anomalous information is suppressed by this threshold:
the anomaly-suppressed multi-scale fusion information acts directly on the corresponding feature layers of the decoder of the self-encoder AE.
6. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein in S5 an anomalous image is fed to the teacher network to obtain feature information {T_1, T_2, T_3} and to the decoder of the self-encoder AE to obtain feature information {S_D1, S_D2, S_D3}, and the difference of the feature maps between them serves as the input of the segmentation network;
in the segmentation network, the deeper information of high-level semantic features is transferred, step by step, to the shallow features through an IFC module; the outputs of the segmentation-network encoder serve as the inputs of the IFC module, each layer's output is first convolved and then passed through a pooling layer, a CA attention processing block is introduced for the feature blocks at different scales, the dimensions and channels of a feature block are unchanged before and after the CA attention module, and the channel number is adjusted by a further convolution after the CA block; the output of the IFC module is obtained through the following formula:
then, following the basic U-Lite structure, the processed multi-size feature information is fused with the corresponding Decoder feature information block to obtain the fused feature information
through the above process, the IFC passes the deepest feature information upward layer by layer, so that each shallow feature contains feature information from the deeper layers below it; the multi-scale feature information output by the corresponding IFC is finally passed to the Decoder module of U-Lite to complete the segmentation function, and the segmentation module outputs a segmentation mask Mask_pre of a specified size.
7. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 6, wherein the segmentation network optimizes the segmentation training using a focal loss and an L1 loss;
Mask_pre and the anomaly mask Mask_GT are linearly sampled to the same size, where i, j denote the xy coordinates of the image:
p_ij = Mask_GT(i, j) × Mask_pre(i, j) + [1 − Mask_GT(i, j)] × [1 − Mask_pre(i, j)]
wherein γ is a hyperparameter, p_ij is the linear sampling result, L_focal is the focal loss, and L_l1 is the L1 loss;
training of the segmentation network is completed through these two loss functions, and the segmentation network model finally outputs a Mask showing the pixel classification of the detected sample.
CN202411276764.3A 2024-09-12 2024-09-12 Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion Active CN119130992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411276764.3A CN119130992B (en) 2024-09-12 2024-09-12 Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion

Publications (2)

Publication Number Publication Date
CN119130992A true CN119130992A (en) 2024-12-13
CN119130992B CN119130992B (en) 2025-08-05

Family

ID=93766861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411276764.3A Active CN119130992B (en) 2024-09-12 2024-09-12 Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion

Country Status (1)

Country Link
CN (1) CN119130992B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113434418A (en) * 2021-06-29 2021-09-24 扬州大学 Knowledge-driven software defect detection and analysis method and system
CN113572539A (en) * 2021-06-24 2021-10-29 西安电子科技大学 Storage-enhanced unsupervised spectral anomaly detection method, system, device, medium
CN114332007A (en) * 2021-12-28 2022-04-12 福州大学 A Transformer-based Industrial Defect Detection and Recognition Method
CN114742799A (en) * 2022-04-18 2022-07-12 华中科技大学 Industrial scene unknown type defect segmentation method based on self-supervision heterogeneous network
CN115641474A (en) * 2022-10-21 2023-01-24 华中科技大学 Unknown type defect detection method and device based on efficient student network
CN116597203A (en) * 2023-05-11 2023-08-15 电子科技大学 Knowledge distillation-based anomaly detection method for asymmetric self-encoder
CN116778269A (en) * 2023-05-30 2023-09-19 沈阳化工大学 A method for building a product surface defect detection model based on autoencoder reconstruction
CN116862894A (en) * 2023-07-26 2023-10-10 江南大学 An industrial defect detection method based on image restoration
CN116862885A (en) * 2023-07-14 2023-10-10 江苏济远医疗科技有限公司 Segmentation guide denoising knowledge distillation method and device for ultrasonic image lesion detection
CN117576535A (en) * 2024-01-15 2024-02-20 腾讯科技(深圳)有限公司 Image recognition method, device, equipment and storage medium
CN117593602A (en) * 2023-11-15 2024-02-23 上海大学 Knowledge distillation anomaly detection method based on asymmetric teacher student network
CN117635585A (en) * 2023-12-06 2024-03-01 华中科技大学 Texture surface defect detection method based on teacher-student network
US20240210329A1 (en) * 2022-12-27 2024-06-27 Zhejiang University Method for detecting abnormal defect on steel surface based on semi-supervised contrastive learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PENG, Shaohu: "A Domestic Trash Detection Model Based on Improved YOLOX", Sensors 2022, 22(18), 6974; https://doi.org/10.3390/s22186974, 3 August 2022 (2022-08-03) *
ZENG, Qilin; SHENG, Baoxin; SUN, Xu; WANG, Zhicheng; HUANG, Weibin: "Laser Non-Destructive Testing Technology for Aircraft Tires", Tire Industry, 10 February 2005 (2005-02-10) *
YAN, Ning: "Research on Deep-Learning-Based Defect Detection Algorithms for Industrial Products", Master's Thesis Electronic Journals, 15 May 2024 (2024-05-15) *

Also Published As

Publication number Publication date
CN119130992B (en) 2025-08-05

Similar Documents

Publication Publication Date Title
Qi et al. SGUIE-Net: Semantic attention guided underwater image enhancement with multi-scale perception
CN114677346B (en) Method for detecting end-to-end semi-supervised image surface defects based on memory information
CN117974693B (en) Image segmentation method, device, computer equipment and storage medium
Zhang et al. AIDEDNet: Anti-interference and detail enhancement dehazing network for real-world scenes
CN115439442A (en) Industrial product surface defect detection and positioning method and system based on commonality and difference
CN116645369A (en) Anomaly detection method based on twin self-encoder and two-way information depth supervision
Wu et al. AEKD: Unsupervised auto-encoder knowledge distillation for industrial anomaly detection
CN117635585A (en) Texture surface defect detection method based on teacher-student network
CN117173131B (en) Anomaly detection method based on distillation and memory-guided reconstruction
CN110930378A (en) Emphysema image processing method and system based on low data demand
CN117809123B (en) Anomaly detection and reconstruction method and system for double-stage image
CN117934425A (en) Image anomaly detection method based on self-supervision learning and knowledge distillation
Teng et al. Unsupervised learning method for underwater concrete crack image enhancement and augmentation based on cross domain translation strategy
CN116051382B (en) A data enhancement method based on deep reinforcement learning generative adversarial neural network and super-resolution reconstruction
CN117291850A (en) Infrared polarized image fusion enhancement method based on learnable low-rank representation
Zuluaga et al. Blind microscopy image denoising with a deep residual and multiscale encoder/decoder network
CN119648602A (en) Single image shadow removal method and computer device based on global context perception
CN118096601A (en) Image restoration method and system based on wavelet transformation and multi-scale residual error network
Zhao et al. Rethinking superpixel segmentation from biologically inspired mechanisms
Zhang et al. Multi morphological sparse regularized image super-resolution reconstruction based on machine learning algorithm
CN114862803A (en) A fast Fourier convolution-based anomaly detection method for industrial images
CN119206363A (en) Knowledge distillation and VAE-based multi-scale perception feature anomaly positioning method
CN119130992B (en) Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion
CN116342392B (en) Single remote sensing image super-resolution method based on deep learning
Gu et al. Unsupervised anomaly detection of industrial images based on dual generator reconstruction networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载