
CN119130992A - Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion - Google Patents

Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion Download PDF

Info

Publication number
CN119130992A
CN119130992A
Authority
CN
China
Legal status
Granted
Application number
CN202411276764.3A
Other languages
Chinese (zh)
Other versions
CN119130992B (en)
Inventor
彭绍湖
钟天葵
彭凌西
黄伟彬
黄靖波
Current Assignee
Guangzhou University
Original Assignee
Guangzhou University
Priority date
Filing date
Publication date
Application filed by Guangzhou University
Priority: CN202411276764.3A
Publication of CN119130992A
Application granted
Publication of CN119130992B
Legal status: Active

Classifications

    • G06T7/0004 Industrial image inspection
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G06N3/096 Transfer learning
    • G06V10/26 Segmentation of patterns in the image field
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G06V10/806 Fusion of extracted features
    • G06V10/82 Recognition or understanding using neural networks
    • G06V20/70 Labelling scene content
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30108 Industrial image inspection
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an industrial defect detection method based on feature information reconstruction with multi-dimensional feature fusion, and relates to the technical field of defect detection. Compared with existing defect detection methods, it addresses the problems that traditional methods are sensitive to environmental changes, generalize poorly, cannot meet real-time requirements, and deliver sub-optimal reconstruction performance. The method introduces a cropped auto-encoder AE to reconstruct the normal feature information of a defect sample, reducing the parameter count and computation of the network. An embedded memory module replaces the cropped part, maintaining system performance while enriching the feature information of normal images and completing the reconstruction function of the reconstruction module. An MFF module is introduced to reuse the encoder's feature information in the AE reconstruction module, and an information-correction mechanism is added to the MFF module to prevent the reconstruction degradation caused by leakage of abnormal information; the normal feature information of a teacher network guides the correct use of the multi-dimensional feature information.

Description

Feature information reconstruction industrial defect detection method based on multidimensional feature fusion
Technical Field
The invention relates to the technical field of defect detection, in particular to a method for detecting industrial defects by reconstructing characteristic information based on multidimensional characteristic fusion.
Background
In industrial inspection, defect detection is a critical step. By detecting product defects, problems can be found and repaired in time, ensuring product quality and safety. However, traditional manual inspection is time-consuming and labor-intensive, and is prone to missed and false detections. Therefore, with the development of technology, machine vision has been widely introduced into the field of industrial defect detection and has achieved excellent results.
Conventional industrial defect detection techniques are based on texture features, shape features, or deep learning methods.
Defect detection methods based on texture features can be further divided into two categories: statistical methods and signal-processing methods. Statistical methods treat the gray-value distribution on an object's surface as a random distribution, analyze that distribution from a statistical angle, and describe the spatial distribution of gray values through features such as histogram statistics, the gray-level co-occurrence matrix, local binary patterns, autocorrelation functions, and mathematical morphology. Signal-processing methods treat the image as a two-dimensional signal and analyze it from the viewpoint of signal-filter design, so they are also called spectral methods.
Methods based on shape features can effectively exploit the target of interest in an image for search. Among them, contour-based methods are the main type: they obtain the shape parameters of an image by describing the outer boundary of an object, with the Hough transform and Fourier shape descriptors as representative examples. The Hough transform uses the global features of the image to connect edge pixels into a closed region boundary; its theoretical basis is the point-line duality. This method is mainly used for detecting defects on bottle surfaces, where in the ROI-extraction stage the boundary line of the light source is detected with a fast Hough transform.
However, traditional detection methods based on texture and shape features often rely on simple feature description and classification, struggle with complex texture and shape variations, and react sensitively to changes in environmental brightness, so over-bright or over-dark conditions distort the histogram statistics.
Deep-learning-based methods can likewise be divided into two categories: reconstruction-based and embedding-based industrial defect detection. Reconstruction-based techniques model image feature information in an embedding space with an auto-encoder AE and reconstruct from that space; because abnormal information is never seen during training, it cannot be reconstructed, so the difference between a detected image and its reconstruction indicates the anomaly. The core idea of embedding-based methods is to store target feature information with a pre-trained model and use it either directly or indirectly: the former compares strategically compressed feature information with the sample's features to localize anomalies, while the latter builds a memory bank of feature information by various means to assist localization.
To improve reconstruction performance, the reconstruction modules of reconstruction-based detection methods use neural network models with excessive parameters and computation, so the whole system becomes too heavy to meet real-time requirements; and because the utilization of feature information is neglected, reconstruction performance remains sub-optimal. Embedding-based detection methods, meanwhile, suffer from low system generalization due to the extra computation and embedding modules they introduce.
To solve these problems, the invention provides an industrial defect detection method based on feature information reconstruction with multi-dimensional feature fusion.
Disclosure of Invention
The invention aims to provide a feature-information-reconstruction industrial defect detection method based on multi-dimensional feature fusion, so as to solve the problems identified in the background:
traditional defect detection methods are sensitive to environmental changes, generalize poorly, cannot meet real-time requirements, deliver poor reconstruction performance, and yield systems with low generalization.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
The industrial defect detection method based on feature information reconstruction with multi-dimensional feature fusion comprises the following steps:
S1, building a cropped auto-encoder AE;
S2, introducing a teacher network to guide the cropped auto-encoder AE to complete the reconstruction of normal feature information;
S3, building an embedded memory module: defining a group of memories, addressing the memory group by similarity with the input features, and obtaining the reconstructed features from the similarities;
S4, constructing an MFF module so that the encoder's multi-scale features guide the decoder's feature reconstruction;
S5, constructing a lightweight segmentation network to complete the segmentation and localization of defects.
Preferably, in the cropped auto-encoder AE of S1, the encoder and the decoder are both composed of the first three blocks of an untrained ResNet18. The encoder is denoted E_k, k ∈ {1, …, N}, and the decoder D_k, k ∈ {1, …, N}; they are linked in the middle by an embedded memory module M. The image feature information F_k ∈ R^(C_k×H_k×W_k) represents the projection of the original data I into the embedding space, where C_k, H_k and W_k denote the number of channels, the height and the width of the k-th layer activation tensor, respectively.
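As an illustration of the cropping, the activation shapes (C_k, H_k, W_k) of the first three ResNet18 stages and the number of convolution weights removed with the deepest stage can be tallied as follows. This is a hedged sketch assuming the standard torchvision ResNet18 layout (stride-4 stem, stage channels 64/128/256/512) and a 256×256 input; the patent's exact backbone configuration may differ.

```python
# Hedged sketch: feature-map shapes for a ResNet18-based "cropped" auto-encoder.
# Assumes the standard torchvision ResNet18 stage layout; the patent may differ.

def resnet18_first3_shapes(h=256, w=256):
    """Return (C_k, H_k, W_k) for the first three residual stages."""
    h, w = h // 4, w // 4          # 7x7 stride-2 stem + 3x3 stride-2 max-pool
    shapes = []
    for c, stride in [(64, 1), (128, 2), (256, 2)]:  # layer1..layer3
        h, w = h // stride, w // stride
        shapes.append((c, h, w))
    return shapes

def layer4_conv_params():
    """Conv weights removed by cropping ResNet18's deepest stage (no BN/bias)."""
    b1 = 256 * 512 * 9 + 512 * 512 * 9 + 256 * 512 * 1  # block 1 (+1x1 downsample)
    b2 = 512 * 512 * 9 * 2                               # block 2
    return b1 + b2

print(resnet18_first3_shapes())  # [(64, 64, 64), (128, 32, 32), (256, 16, 16)]
print(layer4_conv_params())      # 8388608 weights avoided by the crop
```

The deepest stage alone accounts for most of ResNet18's convolution weights, which is why cropping it sharply reduces the parameter count.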
Preferably, the teacher network of S2 uses a ResNet18 model pre-trained on the ImageNet dataset and uses the feature information tensors of its first three block layers for information guidance, with cosine similarity as the KD loss. The decoder feature tensors F_D^k, k ∈ {1, …, N}, are obtained through the auto-encoder AE, and the normal feature tensors F_T^k, k ∈ {1, …, N}, are obtained by passing the normal image I_n through the pre-trained teacher network. Taking the channel dimension C_k as the axis, the vector cosine similarity loss between F_D^k and F_T^k is used as the difference between normal and abnormal feature information; by reducing this similarity loss, the auto-encoder AE acquires the ability to reconstruct normal feature information from an abnormal image. The similarity loss L_rec(I_n, I_ca) is calculated as follows:

d_k(h, w) = 1 − [F_D^k(h, w) · F_T^k(h, w)] / (‖F_D^k(h, w)‖ ‖F_T^k(h, w)‖)

L_rec(I_n, I_ca) = Σ_{k=1}^{N} (1 / (H_k · W_k)) Σ_{h=1}^{H_k} Σ_{w=1}^{W_k} d_k(h, w)

wherein d_k(h, w) is the vector cosine similarity loss at position (h, w), and I_ca is the abnormal image.
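The cosine-similarity loss can be sketched numerically. The following numpy illustration (the shapes and the eps stabilizer are assumptions, not taken from the patent) computes the per-position cosine distance between a decoder feature map and a teacher feature map, then averages over space and sums over feature levels:

```python
# Hedged numpy sketch of the cosine-similarity reconstruction loss:
# per-position cosine distance between decoder and teacher feature maps,
# averaged over space and summed over the feature levels.
import numpy as np

def cosine_distance_map(fd, ft, eps=1e-8):
    """d_k(h, w) = 1 - cos(fd[:, h, w], ft[:, h, w]); inputs are (C, H, W)."""
    dot = (fd * ft).sum(axis=0)
    norm = np.linalg.norm(fd, axis=0) * np.linalg.norm(ft, axis=0)
    return 1.0 - dot / (norm + eps)

def rec_loss(decoder_feats, teacher_feats):
    """Sum over levels k of the spatial mean of d_k(h, w)."""
    return sum(cosine_distance_map(fd, ft).mean()
               for fd, ft in zip(decoder_feats, teacher_feats))

f = np.random.rand(8, 4, 4)
print(rec_loss([f], [f]))         # identical features -> ~0.0
```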
Preferably, the input of the embedded module in S3 is the query feature z, and the embedded module defines a group of memories M = {m_i} used to produce the reconstructed feature ẑ. Similarity addressing is performed on the memory group with the input feature:

ẑ = Σ_i ω_i · m_i

wherein m_i is an item of the memory group, i indexes the memories, w = {ω_i} is the set of similarities between the input feature and the memory items, and ω_i denotes the similarity between the input feature and the i-th memory item.

The similarity ω_i is calculated from the cosine distance:

ω_i = exp( z m_i^T / (‖z‖ ‖m_i‖) ) / Σ_j exp( z m_j^T / (‖z‖ ‖m_j‖) )

wherein z is the input feature and m_i^T is the transpose of the memory vector.

The memory items are thus recombined in proportion to their similarities, finally giving a processing procedure that takes the query feature as input and the recombined feature as output, the output being defined as ẑ.

The memory group M is obtained through a first round of model-distillation training. During this first round, normal sample images I_n are input to the auto-encoder AE; normal feature information z_n is obtained after the Encoder; the memory group is updated internally as model parameters, storing dimensionality-reduced normal feature information as memory data; similarity addressing over the memory group yields the reconstructed feature ẑ_n; and fitting ẑ_n to z_n finally yields the memory group M of the embedded module.
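The similarity addressing can be sketched as follows, in the style of MemAE-like memory modules, which this description resembles; the softmax normalisation of the cosine similarities and all shapes are assumptions for illustration:

```python
# Hedged sketch of memory-group similarity addressing: softmax over cosine
# similarities between the query feature and the memory items, then a
# weighted recombination of the items into the reconstructed feature.
import numpy as np

def address_memory(z, memory, eps=1e-8):
    """z: (C,) query feature; memory: (I, C) items m_i. Returns (z_hat, w)."""
    sims = memory @ z / (np.linalg.norm(memory, axis=1) * np.linalg.norm(z) + eps)
    w = np.exp(sims) / np.exp(sims).sum()        # omega_i, normalised to sum to 1
    return w @ memory, w                         # z_hat = sum_i omega_i * m_i

rng = np.random.default_rng(0)
mem = rng.normal(size=(16, 8))                   # 16 memory items, 8 channels
z_hat, w = address_memory(mem[3], mem)
print(w.argmax())                                # index of the most similar item
```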
Preferably, the MFF module input in S4 consists of the feature outputs {E_1, E_2, E_3} of the different blocks of the encoder of the auto-encoder AE. E_1, E_2 and E_3 are pooled so that their feature dimensions are aligned after processing; the pooled E_1, E_2, E_3 form a feature information block containing semantic information at multiple scales. The block features are fused along the channel direction by 1×1 convolution, and the feature information block is channel-adjusted in turn to the channel and feature sizes of the corresponding decoder layers, giving M_3, M_2, M_1 matched to {D_k, k ∈ {3, 2, 1}}.

The MFF module also introduces a difference-information suppression function, which measures, based on the cosine formula, the difference between the multi-scale feature information M_3, M_2, M_1 and the teacher network's feature tensors T_k:

d_k(h, w) = 1 − [M_k(h, w) · T_k(h, w)] / (‖M_k(h, w)‖ ‖T_k(h, w)‖)

A threshold is generated with the activation function ReLU(·) and used to suppress the transfer of abnormal information:

M̂_k(h, w) = ReLU(1 − d_k(h, w)) · M_k(h, w)

The resulting multi-scale fusion information with abnormal information suppressed acts directly on the corresponding decoder feature layer of the auto-encoder AE.
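One plausible reading of the suppression step can be sketched as follows: positions where the fused multi-scale feature disagrees with the teacher's normal feature (low cosine similarity) are gated toward zero with ReLU. The exact gating form is an assumption, not taken verbatim from the patent.

```python
# Hedged sketch of the MFF anomaly-suppression step: a per-position ReLU gate
# on the cosine similarity between fused features and teacher features.
import numpy as np

def suppress_anomalous(mk, tk, eps=1e-8):
    """mk, tk: (C, H, W). Gate = ReLU(cos(mk, tk)) per position, broadcast over C."""
    cos = (mk * tk).sum(0) / (np.linalg.norm(mk, axis=0) * np.linalg.norm(tk, axis=0) + eps)
    gate = np.maximum(cos, 0.0)                  # ReLU threshold on the similarity
    return mk * gate[None]                       # suppress dissimilar positions

t = np.ones((4, 2, 2))
m = t.copy(); m[:, 0, 0] = -1.0                  # one "anomalous" position
out = suppress_anomalous(m, t)
print(out[:, 0, 0])                              # gated to zero at that position
```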
Preferably, in S5 the segmentation network takes the feature information {T_1, T_2, T_3} obtained by passing the abnormal image through the teacher network and the feature information {S_D1, S_D2, S_D3} of the Decoder of the auto-encoder AE, and uses the difference of the feature maps between them as the input of the segmentation network.

In the segmentation network, the deeper information of the high-level semantic features is progressively transferred to the shallow features by an IFC module. The outputs F_k of the segmentation network's Encoder serve as the inputs of the IFC module. The output of each layer is convolved and then passed through a pooling layer; a CA attention block is introduced for the feature blocks of different scales, whose dimensions and channels remain unchanged through the CA attention module; and after the CA block the channel number is convolved and adjusted again. The output of the IFC module is obtained by combining these processed features.

Then, following the basic U-Lite structure, the processed multi-size feature information is fused with the corresponding Decoder feature information block to obtain the fused feature information.

Through the above process, the IFC transmits the deepest feature information upward layer by layer, so that each shallow feature contains feature information from the deeper layers; the multi-scale feature information output by the corresponding IFC is finally passed to the Decoder module of the U-Lite to complete the segmentation function, and the segmentation module outputs a segmentation mask Mask_pre of a specified size.
Preferably, the segmentation network is trained with a focal loss and an L1 loss.

Mask_pre and the anomaly mask Mask_GT are linearly resampled to the same size, where (i, j) denotes the pixel coordinates of the image:

L_focal = −Σ_{i,j} (1 − p_ij)^γ · log(p_ij)

L_l1 = Σ_{i,j} |Mask_pre(i, j) − Mask_GT(i, j)|

wherein γ is a hyper-parameter, p_ij is the linearly resampled prediction for the true class of pixel (i, j), L_focal is the focal loss, and L_l1 is the L1 loss.

Training of the segmentation network is completed with these two loss functions, and the trained model finally outputs a mask showing the per-pixel classification of the detected sample.
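The two training losses can be sketched as follows: the focal term down-weights easy pixels by the factor (1 − p)^γ, and the L1 term compares the predicted and ground-truth masks. Reducing by the mean and the eps stabilizer are assumptions for illustration:

```python
# Hedged numpy sketch of the segmentation losses: focal loss + L1 loss
# between a predicted per-pixel anomaly probability map and a binary mask.
import numpy as np

def focal_loss(p, mask_gt, gamma=2.0, eps=1e-8):
    """p: predicted anomaly probability per pixel; mask_gt in {0, 1}."""
    pt = np.where(mask_gt == 1, p, 1.0 - p)      # prob assigned to the true class
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt + eps)))

def l1_loss(p, mask_gt):
    return float(np.mean(np.abs(p - mask_gt)))

gt = np.array([[1.0, 0.0], [0.0, 1.0]])
good = np.array([[0.9, 0.1], [0.1, 0.9]])
bad = 1.0 - good
print(focal_loss(good, gt) < focal_loss(bad, gt))   # confident correct pixels cost less
```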
Compared with the prior art, the invention provides a feature-information-reconstruction industrial defect detection method based on multi-dimensional feature fusion, with the following beneficial effects:
The invention introduces a cropped auto-encoder AE to complete the reconstruction of the normal feature information of a defect sample; cropping the deepest convolutional layers greatly reduces the parameter count and computation of the network. An embedded memory module replaces the cropped convolutional layers, preventing performance degradation while enriching the feature information of normal industrial sample images and completing the reconstruction function of the reconstruction module. An information-correction mechanism is additionally introduced into the MFF module to prevent the reconstruction degradation caused by leakage of abnormal information during multi-dimensional feature reconstruction, and the normal feature information of the teacher network is used to guide the correct use of the multi-dimensional feature information.
Drawings
FIG. 1 is a flow chart of the detection method in embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of the detection network mentioned in embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of the IFC module of embodiment 1 of the present invention;
fig. 4 is a schematic diagram of a teacher network for feature fusion students according to embodiment 1 of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings; it is apparent that the described embodiments are only some, rather than all, of the embodiments of the invention.
The invention introduces a cropped auto-encoder AE to complete the reconstruction of the normal feature information of a defect sample; cropping the deepest convolutional layers greatly reduces the parameter count and computation of the network. An embedded memory module replaces the cropped convolutional layers, preventing performance degradation while enriching the feature information of normal industrial sample images and completing the reconstruction function of the reconstruction module. An information-correction mechanism is additionally introduced into the MFF module to prevent the reconstruction degradation caused by leakage of abnormal information during multi-dimensional feature reconstruction, and the normal feature information of the teacher network is used to guide the correct use of the multi-dimensional feature information. The details are as follows.
Example 1:
Referring to FIGS. 1-4, the feature-information-reconstruction industrial defect detection method based on multi-dimensional feature fusion of the present invention is applied to defect detection on the MVTec AD dataset as follows.
In the preparation phase, the images of the MVTec AD dataset are normalized with the dataset's mean and variance, and resized to 256×256 using bilinear interpolation.
Abnormal-image simulation: random two-dimensional Perlin noise is generated and binarized with a preset threshold to obtain an anomaly mask M. The anomaly image I_ca is generated by replacing the mask region with a linear combination of the anomaly-free image I_n and an arbitrary image A from an external data source.
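The anomaly-simulation step can be sketched as follows. Plain uniform noise stands in for the two-dimensional Perlin noise used by the method, and the blend factor beta is an assumption; the structure (binarize a noise field into a mask M, then blend an external image A into the masked region of the normal image I_n) follows the description above:

```python
# Hedged sketch of anomaly simulation: binarise a noise field into mask M,
# then blend an external image A into the masked region of a normal image.
# Uniform noise stands in for Perlin noise; thresh and beta are assumptions.
import numpy as np

def simulate_anomaly(i_n, a, noise, thresh=0.5, beta=0.7):
    """i_n, a, noise: (H, W) arrays in [0, 1]. Returns (i_ca, mask)."""
    m = (noise > thresh).astype(float)           # binary anomaly mask M
    blended = beta * a + (1.0 - beta) * i_n      # linear combination inside M
    return (1.0 - m) * i_n + m * blended, m

rng = np.random.default_rng(1)
i_n, a, noise = rng.random((3, 32, 32))
i_ca, mask = simulate_anomaly(i_n, a, noise)
print(np.array_equal(i_ca[mask == 0], i_n[mask == 0]))  # outside M unchanged
```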
The defect detection model is built, and the method comprises the following steps:
S1, building the cropped auto-encoder AE, namely the student network, specifically comprises the following steps:
With reference to the skip-connection (link) layer of the U-Net network, whose connection structure transfers feature information and thereby improves model performance, the proposed multi-scale-feature-fusion student-teacher network (see FIG. 4) uses the same connection structure to help the model transfer feature information once multi-scale feature fusion is completed.
In the auto-encoder AE, the encoder and the decoder are both composed of the first three blocks of an untrained ResNet18. The encoder can be denoted E_k, k ∈ {1, …, N}, the decoder D_k, k ∈ {1, …, N}; they are linked in the middle by an embedded memory module M. The image feature information F_k ∈ R^(C_k×H_k×W_k) represents the projection of the original data I into the embedding space, where C_k, H_k and W_k denote the number of channels, the height and the width of the k-th layer activation tensor, respectively.
S2, introducing a teacher network to guide the cropped auto-encoder AE to complete the reconstruction of normal feature information comprises the following steps:
The teacher network uses a ResNet18 model pre-trained on the ImageNet dataset and, for data alignment, uses the feature information tensors of its first three block layers for information guidance; cosine similarity is used as the KD loss for knowledge transfer in the T-S model, so as to accurately capture the relations in high- and low-dimensional information. The decoder feature tensors F_D^k are obtained through the auto-encoder AE. Referring to the student-network denoising mechanism of the DeSTSeg model, the normal image I_n yields the normal feature tensors F_T^k through the pre-trained teacher network ResNet18. Taking the channel C_k as the axis, the vector cosine similarity loss between F_D^k and F_T^k is used as the difference between normal and abnormal feature information, and reducing this similarity loss gives the auto-encoder AE the ability to reconstruct normal feature information from an abnormal image. The loss formula is:

d_k(h, w) = 1 − [F_D^k(h, w) · F_T^k(h, w)] / (‖F_D^k(h, w)‖ ‖F_T^k(h, w)‖)

L_rec(I_n, I_ca) = Σ_{k=1}^{N} (1 / (H_k · W_k)) Σ_{h=1}^{H_k} Σ_{w=1}^{W_k} d_k(h, w)

wherein d_k(h, w) is the vector cosine similarity loss, I_ca is the abnormal image, and L_rec(I_n, I_ca) is the similarity loss.
However, reconstructing normal feature information from abnormal feature information is a challenging task. Referring to TrustMAE, introducing additional memory information can help the decoder part of the student network better complete the feature-reconstruction task. The introduced embedded module therefore replaces the original deepest block of ResNet18, i.e. the 2× conv3×3 convolutions with C = 512.
S3, building the embedded memory module (defining a group of memories, addressing the memory group by similarity with the input features, and obtaining the reconstructed features from the similarities) comprises the following steps:
The input of the embedded module is the query feature z, and the embedded module defines a group of memories M = {m_i} used to produce the reconstructed feature ẑ. Similarity addressing is performed on the memory group with the input feature z:

ẑ = Σ_i ω_i · m_i

wherein m_i is an item of the memory group, i indexes the memories, w = {ω_i} is the set of similarities between the input feature and the memory items, and ω_i denotes the similarity between the input feature and the i-th memory item.

The similarity ω_i is calculated from the cosine distance:

ω_i = exp( z m_i^T / (‖z‖ ‖m_i‖) ) / Σ_j exp( z m_j^T / (‖z‖ ‖m_j‖) )

wherein z is the input feature and m_i^T is the transpose of the memory vector.

The memory items are thus recombined in proportion to their similarities, finally giving a processing procedure that takes the query feature as input and the recombined feature as output, the output being defined as ẑ.

The memory group M is obtained through a first round of model-distillation training. During this first round, the student network takes normal sample images I_n as input; the data passes through the student network Encoder to obtain normal feature information z_n; the memory group is updated internally as model parameters, storing dimensionality-reduced normal feature information as memory data; similarity addressing over the memory group yields the reconstructed feature ẑ_n; and fitting ẑ_n to z_n finally yields the memory group M of the embedded module.
S4, constructing the MFF module so that the encoder's multi-scale features guide the decoder's feature reconstruction comprises the following steps:
The MFF (Multi-scale feature fusion) module inputs are characteristic outputs from different blocks of the encoder AE In order to reduce the burden of an MFF module on the operand, notifying the patch information of the extracted features, pooling E 1,E2,E3, finishing the alignment of feature dimensions after processing, fusing the multi-scale feature information, taking E 1,E2,E3 after pooling as a feature information block, wherein the feature information block contains semantic information of a plurality of scales, fusing the information of the block features in the channel direction through 1X 1, and in order to check in, i.e. guiding features of the same level by the same level of semantic, avoiding the error of the semantic guidance, and carrying out channel adjustment on the feature information block in sequence, wherein the method comprises the steps ofCorresponding channel and feature size, M 3,M2,M1={pool(Dk, k.epsilon.3, 2, 1) } can be obtained due to the feature information of the self-encoder AEIf the information is directly guided, abnormal characteristic information leakage problem during characteristic information reconstruction can be generated, so that the MFF module introduces a difference information suppression function, wherein the difference information suppression function requires two input data, namely multi-scale characteristic information M 3,M2,M1 and characteristic information tensor of a teacher networkThe former has abnormal characteristic information, the latter only has characteristic information of normal images, the difference between the characteristic information represents abnormal information points, and the difference between the two is measured based on a cosine formula:
A threshold is generated with the activation function ReLU(·), and the transfer of anomalous information is suppressed by this threshold:
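A minimal sketch of the suppression step, per spatial location: the cosine distance between the fused (possibly anomalous) feature and the normal teacher feature flags disagreement, and a ReLU-style gate attenuates those locations. The exact gate form max(0, 1 − d) is an illustrative assumption; the extracted text does not show the precise formula:

```python
import math

def cosine_dist(a, b):
    # Cosine distance: 0 for identical directions, up to 2 for opposite.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1e-12
    nb = math.sqrt(sum(x * x for x in b)) or 1e-12
    return 1.0 - dot / (na * nb)

def suppress_anomalies(fused, teacher):
    # fused / teacher: lists of per-location channel vectors. Locations
    # where the fused features disagree with the normal teacher features
    # get a small gate value and are attenuated before reaching the decoder.
    out = []
    for f, t in zip(fused, teacher):
        d = cosine_dist(f, t)        # large where features differ
        gate = max(0.0, 1.0 - d)     # ReLU-style gate in [0, 1]
        out.append([gate * x for x in f])
    return out
```

In this sketch a location whose fused feature points the same way as the teacher feature passes through almost unchanged, while an orthogonal (anomalous) one is zeroed out.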
The resulting anomaly-suppressed multi-scale fusion information acts directly on the corresponding feature layers of the student decoder of the AE.
S5, constructing a lightweight segmentation network to complete the segmentation and localization of defects, wherein the method comprises the following steps:
An anomalous image is fed to the teacher network to obtain feature information {T_1, T_2, T_3} and to the decoder of the self-encoder AE to obtain feature information {S_D1, S_D2, S_D3}; the difference of the feature maps between them serves as the input of the segmentation network. Considering the cost in computational performance, before segmentation the data to be segmented is reduced in dimension by a ResNet18 residual block, lowering the channel dimension to 3.
Referring to fig. 3, in the segmentation network the deeper information of high-level semantic features is further transferred, step by step, to the shallow features through the IFC module, which completes a multi-path refinement function and resolves the information discrepancy introduced by the reconstructed multi-scale feature information. The outputs of the segmentation-network encoder serve as the inputs of the IFC module. Each layer's output is first convolved to compress the channels, and the feature size is then reduced by a pooling layer. A CA attention processing block is introduced for the feature blocks at different scales; the dimensions and channels of a feature block are unchanged before and after the CA attention module. To allow the subsequent fusion of multi-scale feature information, the channel number is adjusted by a further convolution after the CA block. The output of the IFC module is then obtained through the following formula:
Then, following the basic U-Lite structure, the processed multi-size feature information is fused with the corresponding Decoder feature information block to obtain the fused feature information
Through the above process, the IFC passes the deepest feature information upward layer by layer, so that each shallow feature contains feature information from the deeper layers below it. The multi-scale feature information output by the corresponding IFC is finally passed to the Decoder module of U-Lite, which completes the segmentation function; the segmentation module outputs a segmentation mask Mask_pre of a specified size.
The function of the segmentation network in the system is to segment the positions of the reconstructed features and the defect features in the image; that is, segmentation classifies the pixels of the image into normal pixels and abnormal pixels, finally completing defect localization for the defect sample.
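The per-pixel normal/abnormal classification above amounts to thresholding the segmentation network's score map; the threshold value 0.5 below is an illustrative assumption, not specified by the text:

```python
def binarize(anomaly_map, thresh=0.5):
    # Classify each pixel of the segmentation output as normal (0) or
    # defective (1) by thresholding its per-pixel score.
    return [[1 if v >= thresh else 0 for v in row] for row in anomaly_map]

mask = binarize([[0.9, 0.1], [0.4, 0.7]])
```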
The segmentation network optimizes segmentation training with a focal loss and an L1 loss. The focal loss helps the model focus on minority classes and difficult samples, while the L1 loss improves the sparsity of the output so that the boundary of the segmentation mask becomes clearer. Mask_pre and the anomaly mask Mask_GT are linearly sampled to the same size, where i, j denote the xy coordinates of the image:
wherein γ is a hyperparameter, p_ij is the linear sampling result, L_focal is the focal loss, and L_l1 is the L1 loss;
Training of the segmentation network is completed through these two loss functions, and the segmentation network model finally outputs a Mask showing the pixel classification of the detected sample.
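The two losses can be sketched directly from the definition of p_ij given in claim 7 (p_ij = Mask_GT·Mask_pre + (1 − Mask_GT)·(1 − Mask_pre), i.e. the probability assigned to the correct class). The focal form −(1 − p)^γ·log(p), γ = 2, and the mean reduction are standard choices assumed here for illustration:

```python
import math

def seg_losses(mask_pre, mask_gt, gamma=2.0, eps=1e-8):
    # mask_pre, mask_gt: flat lists of per-pixel values in [0, 1].
    # p = gt*pre + (1-gt)*(1-pre) is the probability assigned to the
    # correct class; the focal term down-weights easy pixels (p near 1),
    # and the L1 term encourages sparse, sharp-edged masks.
    n = len(mask_pre)
    focal = l1 = 0.0
    for pre, gt in zip(mask_pre, mask_gt):
        p = gt * pre + (1.0 - gt) * (1.0 - pre)
        focal += -((1.0 - p) ** gamma) * math.log(p + eps)
        l1 += abs(pre - gt)
    return focal / n, l1 / n
```

A perfect prediction drives both terms to zero, while uncertain pixels (p near 0.5) dominate the focal term, which is how the loss concentrates training on hard samples.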
Training phase:
adapting to the data distribution and obtaining the storage module:
traversing the normal images of the industrial samples to obtain the feature outputs of the student network and the teacher network; the teacher network guides the student network to reconstruct the feature information:
initializing the embedding module, and using the loss function L_m to make the output of the student network approach that of the teacher network, completing the construction of the embedding module:
The model is trained using the Adam optimizer: Adam(L_rec + L_m).
Reconstructing abnormal characteristic information:
traversing the normal images I_n and the anomalous images I_ca;
I_n and I_ca are used as the inputs of the teacher network and the student network, respectively, to obtain the image feature information;
taking the student encoder output as the MFF module input to obtain the multi-dimensional feature fusion information;
injecting the multi-dimensional feature fusion information into the student decoder module;
calculating the loss function:
The model is trained using the SGD optimizer: SGD(L_rec).
Training a segmentation network:
reconstructing the output feature information with the student network, taking the teacher output feature information, and combining the two:
The channels are adjusted using ResNet18 residual modules:
inputting the adjusted feature information into the segmentation network to obtain the output of the segmentation network:
Calculating the loss function and training the model, where Mask_GT represents the correct classification mask of the image:
L_focal = L_focal(Mask_pre, Mask_GT)
Adam(L_focal)
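The three training phases above can be summarized as a schedule. The losses and optimizer steps are stubbed as plain callables here, purely to illustrate the order and the loss each phase optimizes; the stub values are not from the source:

```python
def run_training(loss_rec, loss_m, loss_focal, step):
    # Phase 1: adapt to the data distribution and build the memory group
    # (optimize L_rec + L_m; the text uses Adam here).
    step("phase1/adam", loss_rec() + loss_m())
    # Phase 2: anomalous-feature reconstruction (optimize L_rec with SGD).
    step("phase2/sgd", loss_rec())
    # Phase 3: train the segmentation network (optimize L_focal with Adam).
    step("phase3/adam", loss_focal())

# Illustrative run with constant stub losses and a recording "optimizer".
history = []
run_training(lambda: 0.5, lambda: 0.1, lambda: 0.3,
             lambda name, loss: history.append((name, loss)))
```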
Reasoning:
The defect detection flow of the system comprises: inputting the image to be detected into the student-network-based AE {E_k, D_k}, the teacher network, and the segmentation network; traversing the images to be detected; and outputting the segmented image of each image to be detected. The flow specifically comprises the following steps:
preprocessing is completed by adjusting the size and the mean/variance of the image to be detected;
the image to be detected is passed as input to the teacher and student networks, and the student network completes the feature information reconstruction:
adjusting the channel number of the feature information using ResNet18 residual modules:
obtaining the defect segmentation image output of the image to be detected using the segmentation network Seg:
Finally, the system outputs the defect segmentation image of the image to be detected.
On the MVTec AD dataset, the defect detection method is compared with the conventional PatchCore, EfficientAD, RD, RD++, DRAEM and FastFlow methods; reference may be made to Table 1:
Table 1 Computational efficiency performance
As can be seen from the table, in the method of this embodiment the image-level AUROC (Area Under the Receiver Operating Characteristic Curve) averages 99.40% on the MVTec AD public industrial defect data, the pixel-level AUROC averages 98.25%, and the PRO (Per-Region Overlap) metric averages 95.44%. Moreover, the image-level defect detection performance reaches the SOTA level on the MPDD public industrial defect data.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art according to the technical scheme of the present invention and its inventive concept, within the scope disclosed by the present invention, shall be covered by the scope of protection of the present invention.

Claims (7)

1. A method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion, characterized by comprising the following steps:
S1, building a clipping-type self-encoder AE;
S2, introducing a teacher network to guide the clipping-type self-encoder AE to complete the reconstruction target of normal feature information;
S3, building an embedding module, defining a group of memories, performing similarity addressing on the memory group using the input features, and obtaining reconstructed features according to the similarity;
S4, constructing an MFF module so that the encoder's multi-scale features guide the decoder's feature reconstruction;
S5, constructing a lightweight segmentation network to complete the segmentation and localization of industrial sample defects.
2. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein in the clipping-type self-encoder AE of S1, the encoder and the decoder are both composed of the first three blocks of a ResNet18 without pre-training; the encoder is represented as E_k, k ∈ {1, …, N}, and the decoder as D_k, k ∈ {1, …, N}, linked in the middle by an embedded memory module M; the image feature information of each block represents a projection of the original data I into the embedding space, wherein C_k, H_k, W_k respectively represent the number of channels, the height and the width of the k-th layer activation tensor.
3. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein the teacher network of S2 uses a ResNet model pre-trained on the ImageNet dataset, conducting information guidance with the feature information tensors of the first three block layers and using cosine similarity as the KD loss; the feature information tensor of the decoder is obtained through the self-encoder AE, and the normal feature information tensor is obtained by passing the normal image I_n through the pre-trained teacher network; taking the channel C_k as the axis, the similarity loss between the two is computed as the difference between normal and anomalous feature information; by reducing this similarity loss, the self-encoder AE acquires the ability to reconstruct the normal feature information of an anomalous image; the similarity loss L_rec(I_n, I_ca) is calculated as follows:
wherein d_k(h, w) is the vector cosine similarity loss, and I_ca is the anomalous image.
4. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein the input of the embedding module in S3 is the feature information output by the encoder; the embedding module defines a group of internal memories for reconstructing the features, and similarity addressing is performed on the memory group using the input features:
wherein m_i is an item of the memory group, i is the number of memories, w is the set of similarities between the input feature and the memory group features, and ω_i represents the similarity between the input feature and the i-th memory feature;
the similarity ω_i is calculated by the cosine distance:
wherein the first term is the input feature, and the superscript T denotes the transpose of the vector;
the memory group is recombined from the features in different size proportions according to the magnitude of the similarity, finally giving a processing procedure that takes the query feature as input and the recombined feature as output, the output being defined as
the memory group M is obtained through a first round of model distillation training: during this first round, a normal sample image I_n is input to the self-encoder AE, and the data passes through the encoder to produce normal feature information; the memory group is updated internally as model parameters, dimension-reduced normal feature information is stored as memory data, and similarity addressing over the memory group yields reconstructed features; by fitting the reconstructed features to the encoder output, the memory group M embedded in the module is finally obtained.
5. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein the inputs of the MFF module in S4 are the feature outputs E_1, E_2, E_3 from different blocks of the encoder of the self-encoder AE; E_1, E_2, E_3 are pooled so that their feature dimensions are aligned, the pooled E_1, E_2, E_3 form a feature information block containing semantic information at multiple scales, information fusion along the channel direction is performed on the block features by a 1×1 convolution, and the feature information block is adjusted, in sequence, to the channel number and feature size corresponding to the decoder blocks, resulting in M_3, M_2, M_1 = {pool(D_k), k ∈ 3, 2, 1};
the MFF module further introduces a difference-information suppression function, measuring, based on a cosine formula, the difference between the multi-scale feature information M_3, M_2, M_1 and the feature information tensor of the teacher network:
a threshold is generated with the activation function ReLU(·), and the transfer of anomalous information is suppressed by this threshold:
the anomaly-suppressed multi-scale fusion information acts directly on the corresponding feature layers of the decoder of the self-encoder AE.
6. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 1, wherein in S5 an anomalous image is fed to the teacher network to obtain feature information {T_1, T_2, T_3} and to the decoder of the self-encoder AE to obtain feature information {S_D1, S_D2, S_D3}, and the difference of the feature maps between them serves as the input of the segmentation network;
in the segmentation network, the deeper information of high-level semantic features is transferred, step by step, to the shallow features through an IFC module; the outputs of the segmentation-network encoder serve as the inputs of the IFC module, each layer's output is first convolved and then passed through a pooling layer, a CA attention processing block is introduced for the feature blocks at different scales, the dimensions and channels of a feature block are unchanged before and after the CA attention module, and the channel number is adjusted by a further convolution after the CA block; the output of the IFC module is obtained through the following formula:
then, following the basic U-Lite structure, the processed multi-size feature information is fused with the corresponding Decoder feature information block to obtain the fused feature information
through the above process, the IFC passes the deepest feature information upward layer by layer, so that each shallow feature contains feature information from the deeper layers below it; the multi-scale feature information output by the corresponding IFC is finally passed to the Decoder module of U-Lite to complete the segmentation function, and the segmentation module outputs a segmentation mask Mask_pre of a specified size.
7. The method for detecting industrial defects by reconstructing feature information based on multi-dimensional feature fusion according to claim 6, wherein the segmentation network optimizes the segmentation training using a focal loss and an L1 loss;
Mask_pre and the anomaly mask Mask_GT are linearly sampled to the same size, where i, j denote the xy coordinates of the image:
p_ij = Mask_GT(i, j) × Mask_pre(i, j) + [1 − Mask_GT(i, j)] × [1 − Mask_pre(i, j)]
wherein γ is a hyperparameter, p_ij is the linear sampling result, L_focal is the focal loss, and L_l1 is the L1 loss;
training of the segmentation network is completed through these two loss functions, and the segmentation network model finally outputs a Mask showing the pixel classification of the detected sample.
CN202411276764.3A 2024-09-12 2024-09-12 Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion Active CN119130992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411276764.3A CN119130992B (en) 2024-09-12 2024-09-12 Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion

Publications (2)

Publication Number Publication Date
CN119130992A true CN119130992A (en) 2024-12-13
CN119130992B CN119130992B (en) 2025-08-05

Family

ID=93766861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411276764.3A Active CN119130992B (en) 2024-09-12 2024-09-12 Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion

Country Status (1)

Country Link
CN (1) CN119130992B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113434418A (en) * 2021-06-29 2021-09-24 扬州大学 Knowledge-driven software defect detection and analysis method and system
CN113572539A (en) * 2021-06-24 2021-10-29 西安电子科技大学 Storage-enhanced unsupervised spectral anomaly detection method, system, device, medium
CN114332007A (en) * 2021-12-28 2022-04-12 福州大学 A Transformer-based Industrial Defect Detection and Recognition Method
CN114742799A (en) * 2022-04-18 2022-07-12 华中科技大学 Industrial scene unknown type defect segmentation method based on self-supervision heterogeneous network
CN115641474A (en) * 2022-10-21 2023-01-24 华中科技大学 Unknown type defect detection method and device based on efficient student network
CN116597203A (en) * 2023-05-11 2023-08-15 电子科技大学 Knowledge distillation-based anomaly detection method for asymmetric self-encoder
CN116778269A (en) * 2023-05-30 2023-09-19 沈阳化工大学 A method for building a product surface defect detection model based on autoencoder reconstruction
CN116862894A (en) * 2023-07-26 2023-10-10 江南大学 An industrial defect detection method based on image restoration
CN116862885A (en) * 2023-07-14 2023-10-10 江苏济远医疗科技有限公司 Segmentation guide denoising knowledge distillation method and device for ultrasonic image lesion detection
CN117576535A (en) * 2024-01-15 2024-02-20 腾讯科技(深圳)有限公司 Image recognition method, device, equipment and storage medium
CN117593602A (en) * 2023-11-15 2024-02-23 上海大学 Knowledge distillation anomaly detection method based on asymmetric teacher student network
CN117635585A (en) * 2023-12-06 2024-03-01 华中科技大学 Texture surface defect detection method based on teacher-student network
US20240210329A1 (en) * 2022-12-27 2024-06-27 Zhejiang University Method for detecting abnormal defect on steel surface based on semi-supervised contrastive learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PENG, Shaohu: "A Domestic Trash Detection Model Based on Improved YOLOX", Sensors 2022, 22(18), 6974; https://doi.org/10.3390/s22186974, 3 August 2022 (2022-08-03) *
ZENG, Qilin; SHENG, Baoxin; SUN, Xu; WANG, Zhicheng; HUANG, Weibin: "Laser Non-Destructive Testing Technology for Aircraft Tires", Tire Industry, 10 February 2005 (2005-02-10) *
YAN, Ning: "Research on Deep-Learning-Based Defect Detection Algorithms for Industrial Products", Master's Thesis Electronic Journals, 15 May 2024 (2024-05-15) *

Also Published As

Publication number Publication date
CN119130992B (en) 2025-08-05

Similar Documents

Publication Publication Date Title
Qi et al. SGUIE-Net: Semantic attention guided underwater image enhancement with multi-scale perception
CN114677346B (en) Method for detecting end-to-end semi-supervised image surface defects based on memory information
CN117974693B (en) Image segmentation method, device, computer equipment and storage medium
Zhang et al. AIDEDNet: Anti-interference and detail enhancement dehazing network for real-world scenes
CN115439442A (en) Industrial product surface defect detection and positioning method and system based on commonality and difference
CN116645369A (en) Anomaly detection method based on twin self-encoder and two-way information depth supervision
Wu et al. AEKD: Unsupervised auto-encoder knowledge distillation for industrial anomaly detection
CN117635585A (en) Texture surface defect detection method based on teacher-student network
CN117173131B (en) Anomaly detection method based on distillation and memory-guided reconstruction
CN110930378A (en) Emphysema image processing method and system based on low data demand
CN117809123B (en) Anomaly detection and reconstruction method and system for double-stage image
CN117934425A (en) Image anomaly detection method based on self-supervision learning and knowledge distillation
Teng et al. Unsupervised learning method for underwater concrete crack image enhancement and augmentation based on cross domain translation strategy
CN116051382B (en) A data enhancement method based on deep reinforcement learning generative adversarial neural network and super-resolution reconstruction
CN117291850A (en) Infrared polarized image fusion enhancement method based on learnable low-rank representation
Zuluaga et al. Blind microscopy image denoising with a deep residual and multiscale encoder/decoder network
CN119648602A (en) Single image shadow removal method and computer device based on global context perception
CN118096601A (en) Image restoration method and system based on wavelet transformation and multi-scale residual error network
Zhao et al. Rethinking superpixel segmentation from biologically inspired mechanisms
Zhang et al. Multi morphological sparse regularized image super-resolution reconstruction based on machine learning algorithm
CN114862803A (en) A fast Fourier convolution-based anomaly detection method for industrial images
CN119206363A (en) Knowledge distillation and VAE-based multi-scale perception feature anomaly positioning method
CN119130992B (en) Industrial defect detection method based on feature information reconstruction based on multi-dimensional feature fusion
CN116342392B (en) Single remote sensing image super-resolution method based on deep learning
Gu et al. Unsupervised anomaly detection of industrial images based on dual generator reconstruction networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载