CN119180988A - A visual recognition method and device based on computer processing - Google Patents
A visual recognition method and device based on computer processing
- Publication number
- CN119180988A (application CN202411219159.2A)
- Authority
- CN
- China
- Prior art keywords
- wafer
- defect
- image
- model
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/52—Scale-space analysis, e.g. wavelet analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/36—Applying a local operator, i.e. means to operate on image points situated in the vicinity of a given point; Non-linear local filtering operations, e.g. median filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/06—Recognition of objects for industrial automation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Biodiversity & Conservation Biology (AREA)
- Testing Or Measuring Of Semiconductors Or The Like (AREA)
- Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)
Abstract
The invention provides a visual recognition method and device based on computer processing. The method comprises: S1, collecting picture data from a wafer production line, and preprocessing and labeling the collected picture data to obtain a wafer defect detection data set; S2, constructing a feature extraction and feature enhancement network for the data set obtained in S1 to obtain a multi-level feature map; S3, designing a wafer defect detection model based on the multi-level feature map obtained in S2 to obtain a semiconductor wafer defect detection model; S4, constructing a training strategy combining adversarial collaborative learning based on the model obtained in S3, and training to obtain the final wafer defect detection model; S5, deploying the wafer defect detection model obtained in S4 on the vision module of production process equipment to detect and feed back defect problems in real time. The invention has the beneficial effect of significantly improving detection accuracy and robustness.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a multi-task collaborative visual detection method for semiconductor wafer defects.
Background
With the rapid development of semiconductor technology, the types of defects occurring in wafer fabrication are becoming more complex and diverse. These defects not only affect wafer yield but also have profound effects on the performance and reliability of the final product. Accurate defect detection is therefore critical to ensuring high-quality wafer fabrication. However, as process technology advances, conventional inspection methods increasingly struggle with small and complex defects and can no longer meet modern manufacturing requirements. There is thus an urgent need for more advanced inspection technologies that can cope with growing inspection demands while ensuring production efficiency and product quality.
Existing PCB and semiconductor wafer defect detection mainly relies on three kinds of methods: template matching-based methods, image segmentation-based methods, and machine learning-based methods:
(1) Template matching-based method:
This method detects defects by pixel-level comparison of the wafer image under inspection with a standard defect-free wafer image. Although simple to implement, it has the following disadvantages:
a) Image alignment requirements are extremely strict, and small deviations can cause false detections;
b) Its ability to detect non-repetitive defects is poor;
c) Computational complexity is high, making it difficult to meet the real-time requirements of online inspection.
(2) Image segmentation-based method:
This method first segments the wafer image and then analyzes the features of the segmented regions to identify defects. Its major drawbacks include:
a) Segmentation algorithms lack robustness and are easily affected by image noise and illumination variation;
b) Detection of micro defects and low-contrast defects is poor;
c) It is difficult to effectively distinguish defects from normal structural variation.
(3) A machine learning based method:
Such methods use traditional machine learning algorithms (e.g., SVM, random forest) to classify extracted image features. Although an improvement over the former two approaches, the following problems remain:
a) The feature engineering relies on manual design, so that complex defect features are difficult to comprehensively capture;
b) The generalization capability is limited, and the adaptability to the type of the defects which are not found is poor;
c) It is difficult to achieve both high accuracy and high efficiency detection.
These limitations show that existing semiconductor wafer defect detection methods suffer from insufficient precision, low efficiency, and weak generalization when handling complex and diverse wafer defects. In particular, at advanced process nodes, traditional methods struggle with increasingly miniaturized and diversified defect types and cannot satisfy modern semiconductor manufacturing's demand for high-precision, high-efficiency defect detection. A new semiconductor wafer defect detection method is therefore needed to overcome these drawbacks and realize high-precision, high-efficiency, and robust detection.
Disclosure of Invention
To solve the above problems, the invention provides a visual recognition method for computer processing that effectively realizes high-precision, high-efficiency wafer defect detection and shows particularly strong adaptability and robustness in the presence of complex and diverse defect types.
A visual recognition method for computer processing, the method comprising:
S1, acquiring original semiconductor wafer image data from a real semiconductor wafer production line; designing center-cropping, denoising, and data-enhancement preprocessing methods for the acquired original image data to ensure the quality and smoothness of the image data; and performing label arbitration to obtain a complete wafer defect detection data set;
S2, constructing a dynamic multi-scale feature extraction and feature enhancement network of the wafer defect image according to the wafer defect detection data set obtained in the S1, and obtaining a multi-level feature map containing local details and global semantic information of the wafer defect image;
S3, designing a multi-task collaborative wafer defect detection model based on the multi-level feature map of wafer defects to obtain classification, regression, and circular boundary predictions of wafer defects, thereby realizing visual detection of semiconductor wafer defects;
S4, constructing a training strategy combining adversarial collaborative learning based on the semiconductor wafer defect detection model, and performing multi-stage progressive training to obtain the final wafer defect detection model;
S5, model deployment: deploying the wafer defect detection model obtained in S4 on the vision module of semiconductor wafer production process equipment, so as to detect and feed back in real time whether produced semiconductor wafers have defects.
In the visual recognition method described above, the collection and construction of the data set in S1 includes:
(1) Raw data acquisition
Semiconductor wafer defect image data are acquired by installing high-precision industrial cameras on a semiconductor wafer production line. An ace-series industrial camera from Basler (Germany; model acA4112-20um) is used, with a 4112 × 3008 pixel resolution and an acquisition speed of up to 20 frames per second. The camera is mounted above the wafer transport track and works with the light source system to ensure that a clear image of the wafer surface is captured.
To ensure image quality, an LED ring light source (model CCS LDR2-70SW2-LA1) provides uniform and stable illumination. During image acquisition, triggering of the camera and light source is precisely controlled by a PLC (programmable logic controller) to ensure that each wafer is completely captured.
The raw image data are saved in 16-bit TIFF format, with 65,536 gray levels per pixel, to preserve high dynamic range and detail. Each image file is about 24 MB, with the naming convention "wafer_YYYYMMDD_HHMMSS_<sequence number>.tiff", where YYYYMMDD is the date, HHMMSS is the time, and the sequence number is the order of acquisition on that day.
Data diversity and representativeness: to ensure the diversity and representativeness of the data set, wafer samples were collected from different production lots and different process stages, specifically including:
Wafers of different sizes: 6-inch, 8-inch, and 12-inch;
Wafers of different process nodes: 28 nm, 14 nm, and 7 nm;
Wafers at different process stages: after photolithography, etching, and polishing;
Different defect types: scratch, crack, flaking, and foreign-object attachment;
Different defect severities: ranging from mild to severe.
A total of 100,000 raw wafer images were acquired, covering the various defect types as well as defect-free samples. Data analysis shows that the defect classes are essentially balanced within the data set, with defect-free samples accounting for about 60% to reflect the frequency of defects in actual production. The acquired raw image data are denoted dataset.
(2) Data preprocessing
ROI region cropping and its effect on model accuracy: considering the circular nature of the wafer, a circular ROI (region of interest) cropping algorithm is designed:
First, the wafer edge is detected with the Hough circle transform to obtain the wafer's center coordinates and radius. Second, taking the detected center as the circle center, a circular region with a radius slightly smaller than the detected radius (98%) is taken as the ROI. Finally, pixel values outside the ROI are set to 0 while the original image information inside the ROI is retained. This cropping effectively removes interference from the wafer edge and focuses the model's attention on defects in the central region.
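A minimal sketch of this cropping step, assuming an OpenCV pipeline; the Hough-transform parameter values and the 16-bit-to-8-bit normalization are illustrative assumptions, not values fixed by the method:

```python
import cv2
import numpy as np

def crop_wafer_roi(img, radius_scale=0.98):
    """Detect the wafer edge with a Hough circle transform, then zero out
    all pixels outside 98% of the detected radius."""
    gray = img if img.ndim == 2 else cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray8 = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    blurred = cv2.medianBlur(gray8, 5)  # suppress noise before circle detection
    circles = cv2.HoughCircles(
        blurred, cv2.HOUGH_GRADIENT, dp=1.5,
        minDist=gray8.shape[0],          # expect a single wafer per image
        param1=100, param2=50,
        minRadius=gray8.shape[0] // 4, maxRadius=gray8.shape[0] // 2)
    if circles is None:
        return img                       # fall back to the full frame
    cx, cy, r = np.round(circles[0, 0]).astype(int)
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    cv2.circle(mask, (cx, cy), int(r * radius_scale), 255, thickness=-1)
    return cv2.bitwise_and(img, img, mask=mask)  # pixels outside the ROI become 0
```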
Image noise reduction: wafer images generally contain Gaussian white noise and speckle noise, caused by thermal noise in the sensor readout circuits of the industrial camera and by high-frequency image components. To reduce the influence of this noise on image quality and subsequent processing, image denoising is required.
First, Gaussian white noise is removed by Gaussian filtering, which smooths the image by convolving it with a Gaussian kernel. The Gaussian kernel is computed as:
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
where σ is the standard deviation of the Gaussian kernel, and x and y are the offsets of a pixel from the filter center; these offsets determine the value of the Gaussian distribution function and hence the weight assigned to each pixel. Convolving the image with this kernel effectively reduces Gaussian white noise.
Data enhancement: to increase data diversity and improve model generalization, the following augmentation techniques are adopted:
Random rotation: the image is randomly rotated within [−10°, 10°];
Random scaling: the image is randomly scaled within [0.9, 1.1];
Random translation: the image is randomly shifted within ±5% of its width and height;
Random brightness and contrast adjustment: brightness within [0.8, 1.2] and contrast within [0.8, 1.2];
Random Gaussian noise: Gaussian noise with mean 0 and standard deviation in [0, 0.05] is added;
Random horizontal and vertical flips: each applied with 50% probability.
These augmentations are implemented with the Albumentations library and applied in real time during training, as sketched below. Each original image yields 5 enhanced images, expanding the data set to 500,000 images. This enhancement strategy significantly improves the model's ability to adapt to varied lighting conditions and defect morphologies.
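A sketch of such a pipeline, assuming the Albumentations Python library; the probabilities and the noise-variance mapping are illustrative, and exact argument names can vary across Albumentations versions:

```python
import albumentations as A

# Augmentations mirroring the ranges listed above.
augment = A.Compose([
    A.Rotate(limit=10, p=0.5),                      # rotation in [-10°, 10°]
    A.RandomScale(scale_limit=0.1, p=0.5),          # scaling in [0.9, 1.1]
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.0,
                       rotate_limit=0, p=0.5),      # translation within ±5%
    A.RandomBrightnessContrast(brightness_limit=0.2,
                               contrast_limit=0.2, p=0.5),
    A.GaussNoise(var_limit=(0.0, (0.05 * 255) ** 2), p=0.5),  # σ ≤ 0.05 on a [0, 1] scale
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))

# Applied per sample during training; bounding boxes are transformed alongside pixels.
out = augment(image=image, bboxes=bboxes, labels=labels)
```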
The image data after the data preprocessing is recorded as data.
(3) Label arbitration
The collected raw image data are manually labeled to provide accurate training data for subsequent defect detection tasks. The industrial-grade labeling tool LabelImg is used; it offers an intuitive user interface and rich functionality, letting the user draw bounding boxes on the image to annotate target objects and attach label information to each box. Labeling follows these rules:
Strict pixel-level labeling: annotators accurately draw the bounding box of each defect region on the image to ensure labeling accuracy and precision;
Defect classification, regression, and circular boundaries are used as labels. The classification label, denoted label_cls, corresponds to five defect types (no defect, scratch, crack, flaking, foreign-object adhesion), where the value of each dimension represents the confidence score for that defect type. The regression label, denoted label_reg, contains the bounding-box coordinates of the target and is used to accurately locate the defect. The circular boundary label, denoted label_cir, gives circle parameters closely tied to the defect boundary, further improving boundary-localization accuracy;
Each image may contain zero or more defect labels; annotators must label every defect region actually present in the image;
Storage of labeling results: results are saved in PASCAL VOC format, which can be conveniently integrated with and processed by the wafer defect detection model.
The labeling process requires labeling personnel to have certain expertise and experience so as to ensure the accuracy and usability of labeling results.
(4) Data set generation
To prepare for model training, evaluation, and testing, the labeled data set (data, label) is divided into a training set, a validation set, and a test set in an 8:1:1 ratio: 80% of the data forms the training set, 10% the validation set, and 10% the test set.
This partitioning ensures sufficient data for parameter adjustment and optimization during training, while evaluation on the validation set and final performance verification on the test set after training safeguard the model's generalization and validity. A minimal split sketch follows.
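A minimal sketch of the 8:1:1 split, assuming samples are (data, label) pairs; the fixed seed is an illustrative choice for reproducibility:

```python
import random

def split_dataset(samples, seed=42):
    """Shuffle and split (data, label) pairs into train/validation/test at 8:1:1."""
    rng = random.Random(seed)
    samples = samples[:]              # avoid mutating the caller's list
    rng.shuffle(samples)
    n = len(samples)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return (samples[:n_train],                      # training set
            samples[n_train:n_train + n_val],       # validation set
            samples[n_train + n_val:])              # test set
```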
In the visual recognition method described above, the construction of the dynamic multi-scale feature extraction and feature enhancement network in S2 includes:
This scheme proposes a dynamic multi-scale feature enhancement network for wafer images that breaks through the limitation of traditional single-scale feature extraction. Through multi-scale feature extraction and enhancement combined with a dynamic weight allocation mechanism, the network captures local details and global semantics simultaneously, markedly improving the richness and robustness of the feature representation. In particular, the module that dynamically adjusts feature weights highlights key information and suppresses irrelevant noise, strengthening the model's adaptability in complex production-line environments. The network design also attends to computational efficiency: through feature reuse and a lightweight structure, it balances performance and real-time operation, making it suitable for semiconductor wafer production-line equipment. The approach not only improves defect-detection accuracy but also enhances system robustness under varied visual interference.
The input wafer image is processed from the bottom layer to the top layer: feature maps {C2, C3, C4, C5} at different scales are first extracted, and the multi-scale feature enhancement network is then constructed as follows:
Starting from the deepest layer, upper-level wafer feature maps are generated by upsampling, and the high-level semantic information is passed to the top of the multi-scale feature enhancement network:
M5=C5
where M5 is the top-most feature map in the multi-scale feature enhancement network, taken from the deepest feature map C5; a 1×1 convolution is then applied to M5 to adjust the number of channels:
P5=Conv1×1(M5)
The rest of the multi-scale feature enhancement network is then constructed from top to bottom:
M4=C4+Upsample(P5)
P4=Conv3×3(M4)
where Conv3×3 is a 3×3 convolution used to fuse features of different scales, and Upsample is an upsampling operation implemented by deconvolution. These steps are repeated to progressively generate {P3, P4, P5}; together, P3, P4, and P5 form the multi-scale feature enhancement network fused with multi-scale information;
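A compact sketch of this top-down construction, assuming a PyTorch implementation; the 1×1 lateral convolutions on C3 and C4 and the backbone channel counts are added assumptions needed to make the element-wise additions well-typed:

```python
import torch
import torch.nn as nn

class MultiScaleEnhancement(nn.Module):
    """Top-down construction: M5 = C5, P5 = Conv1x1(M5),
    M_i = C_i + Upsample(P_{i+1}), P_i = Conv3x3(M_i)."""
    def __init__(self, c3_ch=512, c4_ch=1024, c5_ch=2048, out_ch=256):
        super().__init__()
        self.reduce5 = nn.Conv2d(c5_ch, out_ch, 1)                 # P5 = Conv1x1(M5)
        self.lat4 = nn.Conv2d(c4_ch, out_ch, 1)                    # channel-matching lateral
        self.lat3 = nn.Conv2d(c3_ch, out_ch, 1)
        self.up = nn.ConvTranspose2d(out_ch, out_ch, 2, stride=2)  # deconvolution upsampling
        self.smooth4 = nn.Conv2d(out_ch, out_ch, 3, padding=1)     # P4 = Conv3x3(M4)
        self.smooth3 = nn.Conv2d(out_ch, out_ch, 3, padding=1)     # P3 = Conv3x3(M3)

    def forward(self, c3, c4, c5):
        p5 = self.reduce5(c5)              # M5 = C5, then 1x1 channel adjustment
        m4 = self.lat4(c4) + self.up(p5)   # M4 = C4 + Upsample(P5)
        p4 = self.smooth4(m4)
        m3 = self.lat3(c3) + self.up(p4)   # M3 = C3 + Upsample(P4)
        p3 = self.smooth3(m3)
        return p3, p4, p5
```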
A dynamic feature-weight allocation module is integrated into the network to enhance adaptability to multi-scale features and to learn correlations between features at different positions in the wafer feature map. By highlighting important features and suppressing irrelevant ones, the mechanism first generates a spatial attention sub-module, i.e. an attention weight for each location, and then combines it with the original feature map to obtain an enhanced feature representation. The specific steps are:
The input of the feature-weight dynamic allocation module is a wafer feature map X with dimensions (C × H × W). Two 1×1 convolution kernels are applied to X for channel compression, producing compressed wafer feature maps A and B with dimensions (C′ × H × W):
A = W_a · X   (C′ × H × W)
B = W_b · X   (C′ × H × W)
where W_a and W_b are the weights of the two 1×1 convolution kernels. The similarity between positions is computed from A and B to generate the spatial attention map M with dimensions (HW × HW):
M = Softmax(Aᵀ · B)
Element-wise multiplication of the original input X with M then gives the wafer feature map X′ weighted by the spatial attention sub-module:
X′ = X · M
Finally, X′ is added back to X and passed through a convolution layer:
O = γ · X′ + X · W_c
where γ is a learnable scaling factor that scales the enhancement effect of the spatial attention sub-module, W_c is a 1×1 convolution kernel weight, and O is the final wafer feature map output by the feature-weight dynamic allocation module.
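A sketch of this module, assuming a PyTorch implementation; the channel-reduction ratio C′ = C/8 is an illustrative assumption:

```python
import torch
import torch.nn as nn

class DynamicFeatureWeight(nn.Module):
    """Position-attention block following the equations above:
    A = W_a·X, B = W_b·X, M = Softmax(Aᵀ·B), X' = X·M, O = γ·X' + W_c·X."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        c_ = channels // reduction
        self.conv_a = nn.Conv2d(channels, c_, 1)        # W_a (1x1, channel compression)
        self.conv_b = nn.Conv2d(channels, c_, 1)        # W_b
        self.conv_c = nn.Conv2d(channels, channels, 1)  # W_c
        self.gamma = nn.Parameter(torch.zeros(1))       # learnable scaling factor γ

    def forward(self, x):
        n, c, h, w = x.shape
        a = self.conv_a(x).flatten(2)                     # (N, C', HW)
        b = self.conv_b(x).flatten(2)                     # (N, C', HW)
        m = torch.softmax(a.transpose(1, 2) @ b, dim=-1)  # (N, HW, HW) attention map M
        x_att = (x.flatten(2) @ m).view(n, c, h, w)       # X' = X · M
        return self.gamma * x_att + self.conv_c(x)        # O = γ·X' + W_c·X
```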
In the visual recognition method described above, the construction of the multi-task collaborative wafer defect detection model in S3 includes:
In defect detection, classification and regression are two interrelated but usually separately handled sub-tasks. Traditional methods model and optimize the two tasks independently, lacking knowledge fusion and mutual reinforcement between them. To solve this problem, the invention designs a Multi-task Collaborative Detection Algorithm (MCDA).
The key of the MCDA is a multi-task collaborative coding mechanism that fuses classification and regression information. The model cascades the feature maps of the two sub-tasks through convolution layers so that each encodes information from the other, yielding classification and regression features that capture their bidirectional relation. In addition, the MCDA introduces a circular boundary prediction branch: by cascading the classification and regression predictions, circle parameters tied to the defect boundary can be predicted more finely, improving boundary-localization accuracy. Compared with traditional separate modeling, the MCDA better mines the intrinsic relations among classification, regression, and shape, improving the consistency and accuracy of overall detection. The details are as follows:
The feature map O output by the dynamic multi-scale feature extraction and feature enhancement network is fed into the MCDA, which simultaneously completes classification, regression, and circular boundary prediction:
cls_pred, reg_pred, cir_pred = MCDA(O)
where cls_pred, reg_pred, and cir_pred are the classification, regression, and circular boundary prediction results for wafer defect detection, respectively.
The specific calculation process of the MCDA is as follows:
1) Apply a base convolution to O to obtain the initial predictions {cls_init, reg_init}:
cls_init, reg_init = Conv(O)
2) Relation-encode reg_init with cls_init to obtain the regression features reg_fused that fuse their relation:
reg_fused = reg_init + Conv(Concat(reg_init, cls_init))
3) Likewise, relation-encode cls_init with reg_init to obtain the classification features cls_fused:
cls_fused = cls_init + Conv(Concat(cls_init, reg_init))
4) Convolve reg_fused and cls_fused separately to yield the final classification prediction cls_pred and regression prediction reg_pred:
cls_pred = Conv(cls_fused)
reg_pred = Conv(reg_fused)
5) Concatenate cls_pred and reg_pred and, through an additional convolution branch, predict the circle parameters cir_pred closely tied to the defect boundary:
cir_pred = Conv(Concat(cls_pred, reg_pred))
Through this multi-task collaborative coding, classification and regression knowledge are fully fused so that the two tasks reinforce each other, while the circular boundary prediction branch describes the shape and location of wafer defects more finely. A sketch of the head follows.
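A sketch of the MCDA head, assuming a PyTorch implementation; the channel widths, the (x, y, w, h) box and (cx, cy, r) circle parameterizations, and the five-class output are assumptions consistent with the text:

```python
import torch
import torch.nn as nn

class MCDAHead(nn.Module):
    """Multi-task collaborative detection head following steps 1)-5) above."""
    def __init__(self, channels=256, num_classes=5):
        super().__init__()
        self.base = nn.Conv2d(channels, 2 * channels, 3, padding=1)      # step 1
        self.fuse_reg = nn.Conv2d(2 * channels, channels, 3, padding=1)  # step 2
        self.fuse_cls = nn.Conv2d(2 * channels, channels, 3, padding=1)  # step 3
        self.head_cls = nn.Conv2d(channels, num_classes, 1)              # step 4
        self.head_reg = nn.Conv2d(channels, 4, 1)                        # box (x, y, w, h)
        self.head_cir = nn.Conv2d(num_classes + 4, 3, 1)                 # circle (cx, cy, r)

    def forward(self, o):
        cls_init, reg_init = self.base(o).chunk(2, dim=1)                         # step 1
        reg_fused = reg_init + self.fuse_reg(torch.cat([reg_init, cls_init], 1))  # step 2
        cls_fused = cls_init + self.fuse_cls(torch.cat([cls_init, reg_init], 1))  # step 3
        cls_pred = self.head_cls(cls_fused)                                       # step 4
        reg_pred = self.head_reg(reg_fused)
        cir_pred = self.head_cir(torch.cat([cls_pred, reg_pred], 1))              # step 5
        return cls_pred, reg_pred, cir_pred
```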
In existing object detection methods, common loss functions such as Focal Loss and CIoU Loss optimize only single-task objectives, without accounting for interaction and knowledge guidance across tasks. To solve this problem, the invention proposes a multi-task collaborative loss function.
The innovation of the loss function is to integrate three parts of classification, regression and circular boundary so as to realize joint optimization of three closely related subtasks of classification, regression and shape. Compared with the traditional single loss function, the multi-task cooperative loss function fully utilizes the correlation among the subtasks, realizes the mutual promotion among the tasks, and simultaneously integrates the guidance of priori knowledge, so that the model benefits from additional knowledge transfer in the optimization process, and the overall performance and the robustness of detection are improved.
Therefore, the invention adopts a multi-task cooperative loss function, and simultaneously optimizes three tasks of classification, regression and circular boundary prediction, namely:
L = λ1·L_cls + λ2·L_reg + λ3·L_cir
where L_cls is the classification loss (Focal Loss), L_reg is the regression loss (CIoU Loss), L_cir is the circular boundary loss (Polygon Loss), and λ1, λ2, λ3 are hyperparameters balancing the loss terms.
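A sketch of the combined loss, assuming a PyTorch implementation. The focal and CIoU terms use existing torchvision operators; the Polygon Loss is not a standard library loss, so a simple L1 loss on the circle parameters stands in for it here, and the λ values are illustrative:

```python
import torch.nn.functional as F
from torchvision.ops import sigmoid_focal_loss, complete_box_iou_loss

def multitask_loss(cls_pred, reg_pred, cir_pred,
                   label_cls, label_reg, label_cir,
                   lam1=1.0, lam2=1.0, lam3=0.5):
    """L = λ1·L_cls + λ2·L_reg + λ3·L_cir for matched predictions and targets."""
    l_cls = sigmoid_focal_loss(cls_pred, label_cls, reduction='mean')     # Focal Loss
    l_reg = complete_box_iou_loss(reg_pred, label_reg, reduction='mean')  # CIoU Loss, (x1, y1, x2, y2) boxes
    l_cir = F.l1_loss(cir_pred, label_cir)   # stand-in for the Polygon Loss on (cx, cy, r)
    return lam1 * l_cls + lam2 * l_reg + lam3 * l_cir
```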
Under the optimization of the multi-task collaborative loss function, the detection network outputs three predictions: the classification prediction cls_pred, the regression prediction reg_pred, and the circular boundary prediction cir_pred.
The classification prediction cls_pred is a vector corresponding to four defect types (scratch, crack, flaking, foreign-object adhesion), where each dimension's value is the confidence score for that defect type; the defect type in the current prediction box is determined by the index of the highest score;
The regression prediction reg_pred contains the bounding-box coordinates of the target and is used to accurately locate the defect;
The circular boundary prediction cir_pred gives circle parameters closely tied to the defect boundary, further improving boundary-localization accuracy.
After the three prediction results are obtained, the defect type can be judged, and accurate position and shape information is combined, so that accurate detection of various defects of the wafer is realized, and an important basis is provided for subsequent quality control and defect repair.
In the visual recognition method described above, step S4 includes:
The labeled training data set is input into the model, which is trained with an innovative adversarial collaborative learning strategy. The strategy performs multi-stage progressive training that fully accounts for the local detail and global semantic information in the wafer image data and their dynamic relation during fusion, improving model performance through multi-scale temporal consistency and deep collaborative optimization.
(1) Multi-scale temporal consistency loss:
To strengthen the model's learning from data batches at different times, a multi-scale temporal consistency loss is proposed. It measures prediction consistency across different time scales:
L_tc = Σ_s λ_s · D_KL(P(Y|X_t) ‖ P(Y|X_{t−s}))
where s indexes the time scales, λ_s is the corresponding weight, D_KL is the KL divergence, and P(Y|X_t) is the detection distribution at time step t, aggregating classification, regression, and circular boundary detections into one overall detection result. This loss encourages the model to keep its predictions consistent across time scales, improving its modeling of data from different time batches.
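A sketch of this loss over a buffer of per-step classification logits, assuming a PyTorch implementation; the scales, weights, and the choice to detach the earlier predictions are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def temporal_consistency_loss(logits_history, scales=(1, 2, 4), weights=(1.0, 0.5, 0.25)):
    """L_tc = Σ_s λ_s · D_KL(P(Y|X_t) || P(Y|X_{t-s})) over a rolling buffer of logits."""
    p_t = F.softmax(logits_history[-1], dim=1)          # current detection distribution
    loss = logits_history[-1].new_zeros(())
    for s, lam in zip(scales, weights):
        if len(logits_history) > s:
            log_p_prev = F.log_softmax(logits_history[-1 - s].detach(), dim=1)
            # F.kl_div(log_q, p) computes KL(p || q), matching the formula's direction
            loss = loss + lam * F.kl_div(log_p_prev, p_t, reduction='batchmean')
    return loss
```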
(2) Adversarial training strategy:
To improve training robustness, an adversarial training strategy is introduced. Specifically, a discriminator D is designed that attempts to distinguish whether a feature comes from local detail or from global semantic information. The fusion network F is then trained to fool the discriminator:
L_adv = E[log D(F(X_v, X_a))] + E[log(1 − D(X_v)) + log(1 − D(X_a))]
where X_v and X_a are the local-detail and global-semantic inputs, respectively. This adversarial training forces the network to produce fused feature representations that are harder to tell apart and more tightly integrated.
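One conventional way to realize this min-max objective, assuming a PyTorch implementation and a discriminator ending in a sigmoid; the exact update direction for the fusion network is an interpretation of the formula above:

```python
import torch
import torch.nn.functional as F

def adversarial_losses(disc, fused, x_v, x_a):
    """Discriminator loss (maximize L_adv): score fused features as 1 and the
    local-detail (x_v) / global-semantic (x_a) features as 0. Fusion loss:
    drive the discriminator's score on fused features toward 0 to fool it."""
    d_fused = disc(fused.detach())                       # detach: this term updates D only
    d_v, d_a = disc(x_v.detach()), disc(x_a.detach())
    d_loss = (F.binary_cross_entropy(d_fused, torch.ones_like(d_fused))
              + F.binary_cross_entropy(d_v, torch.zeros_like(d_v))
              + F.binary_cross_entropy(d_a, torch.zeros_like(d_a)))
    g_score = disc(fused)                                # no detach: gradients reach F
    g_loss = F.binary_cross_entropy(g_score, torch.zeros_like(g_score))
    return d_loss, g_loss
```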
(3) Curriculum learning and difficulty adaptation
Building on the adversarial strategy in (2), a curriculum learning strategy based on sample difficulty is proposed. The sample difficulty D(x) is defined as:
D(x) = 1 − exp(−γ · (w_v·L_v + w_a·L_a))
where L_v and L_a are the local-detail and global-semantic losses, respectively, and γ is a tunable parameter. During training, the proportion of difficult samples is gradually increased according to a schedule of the form
p_hard(t) = min(1, p_0 + μ · t / T)
where t is the current training step, T is the total number of training steps, p_0 is the initial difficult-sample proportion, and μ controls the rate of difficulty increase.
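A sketch of these two quantities, assuming the reconstructed linear schedule above; p0, μ, γ, and the weights are illustrative values:

```python
import math

def hard_sample_ratio(t, total_steps, p0=0.2, mu=2.0):
    """p_hard(t) = min(1, p0 + μ·t/T): difficult-sample share grows over training."""
    return min(1.0, p0 + mu * t / total_steps)

def sample_difficulty(loss_v, loss_a, w_v=0.5, w_a=0.5, gamma=1.0):
    """D(x) = 1 − exp(−γ (w_v·L_v + w_a·L_a)), mapping losses to difficulty in [0, 1)."""
    return 1.0 - math.exp(-gamma * (w_v * loss_v + w_a * loss_a))
```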
(4) Dynamic batch normalization
To handle variability in illumination and noise conditions on semiconductor wafer production lines, a Dynamic Batch Normalization (DBN) technique is proposed. DBN dynamically adjusts the normalization parameters according to the input's statistics:
y = γ(x) · (x − μ(x)) / (σ(x) + ε) + β(x)
where γ(x) and β(x) are input-dependent scaling and offset parameters, and μ(x) and σ(x) are the input's mean and standard deviation. This adapts the model to shifts in feature distribution under different input conditions.
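A sketch of such a layer, assuming a PyTorch implementation; predicting γ(x) and β(x) from globally pooled features is an assumption, since the text does not fix the conditioning network:

```python
import torch
import torch.nn as nn

class DynamicBatchNorm2d(nn.Module):
    """y = γ(x)·(x − μ(x))/(σ(x) + ε) + β(x) with input-conditioned affine terms."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.affine = nn.Linear(channels, 2 * channels)    # predicts γ(x) and β(x)

    def forward(self, x):
        mu = x.mean(dim=(0, 2, 3), keepdim=True)           # per-channel batch mean μ(x)
        sigma = x.std(dim=(0, 2, 3), keepdim=True)         # per-channel batch std σ(x)
        pooled = x.mean(dim=(2, 3))                        # (N, C) global summary of x
        gamma, beta = self.affine(pooled).chunk(2, dim=1)
        gamma = gamma.unsqueeze(-1).unsqueeze(-1) + 1.0    # start near the identity map
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return gamma * (x - mu) / (sigma + self.eps) + beta
```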
(5) Model training
Based on the method, the specific flow of model training is as follows:
1) Initialize the model parameters θ, where θ denotes the parameters to be updated, comprising all weight matrices and bias vectors in the wafer defect detection model;
2) For each training step t:
a. Sample a batch from the data set containing difficult samples in proportion p_hard(t);
b. Forward-propagate to obtain the multi-scale features and predictions;
c. Compute the multi-scale temporal consistency loss L_tc;
d. Execute the adversarial training step, updating the discriminator D and the fusion network F;
e. Apply dynamic batch normalization;
f. Compute the total loss L_total = λ_tc·L_tc + λ_adv·L_adv;
g. Backpropagate and update the model parameters: θ ← θ − η·∇_θ L_total, where η is the learning rate.
3) Repeat step 2), and terminate training when the loss fails to decrease for N consecutive epochs. The final model parameters saved at the end of training serve as the deployment model for wafer defect detection.
In the visual recognition method described above, the deployment of the wafer defect detection model in S5 includes:
(1) Hardware platform selection and configuration:
a. Selecting suitable edge computing equipment, such as the NVIDIA Jetson series or Intel NUC high-performance embedded systems, according to the actual requirements of the semiconductor wafer production line;
b. Configuring the necessary deep learning frameworks and dependency libraries, such as CUDA and cuDNN, on the selected hardware platform to support efficient operation of the model;
c. Optimizing hardware resource allocation, and reasonably setting GPU memory use limit and CPU thread number to balance detection performance and system stability.
(2) Model integration and interface development:
a. Designing and realizing a model reasoning interface, which comprises an image preprocessing function module, a model reasoning function module and a result post-processing function module;
b. developing a communication interface with a production line control system to realize real-time transmission and feedback of detection results;
c. and a caching mechanism is constructed, so that data stream processing is optimized, and the influence of I/O operation on the detection speed is reduced.
(3) Real-time image acquisition and pretreatment:
a. the wafer image is acquired in real time through a high-speed industrial camera, so that the image quality and the acquisition frequency are ensured to meet the detection requirement;
b. Implementing an image preprocessing pipeline, including center cropping, denoising, and data enhancement, to improve input image quality;
c. and an asynchronous processing mechanism is adopted, and preprocessing is performed while the image is acquired, so that the computing resource is utilized to the maximum extent.
(4) Defect detection and result output:
a. Inputting the preprocessed image into a deployed wafer defect detection model, and executing reasoning operation;
b. Parsing the model output to extract defect type, bounding-box coordinates, and circle parameters;
c. Screening and grading the detection results against preset thresholds to reduce false positives and missed detections.
(5) And (3) visualizing and storing detection results:
a. developing a real-time visual interface, and intuitively displaying detection results, wherein the detection results comprise defect types, boundary frame coordinate information and circular parameter information;
b. Implementing local storage of detection results, including the original image, the detection results, and related data;
(6) Linkage and feedback of the production line:
a. Transmitting the detection result to a production line control system in real time for automatic decision making;
b. Developing an alarm mechanism, and timely notifying related personnel and triggering emergency response of a production line when a serious defect is detected;
c. And the correlation analysis of the detection result and the production parameter is realized, and data support is provided for process optimization.
(7) Performance evaluation and continuous optimization:
a. Establishing a periodic performance evaluation mechanism, including statistical analysis of key indexes of detection accuracy, recall and processing speed;
b. developing an automatic test flow, and evaluating the performance of the model in an actual production environment by using a standard test set;
c. And continuously optimizing model parameters and deployment strategies according to the performance evaluation result and the production feedback, and continuously improving the overall performance of the detection system.
The invention also provides a visual recognition device for computer processing, which uses the visual recognition method.
Compared with the prior art, the invention has the following beneficial effects:
(1) Multitasking collaborative optimization:
The invention introduces a multi-task collaborative optimization strategy in wafer defect detection, and integrates classification, regression and circular boundary prediction tasks into an integral frame. Conventional detection methods often focus on a single task, such as performing only defect classification or region segmentation, resulting in limitations and deviations in the detection results. Through the multi-task collaborative optimization, the method and the system can utilize the relevance and complementarity among different tasks to improve the overall performance of the model. Specifically, the cooperative mechanism enables the model to improve the detection capability of tiny and edge defects while maintaining the accuracy when processing complex and diversified defects, and remarkably enhances the comprehensiveness and robustness of detection.
(2) Dynamic multi-scale feature extraction and enhancement:
The present invention innovatively proposes a dynamic multi-scale feature extraction and enhancement network architecture to address the different scale defects on semiconductor wafers. Conventional inspection methods typically employ fixed-scale feature extraction, which tends to lose critical information in the face of complex wafer defects, particularly when dealing with defects of varying sizes and morphologies. According to the invention, by introducing a dynamic multi-scale feature extraction network, the extraction scale can be dynamically adjusted according to the geometric characteristics and the position of the defect, so that the global structure and the local detail of the defect are fully captured in the multi-scale feature fusion process. Meanwhile, the feature enhancement mechanism further optimizes the expression capability of key features, so that the model can accurately identify fine defects under a complex background, and the accuracy and the robustness of detection are improved.
(3) Adversarial collaborative learning training strategy:
To further improve the model's generalization, the invention adopts a distinctive adversarial collaborative learning training strategy. In this strategy, the model collaboratively optimizes multiple tasks while simulating the various complex conditions of the actual production environment during training. By introducing adversarial factors into the multi-stage progressive training, the model can effectively cope with different types of wafer defects and maintain a high level of inspection accuracy even when facing unknown or highly challenging defects. The strategy markedly strengthens the model's resistance to various interference factors, giving it greater robustness and reliability in practical applications.
Drawings
FIG. 1 is a diagram of a wafer image dynamic multi-scale feature enhancement network;
Fig. 2 is an overall block diagram of semiconductor wafer defect detection.
Detailed Description
FIG. 1 shows the dynamic multi-scale feature enhancement network for wafer images, which integrates the feature-weight dynamic allocation module to enhance adaptability to multi-scale features and to learn correlations between features at different positions in the wafer feature map. By highlighting important features and suppressing irrelevant ones, the mechanism first generates a spatial attention sub-module, i.e. an attention weight for each location, and then combines it with the original feature map to obtain an enhanced feature representation.
FIG. 2 is the overall block diagram of semiconductor wafer defect detection. First, original semiconductor wafer image data are collected from a real semiconductor wafer production line and preprocessed, including center cropping, denoising, and data enhancement, to ensure image quality, yielding a complete wafer defect detection data set. Second, a dynamic multi-scale feature extraction and feature enhancement network is constructed for this data set, generating a multi-level feature map containing local details and global semantic information. A multi-task collaborative wafer defect detection model is then designed on top of the multi-level feature map, outputting classification, regression, and circular boundary predictions of wafer defects and realizing visual detection of semiconductor wafer defects. Finally, progressive training with the adversarial collaborative learning strategy yields an efficient wafer defect detection model, which is deployed into the vision module of the semiconductor wafer production equipment for real-time defect detection and feedback.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.
While the foregoing describes the embodiments of the present invention, it should be understood that the present invention is not limited to the embodiments, and that various modifications and changes can be made by those skilled in the art without any inventive effort.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411219159.2A CN119180988A (en) | 2024-09-02 | 2024-09-02 | A visual recognition method and device based on computer processing |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411219159.2A CN119180988A (en) | 2024-09-02 | 2024-09-02 | A visual recognition method and device based on computer processing |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119180988A (en) | 2024-12-24 |
Family
ID=93900659
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411219159.2A Pending CN119180988A (en) | 2024-09-02 | 2024-09-02 | A visual recognition method and device based on computer processing |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119180988A (en) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119380166A (en) * | 2024-12-30 | 2025-01-28 | 杭州宇泛智能科技股份有限公司 | Safety monitoring method based on hybrid large model and neural network algorithm |
| CN119619175A (en) * | 2025-02-12 | 2025-03-14 | 北京嘉海鼎盛科技有限公司 | A semiconductor testing equipment intelligent calibration method and system |
| CN119762733A (en) * | 2025-03-10 | 2025-04-04 | 合肥综合性国家科学中心能源研究院(安徽省能源实验室) | Wire rope defect identification and positioning system and method based on real-time target detection |
| CN119963555A (en) * | 2025-04-10 | 2025-05-09 | 广东索鲁达科技有限公司 | Method and system for detecting wafer defects based on attention pyramid changes |
| CN120451176A (en) * | 2025-07-14 | 2025-08-08 | 杭州映图智能科技有限公司 | AI-based bottle cap 360-degree full-view defect high-speed detection method and system |
| CN120451176B (en) * | 2025-07-14 | 2025-09-05 | 杭州映图智能科技有限公司 | AI-based bottle cap 360-degree full-view defect high-speed detection method and system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |