R-FCN knife switch detection method combining local features and global features
Technical Field
The invention relates to the technical field of disconnecting link (knife switch) detection in auxiliary monitoring systems of transformer substations, and in particular to an R-FCN knife switch detection method combining local features and global features.
Background
A transformer substation is an important transit node in high-voltage power transmission, and many critical switching and control devices operate inside it. In the past, when a substation suffered an outage, operators had to search for the fault device by device, and the duration of this fault finding determined when the substation could resume normal operation, seriously affecting the daily life of residents. The disconnecting link (knife switch) is an important power control switch of the substation, and its open or closed state directly determines whether the whole energized circuit operates. In the traditional investigation of a power failure in a substation, operators first have to enter the high-voltage area to check whether the switch is open and then inspect the other connected devices one by one, which easily exposes them to the danger of electric shock. With the development of unmanned substations, research on automatically detecting the state of the disconnecting link in captured images helps to quickly identify its current state, shortens the fault detection time of the substation, protects the life safety of operators and effectively improves the safe operation of the power grid.
Automatic detection of the disconnecting link has to deal with the following problems: the disconnecting link works outdoors among many other devices that are likewise metal products of similar colour, so it is not easy to distinguish; the same disconnecting link is affected by the weather and is recognised with different difficulty under different environments; and, because of the many voltage control devices and wires present outdoors and the elongated rectangular shape of the disconnecting link, the captured disconnecting link is often occluded by other equipment or its complete features cannot be acquired when an auxiliary monitoring system is deployed in a transformer substation.
Knife switch detection methods fall into two categories: 1) image-processing-based methods and 2) deep-learning-based methods. Constrained by the difficulty of acquiring knife switch data, most current methods are based on image processing: a knife switch template is first built from a standard image acquired at a fixed angle and position, and then an image processing algorithm such as feature point matching or template matching is used for localisation and state recognition. Such methods can achieve high monitoring accuracy over a short period under good weather conditions, but as the weather changes frequently, the camera position drifts away from the angle of the pre-established template image and the state of the knife switch can no longer be recognised, so the camera position has to be checked and adjusted manually or the template rebuilt on a regular basis. In addition, most current deep learning network models predict through a global receptive field, such as the Faster R-CNN series and the YOLO series of network models; when detecting the knife switch, if the knife switch is occluded or the captured object is incomplete, the lack of complete feature information of the knife switch leads to missed or false detections. Some deep learning network models predict through local features, such as the R-FCN network model and the ViT network model; although they can effectively detect an incomplete knife switch, when the knife switch fills the whole image, local feature prediction also produces missed detections.
Disclosure of Invention
In view of the above technical problems, the invention aims to provide a knife switch detection method combining local features and global features based on the region-based fully convolutional object detection network R-FCN (Object Detection via Region-based Fully Convolutional Networks): a global feature prediction branch is embedded into the R-FCN network model, and the advantages of local feature prediction and global feature prediction are combined to improve the accuracy of knife switch detection over existing knife switch detection methods.
To achieve this purpose, the technical scheme adopted by the invention is an R-FCN knife switch detection method combining local features and global features, comprising the following steps:
Step 1, building a transformer substation auxiliary monitoring system to collect knife switch images;
Step 2, dividing, cleaning and labeling the image data set;
Step 3, selecting the R-FCN trunk feature extraction network;
Step 4, adjusting the trunk feature extraction network of the R-FCN;
Step 5, constructing a local feature prediction branch;
Step 6, constructing a global feature prediction branch;
Step 7, fusing the local prediction result and the global prediction result;
Step 8, training and saving the model.
Further, the transformer substation auxiliary monitoring system in step 1 integrates all network monitoring cameras and related control equipment in the substation. By adjusting the angles of the cameras, image data sets containing the disconnecting link and image data sets not containing the disconnecting link are acquired at different times, in different weather and against different backgrounds; the number of images containing the disconnecting link should be balanced and cover both the open and closed states of the disconnecting link.
Further, in step 2 the data set is divided into images containing a knife switch and images not containing a knife switch, and the data set of knife switch images is divided into a training set and a test set at a ratio of 8:2. The data set is cleaned by removing blurred images from the collected data, and only the images containing a knife switch object are annotated.
Further, in step 3 the R-FCN trunk feature extraction network is selected. The classification network ResNet101 is adopted as the trunk feature extraction network; a region proposal network (RPN: Region Proposal Network) operates after the fourth group of convolution layers of ResNet101 to generate regions of interest (used for the subsequent knife switch detection); the pooling layer and fully connected layer after the fifth group of convolution layers of ResNet101 are discarded; and the number of finally output channels is 2048. A region of interest is a region in which a target is present.
Further, in step 4 the trunk feature extraction network of the R-FCN is adjusted: after the trimmed trunk feature extraction network ResNet101 selected in step 3, a convolution layer with a convolution kernel size of 1×1 is added to reduce the number of channels to 1024, which reduces the dimension of the feature data without changing the size of the feature map and improves the calculation speed.
Further, in step 5 a local feature prediction branch is constructed: the original R-FCN network model with local feature prediction is adopted to output the local prediction result, and each RPN proposal region is divided into 7×7 local regions for prediction.
Further, in step 6 a global feature prediction branch is constructed: a pooling operation is performed on the extracted semantic features to unify their sizes, and then convolution layers with kernel sizes of 7×7 and 1×1 are used in series to output the global prediction result.
Further, in step 7 the local prediction result and the global prediction result are fused. L2 regularization is applied to the prediction results output in step 5 and step 6, with the mathematical expression:

y_i = x_i / \sqrt{ \sum_{j=0}^{n} x_j^2 },  i = 0, 1, ..., n

wherein x and y are vectors, x = (x_0, x_1, x_2, ..., x_n) represents a predicted output result and y = (y_0, y_1, y_2, ..., y_n) represents the regularized predicted output result. Finally, the values of the two predicted output results are uniformly scaled to the range 0 to 1, the two predicted output results are added as vectors, and a Softmax operation is performed to output the final prediction result, the mathematical expression of Softmax being:

Softmax(r_i) = e^{r_i} / \sum_{j=0}^{n} e^{r_j}

where r_i is the i-th output value in the output result vector.
Further, in step 8 the model is trained using the two kinds of divided data sets: the first training uses only clear data containing the disconnecting link, and the second training uses data containing the disconnecting link together with data not containing the disconnecting link but similar to it, in a data quantity ratio of 1:1.
Further, in step 8 the model is trained, and the loss function used for training has the mathematical expression:

L(s, t) = L_{cls}(s_{c^*}) + \lambda [c^* > 0] L_{reg}(t, t^*)

where L(s, t) represents the total of the classification loss and the regression loss, s is the class prediction probability, s_{c^*} represents the predicted probability of class c^*, and L_{cls}(s_{c^*}) is the classification loss, with the mathematical expression:

L_{cls}(s_{c^*}) = -\log(s_{c^*})

λ is the multi-task balance factor; c^* is the real class label, where c^* = 0 represents the background class and c^* ≠ 0 represents a corresponding object class; [c^* > 0] is a guide (indicator) factor that takes the value 1 only for non-background objects, so that the regression box is adjusted for non-background objects only; L_{reg}(t, t^*) is the regression loss, t represents the target box predicted by the model, and t^* represents the manually annotated real box.
Compared with the prior art, the beneficial effect of the method is that, for knife switches whose features are partially occluded or that occupy a large part of the image, the R-FCN network combining local features and global features can still make correct predictions where the original R-FCN network based on local features alone cannot. The R-FCN knife switch detection method combining local features and global features effectively reduces the false detection rate and missed detection rate of knife switch detection in complex transformer substation scenes.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the following description is further given with reference to the drawings required in the embodiments.
FIG. 1 is a schematic diagram of the operation flow of the R-FCN knife switch detection method combining local features and global features.
FIG. 2 is a block diagram of an R-FCN model combining local features and global features of the present invention.
FIG. 3 is a block diagram of the local feature prediction of the present invention.
FIG. 4 is a block diagram of global feature prediction of the present invention.
Detailed Description
The following describes embodiments of the present invention in detail with reference to FIGS. 1-4.
As shown in FIG. 1, the R-FCN knife switch detection method combining local features and global features comprises the following steps:
Step 1: when the transformer substation auxiliary monitoring system collects the knife switch images, they are collected from different angles, with different fields of view and different backgrounds and under different natural conditions, so as to ensure the diversity of the collected pictures.
Step 2: the data set acquired in step 1 is cleaned to remove damaged image data and is divided into an image data set containing the knife switch and a data set that does not contain the knife switch but is similar to it. The acquired data set contains 800 images with a knife switch and 800 images that do not contain a knife switch but are similar to it. The knife switch images are divided into a training set and a test set at a ratio of 8:2, and only the data set of knife switch images is annotated with the position and state of the knife switch. The numbers of samples used for the first training are shown in Table 1 and those used for the second training in Table 2.
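As an illustration only, the following minimal Python sketch shows one possible way to carry out the 8:2 split and the separation of annotated knife switch images from confusable negatives described in step 2; the directory layout, file extensions and VOC-style .xml annotation files are assumptions, not part of the invention.

```python
import random
from pathlib import Path

def split_dataset(image_dir, train_ratio=0.8, seed=0):
    """Annotated knife-switch images are split 8:2 into train/test sets;
    unannotated images form the 'similar but no knife switch' negative pool
    used in the second training pass (assumed layout: one .xml per labelled .jpg)."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    labelled = [p for p in images if p.with_suffix(".xml").exists()]
    negatives = [p for p in images if not p.with_suffix(".xml").exists()]

    random.seed(seed)
    random.shuffle(labelled)
    cut = int(len(labelled) * train_ratio)
    return labelled[:cut], labelled[cut:], negatives

# train_set, test_set, negative_pool = split_dataset("dataset/substation")
```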
Step 3: the R-FCN trunk feature extraction network is selected. The classification network ResNet101 is adopted as the trunk feature extraction network; the RPN operates after the fourth group of convolution layers of ResNet101 to generate regions of interest (used for the subsequent knife switch detection); the pooling layer and fully connected layer after the fifth group of convolution layers of ResNet101 are discarded; and the number of finally output channels is 2048, as shown in FIG. 2.
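A minimal PyTorch sketch of one way to build such a trimmed ResNet101 backbone is given below, assuming a recent torchvision; keeping stride 16 in the fifth group via dilation follows the original R-FCN paper and is an assumption, as the patent does not state it.

```python
import torch
import torch.nn as nn
import torchvision

class TrimmedResNet101(nn.Module):
    """Step 3 backbone: ResNet101 with the average-pooling and fully connected
    layers after the fifth convolution group removed. The conv4 feature map
    (1024 channels) is what the RPN consumes to propose regions of interest;
    the conv5 feature map has 2048 channels."""
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet101(
            weights=None,                                        # step 8 would start from ImageNet weights
            replace_stride_with_dilation=[False, False, True])   # keep stride 16 in conv5 (R-FCN convention)
        self.stem_to_conv4 = nn.Sequential(
            resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,
            resnet.layer1, resnet.layer2, resnet.layer3)          # convolution groups 1-4
        self.conv5 = resnet.layer4                                # fifth group, 2048 output channels

    def forward(self, x):
        c4 = self.stem_to_conv4(x)   # fed to the region proposal network (RPN)
        c5 = self.conv5(c4)          # 2048-channel semantic features for the prediction branches
        return c4, c5

# c4, c5 = TrimmedResNet101()(torch.randn(1, 3, 600, 800))
```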
Step 4: after the trimmed trunk feature extraction network ResNet101 selected in step 3, a convolution layer with a convolution kernel size of 1×1 is added to reduce the number of channels to 1024, which reduces the feature data dimension without changing the size of the feature map, as shown in FIG. 2.
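A one-line sketch of this channel-reduction layer, assuming the 2048-channel conv5 feature map produced by the backbone sketch above:

```python
import torch.nn as nn

# Step 4: a 1x1 convolution reduces the 2048-channel conv5 output to 1024 channels;
# the spatial size of the feature map is unchanged, only the channel dimension shrinks.
reduce_dim = nn.Conv2d(2048, 1024, kernel_size=1)

# feats = reduce_dim(c5)   # (N, 1024, H, W) with the same H and W as c5
```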
Step 5: a local feature prediction branch is constructed. The original R-FCN network model with local feature prediction is adopted to output the local prediction result, and each RPN proposal region is divided into 7×7 local regions for prediction, as shown in FIG. 3.
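The following sketch illustrates the position-sensitive local prediction of the original R-FCN with k = 7, using torchvision's PSRoIPool; the number of classes (knife switch plus background), the 1024 input channels and the 1/16 feature stride are assumptions chosen to match the sketches above.

```python
import torch
import torch.nn as nn
from torchvision.ops import PSRoIPool

class LocalPredictionBranch(nn.Module):
    """Step 5 local branch: each RPN proposal is divided into a 7x7 grid,
    every grid cell is scored from its own position-sensitive score map, and
    the cell scores are averaged (voted) into the class score, so a partially
    occluded knife switch can still be recognised from the visible cells."""
    def __init__(self, in_channels=1024, num_classes=2, k=7, spatial_scale=1 / 16):
        super().__init__()
        # k*k position-sensitive score maps per class (knife switch + background)
        self.score_maps = nn.Conv2d(in_channels, k * k * num_classes, kernel_size=1)
        self.ps_pool = PSRoIPool(output_size=k, spatial_scale=spatial_scale)

    def forward(self, feats, rois):
        maps = self.score_maps(feats)       # (N, k*k*C, H, W)
        pooled = self.ps_pool(maps, rois)   # (num_rois, C, k, k): one cell from each score map
        return pooled.mean(dim=(2, 3))      # vote over the 7x7 grid -> (num_rois, C)

# rois: float tensor (num_rois, 5) as (batch_index, x1, y1, x2, y2) in image coordinates
# local_scores = LocalPredictionBranch()(feats, torch.tensor([[0., 48., 60., 320., 240.]]))
```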
Step 6: a global feature prediction branch is constructed. A pooling operation is performed on the extracted semantic features to unify their sizes, and then convolution layers with kernel sizes of 7×7 and 1×1 are used in series to output the global prediction result, as shown in FIG. 4.
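One possible reading of this step is sketched below: the size-unifying pooling is interpreted as an RoI pooling of each proposal to 7×7 (an assumption), after which the 7×7 convolution covers the whole pooled region so that the score depends on the complete feature of the proposal; the intermediate channel count and the ReLU are likewise illustrative assumptions.

```python
import torch.nn as nn
from torchvision.ops import RoIPool

class GlobalPredictionBranch(nn.Module):
    """Step 6 global branch (sketch): pool each proposal to a unified 7x7 size,
    then apply a 7x7 convolution (whole-region receptive field) and a 1x1
    convolution in series to output the global class prediction."""
    def __init__(self, in_channels=1024, mid_channels=512, num_classes=2, spatial_scale=1 / 16):
        super().__init__()
        self.roi_pool = RoIPool(output_size=7, spatial_scale=spatial_scale)
        self.predict = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size=7),  # 7x7 -> 1x1
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, num_classes, kernel_size=1))

    def forward(self, feats, rois):
        pooled = self.roi_pool(feats, rois)     # (num_rois, in_channels, 7, 7)
        return self.predict(pooled).flatten(1)  # (num_rois, num_classes)

# global_scores = GlobalPredictionBranch()(feats, rois)
```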
Step 7: the local prediction result and the global prediction result are fused. The two prediction results are regularized, uniformly scaled to the same numerical interval and added as vectors to complete the fused information prediction.
Further, the mathematical expression of the regularization is:

y_i = x_i / \sqrt{ \sum_{j=0}^{n} x_j^2 },  i = 0, 1, ..., n

wherein x and y are vectors, x = (x_0, x_1, x_2, ..., x_n) represents a predicted output result and y = (y_0, y_1, y_2, ..., y_n) represents the regularized predicted output result. Finally, the values of the two predicted output results are uniformly scaled to the range 0 to 1, the two predicted output results are added as vectors, and a Softmax operation is performed to output the final prediction result, the mathematical expression of Softmax being:

Softmax(r_i) = e^{r_i} / \sum_{j=0}^{n} e^{r_j}

where r_i is the i-th output value in the output result vector.
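A minimal sketch of this fusion follows; interpreting the 0-to-1 rescaling as per-vector min-max scaling is an assumption, as the patent only states that both results are scaled to the same 0-to-1 range.

```python
import torch
import torch.nn.functional as F

def fuse_predictions(local_scores, global_scores, eps=1e-12):
    """Step 7 fusion: L2-normalise the class-score vector of each branch
    (y_i = x_i / ||x||_2), rescale both to [0, 1], add them as vectors and
    apply Softmax to obtain the final class probabilities per proposal."""
    local_n = F.normalize(local_scores, p=2, dim=1, eps=eps)
    global_n = F.normalize(global_scores, p=2, dim=1, eps=eps)

    def to_unit_range(v):
        vmin = v.min(dim=1, keepdim=True).values
        vmax = v.max(dim=1, keepdim=True).values
        return (v - vmin) / (vmax - vmin + eps)

    fused = to_unit_range(local_n) + to_unit_range(global_n)
    return torch.softmax(fused, dim=1)

# final_probs = fuse_predictions(local_scores, global_scores)
```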
Step 8: the model is trained. The trunk feature extraction network ResNet101 is initialized with parameters pre-trained on the ImageNet data set, and the loss function used for model training is the same as that of the R-FCN network. The first training uses only the data containing knife switch images and runs until the loss function converges, after which the network model is saved. The second training continues from the model parameters of the first training and uses the data containing a knife switch together with the data not containing a knife switch but similar to it, to further strengthen the ability of the network model to distinguish the knife switch. The test results of the obtained model are shown in Table 3. A sketch of this two-pass schedule is given below.
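The sketch below only outlines the two-pass schedule; the data loaders, optimizer settings, epoch counts and the assumption that the model returns its total loss in training mode are all illustrative and not specified by the patent.

```python
import torch

def train_two_stage(model, stage1_loader, stage2_loader, epochs=(30, 30), lr=1e-3):
    """Two-pass training of step 8.
    Stage 1: clear images containing a knife switch only, until convergence.
    Stage 2: continue from the stage-1 weights with a 1:1 mix of knife-switch
    images and confusable images without a knife switch."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9, weight_decay=5e-4)
    for stage, (loader, num_epochs) in enumerate(zip((stage1_loader, stage2_loader), epochs), start=1):
        for _ in range(num_epochs):
            for images, targets in loader:
                loss = model(images, targets)   # assumed: model returns the total loss in training mode
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        torch.save(model.state_dict(), f"rfcn_knife_switch_stage{stage}.pth")
```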
Further, in step 8 the model is trained, and the loss function used for training has the mathematical expression:

L(s, t) = L_{cls}(s_{c^*}) + \lambda [c^* > 0] L_{reg}(t, t^*)

where L(s, t) represents the total of the classification loss and the regression loss, s is the class prediction probability, s_{c^*} represents the predicted probability of class c^*, and L_{cls}(s_{c^*}) is the classification loss, with the mathematical expression:

L_{cls}(s_{c^*}) = -\log(s_{c^*})

λ is the multi-task balance factor; c^* is the real class label, where c^* = 0 represents the background class and c^* ≠ 0 represents a corresponding object class; [c^* > 0] is a guide (indicator) factor that takes the value 1 only for non-background objects, so that the regression box is adjusted for non-background objects only; L_{reg}(t, t^*) is the regression loss, t represents the target box predicted by the model, and t^* represents the manually annotated real box.
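A sketch of this multi-task loss is shown below; using the smooth-L1 form for L_{reg} follows the original R-FCN and is an assumption here, since the patent does not spell out its exact form.

```python
import torch
import torch.nn.functional as F

def detection_loss(class_scores, pred_deltas, gt_labels, gt_deltas, lam=1.0):
    """L(s, t) = L_cls(s_{c*}) + lam * [c* > 0] * L_reg(t, t*):
    cross-entropy gives -log(s_{c*}); the box regression loss is applied only
    to proposals whose ground-truth label c* is not background."""
    cls_loss = F.cross_entropy(class_scores, gt_labels)   # L_cls(s_{c*}) = -log(s_{c*})
    foreground = gt_labels > 0                             # the [c* > 0] guide factor
    if foreground.any():
        reg_loss = F.smooth_l1_loss(pred_deltas[foreground], gt_deltas[foreground])
    else:
        reg_loss = class_scores.new_zeros(())
    return cls_loss + lam * reg_loss
```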
Table 1 is the statistics of the number of samples of the first training set.
Table 2 is the statistics of the number of samples of the second training set.
Table 3 shows the comparison of the accuracy of the R-FCN knife switch detection combining the local feature and the global feature.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the spirit and scope of the invention as defined by the appended claims and their equivalents.