CN118608964B - Transmission line arrester detection method based on infrared image and rotating target frame - Google Patents
Transmission line arrester detection method based on infrared image and rotating target frame
Info
- Publication number
- CN118608964B (application CN202410752339.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- lightning arrester
- target frame
- module
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention discloses a transmission line lightning arrester detection method based on an infrared image and a rotating target frame. The method comprises: collecting infrared images of the transmission line lightning arrester with an infrared camera and establishing an initial data set; enlarging the data volume of the initial data set by copy-paste data enhancement to establish an extended data set; applying image enhancement preprocessing to the images in the extended data set to improve contrast and edge definition, and labeling the rotating target frame of the lightning arrester; training the constructed lightning arrester detection network with the labeled data set to obtain an optimal lightning arrester detection model; and inputting the infrared image of the transmission line to be detected, after the same image enhancement preprocessing, into the optimal lightning arrester detection model to obtain the lightning arrester confidence and rotating target frame information. The method can automatically identify lightning arresters in infrared images of transmission lines, and solves the technical problems of poor infrared image quality and heavy background interference within a horizontal target frame.
Description
Technical Field
The invention relates to the technical field of infrared image target detection, and in particular to a transmission line lightning arrester detection method based on an infrared image and a rotating target frame.
Background
The lightning arrester is an essential component of the power transmission line. Being in operation for long periods and exposed to environmental factors, it is prone to various faults, which commonly manifest as abnormal heating of the whole device or of a part of it; temperature monitoring of lightning arresters is therefore an important part of transmission line safety monitoring. Because infrared thermal imaging is non-contact, long-range, all-weather and immune to electromagnetic interference, infrared temperature measurement is the main means of monitoring the temperature of power equipment. At present, temperature monitoring of transmission line lightning arresters mainly relies on manually identifying targets in infrared images before reading their temperatures. Such manual identification is subjective, time-consuming, inefficient and poorly adaptable.
In recent years, the development of deep-learning-based target detection has made automatic identification of lightning arresters in infrared images possible. However, existing automatic detection methods for transmission line lightning arresters mainly use horizontal target frames, and the detected frame contains a large background region, which interferes with the temperature monitoring of the lightning arrester. Meanwhile, infrared images have low resolution, poor definition and a low signal-to-noise ratio, so image quality is poor. High-precision automatic identification of transmission line lightning arresters therefore requires solving two problems: poor infrared image quality and heavy background interference within the horizontal target frame.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a transmission line arrester detection method based on an infrared image and a rotating target frame, which can effectively realize automatic identification of transmission line arresters in infrared images, provides a basis for automatic temperature monitoring of transmission line arresters, and solves the technical problems of poor infrared image quality and heavy background interference within a horizontal target frame.
In order to achieve the above purpose, the technical scheme provided by the invention is a method for detecting transmission line lightning arresters based on an infrared image and a rotating target frame, comprising the following steps:
step 1, an infrared camera is used for collecting infrared images of a lightning arrester of a power transmission line, and an initial data set is established;
Step 2, increasing the data volume of the initial data set by using a copy-paste data enhancement mode, and establishing an extended data set;
Step 3, performing image enhancement pretreatment on the images in the extended data set, improving the contrast and edge definition of the images, and marking a rotating target frame of the lightning arrester;
step 4, training the constructed lightning arrester detection network by using the marked data set to obtain an optimal lightning arrester detection model; the lightning arrester detection network is an improved YOLOv5 network, in which the feature extraction network, the feature fusion network and the detection head of the YOLOv5 network are each improved; the improvement of the feature extraction network is that an AngleConv module and an improved CBAM attention module, which adopts grouped spatial attention and adds an input-output direct connection, are added to the feature extraction network; the improvement of the feature fusion network is that 1/8-to-1/4 scale feature fusion is added to the top-down feature fusion part, 1/4-to-1/8 scale feature fusion is added to the bottom-up feature fusion part, the 1/16-to-1/32 scale feature fusion is deleted, the Concat operation of the bottom-up feature fusion part is replaced by an adaptive feature fusion module, and Conv modules connecting the top-down and bottom-up feature fusion paths are added to adjust the channel number of the input features of the adaptive feature fusion module; the improvement of the detection head is that its three detection scales are changed from 1/8, 1/16, 1/32 to 1/4, 1/8, 1/16, the horizontal target frame detection head is improved to a rotating target frame detection head, its width and height predictions are changed to long-side and short-side predictions of the rotating target frame, and 180 angle-class predictions are added, corresponding to the 180° angle range of the rotating target frame at 1° intervals;
And step 5, after the image enhancement preprocessing of step 3, inputting the infrared image of the transmission line to be detected into the optimal lightning arrester detection model to obtain the lightning arrester confidence and rotating target frame information.
Further, in step 2, the copy-paste data enhancement mode specifically includes:
reading all polygonal paste blocks in the annotation file of the target source image, initializing an empty list to record the positions of paste blocks already pasted into the new image, recording w = image width and h = image height, and performing the following operations on each polygonal paste block in turn:
a. Record the minimum transverse coordinate of the current polygonal paste block as x_min, the maximum transverse coordinate as x_max, the minimum longitudinal coordinate as y_min, and the maximum longitudinal coordinate as y_max;
b. Randomly generate a transverse translation tx and a longitudinal translation ty for the current polygonal paste block such that tx and ty satisfy the boundary constraints -x_min < tx < w - x_max and -y_min < ty < h - y_max;
c. Judge whether the minimum enclosing horizontal rectangle of the current polygonal paste block, translated by (tx, ty), overlaps the minimum enclosing horizontal rectangle of any paste block already pasted into the new image; if it overlaps, regenerate the translation (tx, ty) and repeat the check until no overlap occurs;
d. Paste the current polygonal paste block onto the background image according to the translation (tx, ty), and record in the list its position (x_min+tx, x_max+tx, y_min+ty, y_max+ty) in the new image, where the four values are respectively the minimum transverse, maximum transverse, minimum longitudinal and maximum longitudinal coordinates of the paste block in the new image.
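The placement loop of steps a-d can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name `place_patch` and the fixed retry limit are assumptions (the patent simply regenerates the translation until no overlap occurs):

```python
import random

def place_patch(poly_xs, poly_ys, w, h, placed, max_tries=50):
    """Randomly translate one polygon patch into a w*h image while
    avoiding overlap with already-placed patches (steps a-d).
    poly_xs/poly_ys: polygon vertex coordinates in the source image.
    placed: list of (x_min, x_max, y_min, y_max) boxes already pasted.
    Returns the new bounding box, or None if no position was found."""
    x_min, x_max = min(poly_xs), max(poly_xs)   # step a
    y_min, y_max = min(poly_ys), max(poly_ys)
    for _ in range(max_tries):
        # step b: random translation satisfying the boundary constraints
        tx = random.uniform(-x_min, w - x_max)
        ty = random.uniform(-y_min, h - y_max)
        box = (x_min + tx, x_max + tx, y_min + ty, y_max + ty)
        # step c: reject if the translated axis-aligned box overlaps any placed box
        if all(box[1] < b[0] or box[0] > b[1] or
               box[3] < b[2] or box[2] > b[3] for b in placed):
            placed.append(box)   # step d: record the new position
            return box
    return None
```

The actual pasting of pixels and writing of the updated annotation are omitted; only the geometric placement logic is shown.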
Further, the step 3 includes the steps of:
step 3.1, divide the image into blocks, determine the plateau histogram threshold T, and process the original histogram of each block according to the following formula to obtain the plateau histogram:

P_T(k) = min(P_r(k), T)

where k is the gray value, P_T(k) is the plateau histogram, and P_r(k) is the original histogram; then the cumulative function F_T(k) of the plateau histogram is calculated as:

F_T(k) = Σ_{j=0}^{k} P_T(j)
wherein P_T(j) represents the plateau histogram value at gray value j;
the equalized gray value R_T(k) corresponding to gray value k is then calculated, expressed as:

R_T(k) = round(255 × F_T(k) / F_T(255))

wherein F_T(k) represents the value of the cumulative function of the plateau histogram at gray value k, and F_T(255) represents its value at gray value 255;
The values of R_T(k) are then processed at equal intervals. Since R_T(k) ≥ R_T(k-1), the order O(k) of the R_T(k) values is calculated in sequence for k from 0 to 255, with O(0) = 0 and O(k) expressed as:

O(k) = O(k-1) + 1, if R_T(k) > R_T(k-1);  O(k) = O(k-1), if R_T(k) = R_T(k-1)

wherein R_T(k) and R_T(k-1) represent the equalized gray values of the plateau histogram corresponding to gray values k and k-1, and O(k-1) represents the order of the value R_T(k-1);
The mapping relation f(k) between the original gray value and the gray value after equal-interval plateau histogram equalization is calculated as:

f(k) = round(255 × O(k) / O(255))

converting the original image into a new image according to the mapping relation f(k) of each block yields the local equal-interval plateau histogram equalized image;
And step 3.2, carrying out Laplacian sharpening on the local equal-interval plateau histogram equalized image to obtain the preprocessed infrared image.
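A minimal numpy sketch of step 3.1 for a single block, assuming the standard plateau-histogram construction (the original histogram clipped at T) and the equal-interval re-ranking described above; the function name and rounding convention are illustrative assumptions:

```python
import numpy as np

def plateau_equalize(block, T):
    """Equal-interval plateau histogram equalization of one image block
    (step 3.1 of the method). block: uint8 array; T: plateau threshold."""
    hist = np.bincount(block.ravel(), minlength=256).astype(float)
    p_t = np.minimum(hist, T)                  # clip the histogram at T
    f_t = np.cumsum(p_t)                       # cumulative function F_T(k)
    r_t = np.round(255.0 * f_t / f_t[-1])      # equalized gray value R_T(k)
    # order O(k) of the R_T values: O(0) = 0, increment when R_T increases
    o = np.zeros(256, dtype=float)
    for k in range(1, 256):
        o[k] = o[k - 1] + (1.0 if r_t[k] > r_t[k - 1] else 0.0)
    f_k = np.round(255.0 * o / max(o[-1], 1))  # equal-interval mapping f(k)
    return f_k.astype(np.uint8)[block]         # apply the mapping per pixel
```

Step 3.2 would then apply a Laplacian sharpening kernel to the equalized image.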
Further, in step 4, the first branch of the AngleConv module passes through a Conv module to obtain feature map I_1; the second branch passes through a Conv module that reduces the channel number to one quarter and then through four parallel oblique convolution modules with angles θ_1, θ_2, θ_3, θ_4 to obtain feature maps I_2, I_3, I_4, I_5; the output I_1 of the first branch and the outputs I_2, I_3, I_4, I_5 of the second branch are then taken as inputs to a Concat operation, followed by a Conv module that halves the channel number, to obtain the output feature map of the AngleConv module. The oblique convolution module specifically performs the following operations: first, zero-pad the feature map on all sides so that its width and height equal the diagonal length of the feature map before padding; second, rotate the feature map by angle θ_j' (j' = 1, 2, 3, 4) and apply a channel-by-channel convolution layer; finally, rotate the feature map back by θ_j' and remove the elements at the zero-padded positions to restore the original width and height, obtaining the output feature map of the oblique convolution module.
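The pad-rotate-convolve-rotate-back sequence of the oblique convolution module can be sketched with scipy as follows; the bilinear interpolation order, the border handling and a single shared per-channel kernel are assumptions made for brevity (the patent uses a learnable channel-by-channel convolution layer):

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def oblique_conv(feat, kernel, theta):
    """One oblique-convolution branch of the AngleConv module (sketch):
    zero-pad to the diagonal size, rotate by theta, apply a channel-wise
    convolution, rotate back, and crop to the original size.
    feat: (C, H, W) array; kernel: (kh, kw) per-channel kernel; theta: degrees."""
    c, h, w = feat.shape
    d = int(np.ceil(np.hypot(h, w)))            # diagonal length before padding
    ph, pw = (d - h) // 2, (d - w) // 2
    out = np.zeros_like(feat)
    for i in range(c):                          # channel-by-channel (depthwise)
        x = np.pad(feat[i], ((ph, d - h - ph), (pw, d - w - pw)))
        x = rotate(x, theta, reshape=False, order=1)
        x = convolve(x, kernel, mode='constant')
        x = rotate(x, -theta, reshape=False, order=1)
        out[i] = x[ph:ph + h, pw:pw + w]        # remove the zero padding
    return out
```

With theta = 0 and a delta kernel the branch reduces to the identity, which is a convenient sanity check.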
Further, in step 4, the improved CBAM attention module specifically performs the following operations: in the spatial attention module, the output features of the channel attention module are grouped with 128 channels per group, and the maximum value and the average value of each group are calculated in the channel dimension; the resulting maps are passed through a convolution layer with one output channel and then a Sigmoid function, and multiplied with the output of the channel attention module to obtain the output of the spatial attention module; this output is then combined with the input of the CBAM attention module through the added input-output direct connection to obtain the output feature map of the improved CBAM attention module.
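The grouped spatial attention can be sketched as below; a 1x1 convolution (a plain weight vector over the descriptor maps) stands in for the one-output-channel convolution layer, and the input-output direct connection of the improved module is omitted, so treat this only as an illustration of the grouping step:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grouped_spatial_attention(x, conv_w, group=128):
    """Grouped spatial attention of the improved CBAM module (sketch).
    x: (C, H, W) output of the channel-attention stage; channels are split
    into groups of `group`, each contributing a max map and a mean map.
    conv_w: weights of the 1-output-channel conv, here a 1x1 conv for brevity."""
    c, h, w = x.shape
    maps = []
    for g in range(0, c, group):
        chunk = x[g:g + group]
        maps.append(chunk.max(axis=0))          # per-group max over channels
        maps.append(chunk.mean(axis=0))         # per-group mean over channels
    desc = np.stack(maps)                       # (2 * n_groups, H, W)
    att = sigmoid(np.tensordot(conv_w, desc, axes=([0], [0])))  # (H, W) map
    return x * att                              # broadcast over channels
```
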
Further, in step 4, the adaptive feature fusion module specifically performs the following operations: taking two input feature maps as input, it applies a Concat operation and then a Conv module with 2 output channels, followed by a Softmax function in the channel dimension; the two resulting values at each spatial position are used as the summation weights of the two input features at that position, and the weighted sum gives the output feature map, expressed as:
O=W[0]⊙I'1+W[1]⊙I'2
W=Softmax(Conv(Concat(I'1,I'2)))
where I'_1 and I'_2 represent the two input feature maps, O represents the output feature map after adaptive feature fusion, W is a two-channel weight tensor, W[0] is the weight matrix of the first channel of W, W[1] is the weight matrix of the second channel of W, ⊙ denotes element-wise multiplication, Concat represents the Concat operation in the channel dimension, Conv represents a Conv module, and Softmax represents the Softmax operation in the channel dimension.
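The fusion formulas O = W[0]⊙I'_1 + W[1]⊙I'_2 and W = Softmax(Conv(Concat(I'_1, I'_2))) can be sketched in numpy, with a 1x1 Conv (a weight matrix applied per pixel) standing in for the Conv module:

```python
import numpy as np

def adaptive_fusion(i1, i2, conv_w):
    """Adaptive feature fusion module (sketch): a 2-output-channel Conv
    predicts per-pixel weights, Softmax normalizes them over the channel
    dimension, and the two inputs are summed with those weights.
    i1, i2: (C, H, W) feature maps; conv_w: (2, 2C) 1x1-conv weights."""
    cat = np.concatenate([i1, i2], axis=0)               # Concat on channels
    logits = np.tensordot(conv_w, cat, axes=([1], [0]))  # (2, H, W)
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    wgt = e / e.sum(axis=0, keepdims=True)               # Softmax over channel
    return wgt[0] * i1 + wgt[1] * i2                     # weighted sum O
```

Since the two weights sum to 1 at every spatial position, the module interpolates between the two inputs pixel by pixel.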
Further, in step 4, in training the lightning arrester detection network, the loss L_θ of the angle prediction training is expressed as:

L_θ = -Σ_{i=0}^{179} [ t_i log(Sigmoid(p_i)) + (1 - t_i) log(1 - Sigmoid(p_i)) ]
where t_i is the label value of the i-th angle class, p_i is the predicted value of the i-th angle class, and Sigmoid represents the Sigmoid function; the specific calculation formula of t_i is:

t_i = exp( -(u + 180 × n_0(i) - i)² / (2σ²) )
n_0(i) = argmin_n |u + 180 × n - i|, where n ranges over the integers
where u is the angle class corresponding to the real rotating target frame, with values in 0, 1, 2, ..., 179; σ represents the standard deviation of the normal distribution of the angle labels, taken as an appropriate value; and n_0(i) represents the value of n that minimizes |u + 180 × n - i|;
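Assuming the Gaussian label form implied by n_0(i) and σ (each class weighted by its circular distance to the true angle class, wrapping at 180°), the label vector can be sketched as:

```python
import numpy as np

def angle_labels(u, sigma):
    """Smoothed labels t_i for the 180 angle classes (sketch).
    Minimizing |u + 180*n - i| over integer n gives the circular distance
    min(|u - i|, 180 - |u - i|), so each class i receives a Gaussian weight
    of that distance; the true class u gets t_u = 1."""
    i = np.arange(180)
    d = np.abs(u - i)
    d = np.minimum(d, 180 - d)           # circular distance with period 180
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))
```
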
The confidence label y' of a positive sample is the IoU of the predicted rotating target frame and the real rotating target frame, y' = Inter(p_rbox, t_rbox) / Union(p_rbox, t_rbox), and the calculation formula of the confidence prediction training loss L_obj is:
Lobj=-(y′log(Sigmoid(p))+(1-y′)log(1-Sigmoid(p)))
where p_rbox is the predicted rotating target frame, t_rbox is the real rotating target frame, Inter(p_rbox, t_rbox) represents the intersection area of the two rotating target frames, Union(p_rbox, t_rbox) represents the union area of the two rotating target frames, p is the confidence prediction value, y' is the confidence label value, and Sigmoid represents the Sigmoid function;
The improved CIoU is used for calculating the position of the rotating target frame and the predicted loss of the long and short edges, and the improved CIoU is specifically calculated by the following formula:
where x, y, l, s are the predicted target center abscissa, center ordinate, long-side length and short-side length; b represents the horizontal rectangular frame with x, y, l, s as its center abscissa, center ordinate, width and height; b_gt represents the horizontal rectangular frame built in the same way from the corresponding ground-truth values; ρ(b, b_gt) represents the Euclidean distance between the center points of b and b_gt; IoU represents the intersection-over-union of b and b_gt; c represents the diagonal length of the smallest horizontal rectangle enclosing both b and b_gt; l_gt and s_gt represent the true values of the target long-side and short-side lengths; and v and α are intermediate variables.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention provides a copy-paste data enhancement method, which can effectively enlarge the sample size and alleviate the problem of insufficient sample data.
2. The invention provides an image enhancement method combining local equal-interval plateau histogram equalization and Laplacian sharpening, which improves image contrast and edge definition and addresses the problem of poor infrared image quality.
3. According to the invention, a lightning arrester detection network is constructed with an AngleConv module based on oblique convolution modules, an improved CBAM attention module that adopts grouped spatial attention and adds an input-output direct connection, an adaptive feature fusion module and a rotating target frame detection head, which reduces background interference in the temperature monitoring of transmission line lightning arresters and improves the detection precision and generalization capability of the lightning arrester detection model.
4. The invention uses the IoU of the predicted rotating target frame and the real rotating target frame as the confidence label of positive samples, which suppresses the scores of predicted frames that overlap the real target frame only slightly, thereby reducing false detections. An improved CIoU is designed to calculate the prediction loss of the rotating target frame position and its long and short sides; compared with the original CIoU loss, it better fits the large long-to-short-side ratio of transmission line lightning arresters and improves the detection precision and generalization capability of the lightning arrester detection model.
Drawings
Fig. 1 is a flowchart of a method for detecting a lightning arrester of a power transmission line based on an infrared image and a rotating target frame in an embodiment of the invention.
Fig. 2 is a schematic view of an image before and after image enhancement preprocessing, wherein (a) is an image before image enhancement preprocessing, and (b) is an image after image enhancement preprocessing.
Fig. 3 is an overall structure diagram of the lightning arrester detection network in an embodiment of the present invention, in which AngleConv represents the AngleConv module, Attention represents the improved CBAM attention module, AdaptFusion represents the adaptive feature fusion module, Head represents the detection head, Up represents the up-sampling module, Conv, C3, SPPF and Concat respectively represent the corresponding modules in YOLOv5, and Conv2d represents a convolution layer.
Fig. 4 is a block diagram of the AngleConv module in an embodiment of the present invention, where Input feature refers to the input feature map of the module, Output feature refers to the output feature map, Conv refers to a module comprising a convolution layer, a BN layer and a SiLU layer, and Conv2d refers to a convolution layer.
Fig. 5 is a block diagram of an improved CBAM attention module in an embodiment of the present invention, in which Channel Attention represents the channel attention of the CBAM attention module, spatial Attention represents improved spatial attention, and Sigmoid represents a Sigmoid function.
Fig. 6 is a block diagram of the adaptive feature fusion module in an embodiment of the present invention, where I'_1 and I'_2 represent the two input feature maps of the module, O represents the output feature map, and Softmax represents the Softmax operation in the channel dimension.
Fig. 7 is a graph comparing the f function in the improved CIoU with the arctan function in the original CIoU in an embodiment of the invention.
Fig. 8 is a diagram illustrating an example of a detection result in an embodiment of the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples.
The embodiment discloses a transmission line lightning arrester detection method based on an infrared image and a rotating target frame, as shown in fig. 1, the method comprises the following steps:
step 1, an infrared camera is used for collecting infrared images of a lightning arrester of a power transmission line, an initial data set is established, and the method comprises the following steps:
Step 1.1, shooting and collecting lightning arresters of a power transmission line in various scenes by using an infrared camera, wherein the shooting modes comprise short-distance shooting, long-distance shooting and multi-angle shooting;
Step 1.2, cleaning the acquired image data, removing images without lightning arrester targets and low-quality blurred images;
And step 1.3, annotating the retained images with the roLabelImg software, the annotation including the category of the target and the position of the rotating frame.
Step 2, increasing the data volume of the initial data set by using the copy-paste data enhancement mode to establish an extended data set, wherein the copy-paste data enhancement mode is specifically as follows:
reading all polygonal paste blocks in the annotation file of the target source image, initializing an empty list to record the positions of paste blocks already pasted into the new image, recording w = image width and h = image height, and performing the following operations on each polygonal paste block in turn:
a. Record the minimum transverse coordinate of the current polygonal paste block as x_min, the maximum transverse coordinate as x_max, the minimum longitudinal coordinate as y_min, and the maximum longitudinal coordinate as y_max;
b. Randomly generate a transverse translation tx and a longitudinal translation ty for the current polygonal paste block such that tx and ty satisfy the boundary constraints -x_min < tx < w - x_max and -y_min < ty < h - y_max;
c. Judge whether the minimum enclosing horizontal rectangle of the current polygonal paste block, translated by (tx, ty), overlaps the minimum enclosing horizontal rectangle of any paste block already pasted into the new image; if it overlaps, regenerate the translation (tx, ty) and repeat the check until no overlap occurs;
d. Paste the current polygonal paste block onto the background image according to the translation (tx, ty), and record in the list its position (x_min+tx, x_max+tx, y_min+ty, y_max+ty) in the new image, where the four values are respectively the minimum transverse, maximum transverse, minimum longitudinal and maximum longitudinal coordinates of the paste block in the new image.
Step 3, performing image enhancement pretreatment on the image in the extended data set to improve the contrast and edge definition of the image, as shown in fig. 2, and labeling a rotating target frame of a lightning arrester, wherein the method specifically comprises the following steps:
step 3.1, divide the image into blocks, determine the plateau histogram threshold T, and process the original histogram of each block according to the following formula to obtain the plateau histogram:

P_T(k) = min(P_r(k), T)

where k is the gray value, P_T(k) is the plateau histogram, and P_r(k) is the original histogram; then the cumulative function F_T(k) of the plateau histogram is calculated as:

F_T(k) = Σ_{j=0}^{k} P_T(j)
wherein P_T(j) represents the plateau histogram value at gray value j;
the equalized gray value R_T(k) corresponding to gray value k is then calculated, expressed as:

R_T(k) = round(255 × F_T(k) / F_T(255))

wherein F_T(k) represents the value of the cumulative function of the plateau histogram at gray value k, and F_T(255) represents its value at gray value 255;
The values of R_T(k) are then processed at equal intervals. Since R_T(k) ≥ R_T(k-1), the order O(k) of the R_T(k) values is calculated in sequence for k from 0 to 255, with O(0) = 0 and O(k) expressed as:

O(k) = O(k-1) + 1, if R_T(k) > R_T(k-1);  O(k) = O(k-1), if R_T(k) = R_T(k-1)

wherein R_T(k) and R_T(k-1) represent the equalized gray values of the plateau histogram corresponding to gray values k and k-1, and O(k-1) represents the order of the value R_T(k-1);
The mapping relation f(k) between the original gray value and the gray value after equal-interval plateau histogram equalization is calculated as:

f(k) = round(255 × O(k) / O(255))

converting the original image into a new image according to the mapping relation f(k) of each block yields the local equal-interval plateau histogram equalized image;
And step 3.2, carrying out Laplacian sharpening on the local equal-interval plateau histogram equalized image to obtain the preprocessed infrared image.
Step 4, training the constructed lightning arrester detection network by using the marked data set to obtain an optimal lightning arrester detection model; the lightning arrester detection network is an improved YOLOv5 network, as shown in fig. 3, in which the feature extraction network, the feature fusion network and the detection head of the YOLOv5 network are each improved; the improvement of the feature extraction network is that an AngleConv module and an improved CBAM attention module, which adopts grouped spatial attention and adds an input-output direct connection, are added to the feature extraction network; the improvement of the feature fusion network is that 1/8-to-1/4 scale feature fusion is added to the top-down feature fusion part, 1/4-to-1/8 scale feature fusion is added to the bottom-up feature fusion part, the 1/16-to-1/32 scale feature fusion is deleted, the Concat operation of the bottom-up feature fusion part is replaced by an adaptive feature fusion module, and Conv modules connecting the top-down and bottom-up feature fusion paths are added to adjust the channel number of the input features of the adaptive feature fusion module; the improvement of the detection head is that its three detection scales are changed from 1/8, 1/16, 1/32 to 1/4, 1/8, 1/16, the horizontal target frame detection head is improved to a rotating target frame detection head, its width and height predictions are changed to long-side and short-side predictions of the rotating target frame, and 180 angle-class predictions are added, corresponding to the 180° angle range of the rotating target frame at 1° intervals;
Specifically, the feature extraction network of the lightning arrester detection network first connects in series two Conv modules with a step length of 2, an AngleConv module and a C3 module to obtain a first feature map; next connects in series a Conv module with a step length of 2 and a C3 module to obtain a second feature map; then connects in series a Conv module with a step length of 2 and a C3 module to obtain a third feature map; and finally connects in series a Conv module with a step length of 2, a C3 module, an SPPF module and the improved CBAM attention module to obtain a fourth feature map.
As shown in fig. 4, the first branch of the AngleConv module connects a Conv module in series to obtain a feature map I1; the second branch connects in series a Conv module that reduces the number of channels to one quarter, and then connects in parallel four oblique convolution modules with angles θ1, θ2, θ3 and θ4 respectively to obtain feature maps I2, I3, I4 and I5; then, taking the output I1 of the first branch and the outputs I2, I3, I4 and I5 of the second branch as inputs, a Concat operation is performed and a Conv module that halves the number of channels is connected in series to obtain the output feature map of the AngleConv module. The oblique convolution module specifically performs the following operations: firstly, the periphery of the feature map is zero-padded so that the width and the height of the feature map equal the diagonal length of the feature map before zero padding; secondly, the feature map is rotated by the angle θj′, j′=1,2,3,4, and a channel-by-channel convolution layer is connected in series; then the output feature map is rotated back by the angle θj′, and the elements at the four-side zero-padding positions are removed to restore the feature map to the original width and height, obtaining the output feature map of the oblique convolution module.
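The pad-rotate-convolve-unrotate-crop pipeline of the oblique convolution module can be illustrated for a single channel as follows. This is a hedged NumPy sketch under assumptions: nearest-neighbour rotation and a hand-written 3×3 'same' convolution stand in for the differentiable rotation and the learned channel-by-channel convolution the network itself would use.

```python
import numpy as np

def rotate_nn(img: np.ndarray, theta: float) -> np.ndarray:
    """Nearest-neighbour rotation of a 2-D map about its centre (a simple
    stand-in for the rotation used by the oblique convolution module)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    c, s = np.cos(theta), np.sin(theta)
    sx = c * (xs - cx) + s * (ys - cy) + cx      # inverse-map each output pixel
    sy = -s * (xs - cx) + c * (ys - cy) + cy
    sx = np.clip(np.rint(sx).astype(int), 0, w - 1)
    sy = np.clip(np.rint(sy).astype(int), 0, h - 1)
    return img[sy, sx]

def oblique_conv(feat: np.ndarray, theta: float, kernel: np.ndarray) -> np.ndarray:
    """Single-channel oblique convolution: zero-pad to the diagonal size,
    rotate, apply a 'same' convolution, rotate back, crop off the padding."""
    h, w = feat.shape
    d = int(np.ceil(np.hypot(h, w)))             # diagonal length before padding
    py, px = (d - h) // 2, (d - w) // 2
    padded = np.zeros((d, d), dtype=float)
    padded[py:py + h, px:px + w] = feat
    rot = rotate_nn(padded, theta)
    kh, kw = kernel.shape                        # hand-written 'same' convolution
    pr = np.pad(rot, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    conv = np.zeros_like(rot)
    for i in range(kh):
        for j in range(kw):
            conv += kernel[i, j] * pr[i:i + d, j:j + d]
    back = rotate_nn(conv, -theta)               # rotate back by the same angle
    return back[py:py + h, px:px + w]            # remove the zero-padding

feat = np.arange(12, dtype=float).reshape(3, 4)
delta = np.zeros((3, 3)); delta[1, 1] = 1.0      # identity kernel for the demo
out = oblique_conv(feat, 0.0, delta)
```

With a zero angle and an identity kernel the pipeline returns the input unchanged, which makes the padding and cropping bookkeeping easy to verify.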
As shown in FIG. 5, the improved CBAM attention module specifically performs the following operations: in the spatial attention module, the output features of the channel attention module are grouped by 128 channels, the maximum value and the average value of each group are calculated in the channel dimension, a convolution layer with 1 output channel is then connected in series, and its output is passed through a Sigmoid function and multiplied by the output of the channel attention module to obtain the output of the spatial attention module; the input of the CBAM attention module is then added through the input-output direct connection to obtain the output feature map of the improved CBAM attention module.
The feature fusion network of the lightning arrester detection network takes the fourth feature map as input and sequentially connects in series a Conv module, an upsampling module with a ratio of 2, a Concat operation, a C3 module and a Conv module to obtain a fifth feature map, where the other input of the Concat operation is the third feature map; it then sequentially connects in series an upsampling module with a ratio of 2, a Concat operation, a C3 module and a Conv module to obtain a sixth feature map, where the other input of the Concat operation is the second feature map; it then sequentially connects in series an upsampling module with a ratio of 2, a Concat operation and a C3 module to obtain a seventh feature map, where the other input of the Concat operation is the first feature map; it then sequentially connects in series a Conv module with a step length of 2, an adaptive feature fusion module and a C3 module to obtain an eighth feature map, where the other input of the adaptive feature fusion module is the output feature map of the sixth feature map after passing through a Conv module; and it finally sequentially connects in series a Conv module with a step length of 2, an adaptive feature fusion module and a C3 module to obtain a ninth feature map, where the other input of the adaptive feature fusion module is the output feature map of the fifth feature map after passing through a Conv module.
As shown in fig. 6, the adaptive feature fusion module specifically operates as follows: the two input feature maps are taken as input; after a Concat operation, a Conv module with 2 output channels is connected in series; a Softmax function is then applied in the channel dimension; and the two values at each spatial position are taken as the addition weights of the two input features at that spatial position, with the weighted sum giving the output feature map, expressed as:
O = W[0]⊙I′1 + W[1]⊙I′2
W = Softmax(Conv(Concat(I′1, I′2)))
wherein I′1 and I′2 represent the two input feature maps, O represents the output feature map after adaptive feature fusion, W is the weight tensor of two channels, W[0] is the weight matrix of the first channel of W, W[1] is the weight matrix of the second channel of W, ⊙ represents the element-wise product operation, Concat represents the Concat operation in the channel dimension, Conv represents the Conv module, and Softmax represents the Softmax operation in the channel dimension.
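The weighted sum above can be illustrated with a small NumPy sketch. The 2-channel logits are supplied directly here as an assumed placeholder; in the network they are produced by Conv(Concat(I′1, I′2)).

```python
import numpy as np

def adaptive_fuse(i1: np.ndarray, i2: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Weighted sum of two (C, H, W) feature maps with per-position softmax
    weights; `logits` stands in for the output of Conv(Concat(i1, i2))."""
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)     # Softmax over the 2 weight channels
    return w[0] * i1 + w[1] * i2             # O = W[0] ⊙ I'1 + W[1] ⊙ I'2

rng = np.random.default_rng(0)
i1 = rng.normal(size=(4, 3, 3))
i2 = rng.normal(size=(4, 3, 3))
logits = rng.normal(size=(2, 3, 3))          # assumed 2-channel Conv output
o = adaptive_fuse(i1, i2, logits)
```

Because the two weights at each spatial position sum to 1, equal logits reduce the module to a plain average of the two inputs.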
The detection head of the lightning arrester detection network adopts an improved YOLOv5 detection head; it takes the seventh, eighth and ninth feature maps as input and predicts the confidence coefficient of the target, the category of the target, the position, long side and short side of the rotating target frame, and the angle of the rotating target frame.
When training the lightning arrester detection network, the loss L_θ of the angle prediction training is expressed as:
L_θ = −Σi=0..179 [t_i × log(Sigmoid(p_i)) + (1 − t_i) × log(1 − Sigmoid(p_i))]
where t_i is the label value of the i-th angle class, p_i is the predicted value of the i-th angle class, and Sigmoid represents the Sigmoid function; the specific calculation formula of t_i is:
t_i = exp(−(u + 180×n_0(i) − i)² / (2σ²))
n_0(i) = argmin_n(|u + 180×n − i|), where n ranges over the integers
where u is the angle class corresponding to the real rotating target frame, with a value range of 0, 1, 2, ..., 179; σ represents the standard deviation of the normal distribution of the angle labels and takes an appropriate value; and n_0(i) represents the value of n that minimizes |u + 180×n − i|;
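A minimal sketch of the angle-class label computation follows. The Gaussian form of t_i is an assumption (consistent with σ being described as the standard deviation of the normal distribution of the angle labels), the default sigma value is likewise assumed, and the wrap-around follows the definition of n_0(i).

```python
import numpy as np

def angle_labels(u: int, sigma: float = 6.0) -> np.ndarray:
    """Gaussian angle-class labels t_i over the 180 one-degree classes.
    n_0(i) minimizes |u + 180*n - i| over the integers, so the label decays
    with the circular distance between class i and the true class u.
    The Gaussian form and sigma=6.0 are assumptions; the patent only states
    that sigma is the standard deviation of the label distribution."""
    i = np.arange(180)
    d = np.abs(u - i)
    circ = np.minimum(d, 180 - d)            # |u + 180*n_0(i) - i|
    return np.exp(-circ.astype(float) ** 2 / (2.0 * sigma ** 2))

t = angle_labels(u=5)
```

The wrap-around term makes classes 0 and 179 near neighbours, reflecting that a rotating frame's angle is periodic with period 180 degrees.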
The calculation formula of the confidence prediction training loss L_obj is:
y′ = Inter(p_rbox, t_rbox) / Union(p_rbox, t_rbox)
L_obj = −(y′ log(Sigmoid(p)) + (1 − y′) log(1 − Sigmoid(p)))
where p_rbox is the predicted rotating target frame, t_rbox is the real rotating target frame, Inter(p_rbox, t_rbox) represents the intersection area of the two rotating target frames, Union(p_rbox, t_rbox) represents the union area of the two rotating target frames, p is the confidence prediction value, y′ is the confidence label value, and Sigmoid represents the Sigmoid function;
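The confidence loss can be sketched as follows. Computing the rotating-box IoU itself requires polygon clipping and is omitted; the soft label y′ is therefore passed in directly, which is an assumption of this sketch.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def conf_loss(p: float, y_soft: float) -> float:
    """Binary cross-entropy between the confidence logit p and the soft
    label y' (the rotating-box IoU between the predicted and real frames).
    The rotating-box IoU computation is omitted; y_soft is given directly."""
    s = sigmoid(p)
    return -(y_soft * math.log(s) + (1.0 - y_soft) * math.log(1.0 - s))

loss = conf_loss(p=2.0, y_soft=0.8)
```

Using the IoU as a soft label rather than a hard 0/1 target makes the predicted confidence reflect localization quality, not just object presence.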
The improved CIoU is used for calculating the prediction loss of the position and the long and short sides of the rotating target frame, and is specifically calculated by the following formulas:
L_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + α × v
v = (4/π²) × (f(l_gt/s_gt) − f(l/s))²
α = v / ((1 − IoU) + v)
wherein x, y, l, s are the predicted target center abscissa, target center ordinate, target long-side length and target short-side length respectively; b represents the horizontal rectangular frame with x, y, l, s as its center abscissa, center ordinate, width and height; b_gt represents the horizontal rectangular frame with the true values of x, y, l, s as its center abscissa, center ordinate, width and height; ρ(b, b_gt) represents the Euclidean distance between the center points of b and b_gt; IoU represents the intersection-over-union of b and b_gt; c represents the diagonal length of the minimum circumscribed horizontal rectangular frame that can simultaneously contain b and b_gt; l_gt and s_gt represent the true values of the target long-side length and short-side length respectively; v and α are intermediate variables; and f is the improved substitute for the arctan function shown in fig. 7.
As shown in fig. 7, when the independent variable is large (greater than or equal to 3.2), the f function produces, for a fixed difference in the independent variable, a larger difference in the dependent variable than the arctan function, thereby improving the sensitivity of the training network to differences in the target long-to-short-side ratio; this suits power transmission line lightning arrester targets, whose long-to-short-side ratio is large.
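For reference, the standard CIoU computation reads as follows. Note this sketch uses arctan in the aspect-ratio term, whereas the invention replaces it with the f function of fig. 7, whose exact form is not reproduced in this text.

```python
import math

def ciou_loss(box_p, box_g):
    """Standard CIoU loss between two horizontal boxes (cx, cy, w, h).
    The patent's improved CIoU replaces arctan in the aspect-ratio term
    with an f function that is more sensitive for ratios >= 3.2; since
    f is not defined in this excerpt, plain arctan is used here."""
    (x, y, l, s), (xg, yg, lg, sg) = box_p, box_g

    def corners(cx, cy, w, h):
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    x1, y1, x2, y2 = corners(x, y, l, s)
    g1, h1, g2, h2 = corners(xg, yg, lg, sg)
    iw = max(0.0, min(x2, g2) - max(x1, g1))
    ih = max(0.0, min(y2, h2) - max(y1, h1))
    inter = iw * ih
    union = l * s + lg * sg - inter
    iou = inter / union if union > 0 else 0.0
    # squared center distance over squared enclosing-box diagonal
    cw, ch = max(x2, g2) - min(x1, g1), max(y2, h2) - min(y1, h1)
    rho2, c2 = (x - xg) ** 2 + (y - yg) ** 2, cw ** 2 + ch ** 2
    v = (4 / math.pi ** 2) * (math.atan(lg / sg) - math.atan(l / s)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v
```

For identical boxes the loss is zero (IoU is 1 and both the distance and aspect-ratio terms vanish), while disjoint boxes incur a loss above 1.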
The step of training the lightning arrester to detect the network specifically comprises:
a. Organize the data set after data augmentation and image preprocessing, and divide it into a training set and a test set at a ratio of 5:1.
B. Determine the hyperparameters of the lightning arrester detection network, train the network on the training set with the determined hyperparameters, and test the performance of the trained model on the test set. In this embodiment, the training batch size is 8, the input image resolution is 384×288, the initial learning rate is 0.01, and the number of training epochs is 300; meanwhile, online data enhancement methods such as gray-scale transformation, scale transformation and Mosaic data enhancement are adopted to improve the robustness of the trained model.
Step 5, after the image enhancement preprocessing of step 3, inputting the infrared image of the transmission line to be detected into the optimal lightning arrester detection model to obtain the confidence coefficient of the lightning arrester and the rotating target frame information, which specifically comprises the following steps:
step 5.1, sequentially performing the local equidistant platform histogram equalization and Laplacian sharpening of step 3 on the infrared image to be detected, and inputting the result into the optimal lightning arrester detection model to obtain the output detection result;
And 5.2, inputting the output detection result of the step 5.1 into a post-processing mechanism, and removing redundant detection results to obtain a final lightning arrester detection result, wherein the final lightning arrester detection result comprises a rotating target frame and a confidence coefficient, as shown in fig. 8.
The post-processing mechanism is NMS (non-maximum suppression) post-processing based on a rotation target box IoU, and comprises the following steps:
a. selecting the detection frame with the highest score in the initial detection frame set, moving it into the final detection frame set, and removing it from the initial detection frame set;
b. Calculating the rotating target frame IoU value (intersection-over-union) between the detection frame selected in step a and each detection frame in the initial detection frame set; if the IoU is greater than the set threshold, removing that detection frame from the initial detection frame set;
c. repeating steps a-b until no detection frame remains in the initial detection frame set.
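The post-processing steps a-c can be sketched as follows. The greedy loop is faithful to the steps above, but the demo uses an axis-aligned IoU as an assumed stand-in for the rotating target frame IoU, which would require polygon intersection.

```python
def nms(boxes, scores, iou_fn, thr=0.3):
    """Greedy NMS as in steps a-c: repeatedly keep the highest-scoring
    box and drop remaining boxes whose IoU with it exceeds thr.
    iou_fn should compute the rotating-box IoU; an axis-aligned IoU is
    used in the demo below as a stand-in."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)                     # step a: highest score
        keep.append(best)
        order = [i for i in order               # step b: suppress overlaps
                 if iou_fn(boxes[best], boxes[i]) <= thr]
    return keep                                 # step c: until the set is empty

def aabb_iou(a, b):                             # boxes as (x1, y1, x2, y2)
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / area if area > 0 else 0.0

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, aabb_iou)
```

Here the second box overlaps the first heavily and is suppressed, while the distant third box survives.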
In summary, the invention can detect the confidence coefficient and the rotating target frame of the lightning arrester in the infrared image, improves the detection precision and generalization capability of the detection model of the lightning arrester of the power transmission line, provides a basis for realizing automatic temperature monitoring of the lightning arrester of the power transmission line, and is worthy of popularization.
The above embodiments are only preferred embodiments of the present invention and are not intended to limit the scope of the present invention; any variation made according to the shapes and principles of the present invention shall therefore be covered by the protection scope of the present invention.
Claims (6)
1. The transmission line lightning arrester detection method based on the infrared image and the rotary target frame is characterized by comprising the following steps of:
step 1, an infrared camera is used for collecting infrared images of a lightning arrester of a power transmission line, and an initial data set is established;
Step 2, increasing the data volume of the initial data set by using a copy-paste data enhancement mode, and establishing an extended data set;
Step 3, performing image enhancement pretreatment on the images in the extended data set, improving the contrast and edge definition of the images, and marking a rotating target frame of the lightning arrester;
step 4, training the constructed lightning arrester detection network by using the marked data set to obtain an optimal lightning arrester detection model; the lightning arrester detection network is an improved YOLOv5 network, in which the feature extraction network, the feature fusion network and the detection head of the YOLOv5 network are respectively improved; the improvement of the feature extraction network is that an AngleConv module and an improved CBAM attention module, which adopts grouped spatial attention and adds an input-output direct connection, are added to the feature extraction network; the improvement of the feature fusion network is that 1/8-to-1/4 scale feature fusion is added to the top-down feature fusion part, 1/4-to-1/8 scale feature fusion is added to the bottom-up feature fusion part, 1/16-to-1/32 scale feature fusion is deleted, the Concat operation of the bottom-up feature fusion part is replaced by an adaptive feature fusion module, and Conv modules connecting the top-down and bottom-up feature fusion paths are added to adjust the channel number of the input features of the adaptive feature fusion module; the improvement of the detection head is that the three detection scales of the detection head are changed from 1/8, 1/16 and 1/32 to 1/4, 1/8 and 1/16, the detection head is improved from predicting a horizontal target frame to predicting a rotating target frame, the width and height predictions are changed to predictions of the long side and the short side of the rotating target frame, and 180 angle-class predictions are added, corresponding to the 180 one-degree angle intervals of the rotating target frame;
the first branch of AngleConv modules is connected with a Conv module in series to obtain a characteristic diagram The second branch is connected in series with a Conv module which converts the number of channels into one fourth, and then connected in parallel with four angles respectivelyIs used for obtaining a characteristic diagram by an oblique convolution moduleThen output by the first branchAnd the output of the second branchThe oblique convolution module specifically performs the following operations of firstly, zero padding the periphery of the feature map to ensure that the width and the height of the feature map are equal to the diagonal length of the feature map before zero padding, and secondly, rotating the feature map by an angle,=1, 2,3,4, And then concatenating a channel-by-channel convolution layer, then rotating the output profile by an angleThen removing elements at the zero padding positions around the oblique convolution module to restore the characteristic diagram to the original width and height so as to obtain an output characteristic diagram of the oblique convolution module;
And 5, inputting the infrared image of the transmission line to be detected into an optimal lightning arrester detection model after the image enhancement pretreatment in the step 3, and obtaining the confidence coefficient of the lightning arrester and the rotating target frame information.
2. The method for detecting a lightning arrester of a power transmission line based on an infrared image and a rotating target frame according to claim 1, wherein in step 2, the copy-paste data enhancement mode is specifically as follows:
using labelme software to mark at least one polygon paste block on the target source image, reading all polygon paste blocks in the annotation file of the target source image, initializing the list for recording the positions of the paste blocks already pasted in the new image to an empty list, and recording the width and the height of the image; the following is done for each polygon paste block in turn:
a. Record the minimum transverse coordinate, the maximum transverse coordinate, the minimum longitudinal coordinate and the maximum longitudinal coordinate of the current polygon paste block;
B. Randomly generate the transverse translation amount and the longitudinal translation amount of the current polygon paste block so that they meet the boundary constraints, namely that the translated paste block lies entirely within the image;
C. Judge whether the minimum circumscribed horizontal rectangular frame of the current polygon paste block, after translation by the generated translation amounts, overlaps the minimum circumscribed horizontal rectangular frames of the polygon paste blocks already pasted in the new image; if so, regenerate the translation amounts and judge again; if the number of attempts exceeds the preset limit, skip the current polygon paste block;
d. Paste the current polygon paste block onto the background image according to the translation amounts, and record in the list the position of the current polygon paste block in the new image, namely its minimum transverse coordinate, maximum transverse coordinate, minimum longitudinal coordinate and maximum longitudinal coordinate in the new image.
3. The method for detecting the lightning arrester of the transmission line based on the infrared image and the rotating target frame according to claim 1, wherein the step 3 comprises the following steps:
step 3.1, dividing the image into blocks, determining the platform histogram threshold T, and processing the original histogram of each block according to the following formula to obtain the platform histogram:
P(k) = min(H(k), T)
where k is the gray value, P(k) is the platform histogram and H(k) is the original histogram; then calculating the cumulative function F(k) of the platform histogram, expressed as:
F(k) = P(0) + P(1) + ... + P(k)
where P(j) represents the platform histogram value of the gray value j;
calculating the gray value D(k) corresponding to the gray value k after platform histogram equalization, expressed as:
D(k) = round(255 × F(k) / F(255))
where F(k) represents the value of the platform histogram cumulative function corresponding to the gray value k, and F(255) represents the value of the platform histogram cumulative function corresponding to the gray value 255;
processing the D(k) values at equal intervals: since D(k) is non-decreasing in k, sequentially calculating, for k from 0 to 255, the rank r(k) of the value D(k) and recording it, where r(k) represents the rank of the gray value D(k) obtained after platform histogram equalization of the gray value k;
calculating the mapping relation f(k) between the equal-interval platform-histogram-equalized gray value and the original gray value, with the specific calculation formula expressed as:
f(k) = round(255 × (r(k) − 1) / (r(255) − 1))
converting the original image into a new image according to the mapping relation f(k) of each block, namely obtaining the local equidistant platform histogram equalized image;
And 3.2, carrying out Laplacian sharpening on the local equidistant platform histogram equalized image to obtain the preprocessed infrared image.
4. The method for detecting the lightning arrester of the transmission line based on the infrared image and the rotating target frame according to claim 1, wherein in step 4, the improved CBAM attention module specifically performs the following operations: in the spatial attention module, the output features of the channel attention module are grouped by 128 channels, the maximum value and the average value of each group are calculated in the channel dimension, a convolution layer with 1 output channel is then connected in series, and its output is passed through a Sigmoid function and multiplied by the output of the channel attention module to obtain the output of the spatial attention module; the input of the CBAM attention module is then added through the input-output direct connection to obtain the output feature map of the improved CBAM attention module.
5. The method for detecting the lightning arrester of the power transmission line based on the infrared image and the rotating target frame according to claim 1, wherein in step 4, the adaptive feature fusion module specifically performs the following operations: taking two input feature maps as input, performing a Concat operation, connecting in series a Conv module with 2 output channels, then applying a Softmax function in the channel dimension, taking the two values at each spatial position as the addition weights of the two input features at that spatial position, and performing a weighted sum to obtain the output feature map, expressed as:
O = W[0]⊙I′1 + W[1]⊙I′2
W = Softmax(Conv(Concat(I′1, I′2)))
where I′1 and I′2 represent the two input feature maps, O represents the output feature map after adaptive feature fusion, W is the weight tensor of two channels, W[0] is the weight matrix of the first channel of W, W[1] is the weight matrix of the second channel of W, ⊙ represents the element-wise product operation, Concat represents the Concat operation in the channel dimension, Conv represents the Conv module, and Softmax represents the Softmax operation in the channel dimension.
6. The method for detecting an arrester of an electric power transmission line based on an infrared image and a rotating target frame according to claim 1, wherein in step 4, when training the lightning arrester detection network, the loss L_θ of the angle prediction training is expressed as:
L_θ = −Σi=0..179 [t_i × log(Sigmoid(p_i)) + (1 − t_i) × log(1 − Sigmoid(p_i))]
where t_i is the label value of the i-th angle class, p_i is the predicted value of the i-th angle class, and Sigmoid represents the Sigmoid function; the specific calculation formula of t_i is:
t_i = exp(−(u + 180×n_0(i) − i)² / (2σ²))
n_0(i) = argmin_n(|u + 180×n − i|), where n ranges over the integers
where u is the angle class corresponding to the real rotating target frame, with a value range of 0, 1, 2, ..., 179; σ represents the standard deviation of the normal distribution of the angle labels and takes an appropriate value; and n_0(i) represents the value of n that minimizes |u + 180×n − i|;
the calculation formula of the confidence prediction training loss L_obj is:
y′ = Inter(p_rbox, t_rbox) / Union(p_rbox, t_rbox)
L_obj = −(y′ log(Sigmoid(p)) + (1 − y′) log(1 − Sigmoid(p)))
where p_rbox is the predicted rotating target frame, t_rbox is the real rotating target frame, Inter(p_rbox, t_rbox) represents the intersection area of the two rotating target frames, Union(p_rbox, t_rbox) represents the union area of the two rotating target frames, p is the confidence prediction value, y′ is the confidence label value, and Sigmoid represents the Sigmoid function;
the improved CIoU is used for calculating the prediction loss of the position and the long and short sides of the rotating target frame, and is specifically calculated by the following formulas:
L_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + α × v
v = (4/π²) × (f(l_gt/s_gt) − f(l/s))²
α = v / ((1 − IoU) + v)
where x, y, l, s are the predicted target center abscissa, target center ordinate, target long-side length and target short-side length respectively; b represents the horizontal rectangular frame with x, y, l, s as its center abscissa, center ordinate, width and height; b_gt represents the horizontal rectangular frame with the true values of x, y, l, s as its center abscissa, center ordinate, width and height; ρ(b, b_gt) represents the Euclidean distance between the center points of b and b_gt; IoU represents the intersection-over-union of b and b_gt; c represents the diagonal length of the minimum circumscribed horizontal rectangular frame that can simultaneously contain b and b_gt; l_gt and s_gt represent the true values of the target long-side length and short-side length respectively; v and α are intermediate variables, and f is the improved substitute for the arctan function.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410752339.0A CN118608964B (en) | 2024-06-12 | 2024-06-12 | Transmission line arrester detection method based on infrared image and rotating target frame |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118608964A CN118608964A (en) | 2024-09-06 |
| CN118608964B (en) | 2025-09-23 |
Family
ID=92556989
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410752339.0A Active CN118608964B (en) | 2024-06-12 | 2024-06-12 | Transmission line arrester detection method based on infrared image and rotating target frame |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118608964B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119941719B (en) * | 2025-04-07 | 2025-07-15 | 江苏尚诚能源科技有限公司 | Power transmission line monitoring method and device based on machine vision |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113326734A (en) * | 2021-04-28 | 2021-08-31 | 南京大学 | Rotary target detection method based on YOLOv5 |
| CN116630926A (en) * | 2023-06-02 | 2023-08-22 | 清华大学 | Quick detection method for curved lane line based on oblique convolution |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112884064B (en) * | 2021-03-12 | 2022-07-29 | 迪比(重庆)智能科技研究院有限公司 | A method of target detection and recognition based on neural network |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||