Ship detection and identification method in day and night images
Technical Field
The invention relates to the field of computer vision and digital image processing, in particular to a ship detection and identification method in day and night images based on statistical learning and regional covariance.
Background
Target detection aims to find all objects of interest in an image and comprises two subtasks, object localization and object classification, so that the category and the position of each object are determined simultaneously. Target detection is a hot direction in computer vision and image processing; it is widely applied in fields such as robot navigation, intelligent video monitoring and industrial inspection, and by replacing manual inspection with computer vision it reduces labor costs, which has important practical significance. Target detection has therefore become a research hotspot in both theory and application in recent years, an important branch of image processing and computer vision, and a core part of intelligent monitoring systems. It is also a basic algorithm in the field of general identity recognition, playing an important role in subsequent tasks such as face recognition, gait recognition, crowd counting and instance segmentation.
With the wide application of deep learning, target detection algorithms have developed rapidly. Since 2006, a large number of deep neural networks have been published under the guidance of Hinton, Bengio, LeCun and others. In 2012 in particular, Hinton's group participated in the ImageNet image recognition competition for the first time and won the championship with the CNN AlexNet, after which neural networks received extensive attention. Deep learning uses multi-layer computational models to learn abstract data representations, so that complex structures in big data can be discovered; the technology has now been successfully applied to many pattern classification problems, including the field of computer vision.
Computer-vision analysis of target motion can be roughly divided into three levels: motion segmentation and target detection; target tracking; and action recognition and behavior description. Target detection is one of the basic tasks to be solved in computer vision and a fundamental task of video monitoring technology. Targets in video have varying postures, are often occluded, and move irregularly; taking into account the depth of field, resolution, weather, illumination and scene diversity of monitoring video, the result of the target detection algorithm directly affects subsequent tracking, action recognition and behavior description. Even with today's technology, target detection remains a very challenging task, with great potential and room for improvement.
At present, deep-learning-based target detection and identification methods applied to ships perform well in daytime scenes; in nighttime scenes, however, the illumination, contrast and signal-to-noise ratio of the images differ greatly from daytime, and detection and identification performance drops sharply at night. To detect the position of a ship in around-the-clock video monitoring and automatically identify the type of the target ship, the key is to extract the image features of the ship. In practical application, however, characteristics such as signal-to-noise ratio and contrast vary greatly between images from different day and night periods, which poses a great challenge to ship image feature extraction.
Currently, mainstream target detection algorithms based on deep learning models can be divided into two categories: one-stage and two-stage. One-stage detection algorithms need no region proposal stage; they directly produce the class probability and position coordinates of an object and are relatively fast. Two-stage detection algorithms divide the detection problem into two stages: they first generate candidate regions (region proposals), then classify the candidates and refine their positions; they are more accurate than one-stage methods but slower.
Disclosure of Invention
In order to realize the detection and identification of a water surface target ship in the whole time period, the invention provides a ship detection and identification method in a day and night image, which comprises the following steps:
s1, detecting the illumination of the ship images at different time intervals by using the light sensing elements, and dividing the ship images into a daytime image and a nighttime image according to different illumination ranges of the ship images;
s2, aiming at the daytime image, firstly detecting all objects appearing in a detection range, and then screening ship objects from the detected objects;
s3, aiming at the night image, firstly detecting a significant target in the night image, and screening out a ship object from the night image;
and S4, acquiring the real-time positions and the affiliated category information of all ships in the current video frame based on the screened ship objects.
Further, step S1 specifically includes:
s11, collecting a large number of scene pictures at different time intervals, and statistically analyzing the image illumination range of each time interval to form an illumination range reference comparison table;
and S12, detecting the illumination of the ship image transmitted by the camera through the light sensing element, and comparing the illumination range with the reference comparison table to judge whether the type of the ship image is a daytime image or a nighttime image.
Further, in step S2, the daytime image is processed by the deep-convolutional-neural-network-based target detection algorithm Faster R-CNN, whose network structure includes two parts, an RPN and Fast R-CNN: the RPN predicts candidate regions of the input image that may contain a target and outputs suggestion boxes that may contain a ship target; Fast R-CNN classifies the candidate regions and refines their bounding boxes.
Further, the deep-convolutional-neural-network-based target detection algorithm Faster R-CNN is trained as follows:
1) initializing the RPN parameters with a pre-trained network model, and fine-tuning them with stochastic gradient descent and back-propagation;
2) initializing the Fast R-CNN detection network parameters with a pre-trained network model, extracting candidate regions with the RPN from step 1, and training the detection network;
3) re-initializing and fine-tuning the RPN parameters with the detection network from step 2;
4) extracting candidate regions with the RPN from step 3 and fine-tuning the detection network parameters;
5) repeating steps 3 and 4 until the maximum number of iterations is reached or the network converges.
Further, step S2 specifically includes:
s21, calculating a convolution characteristic diagram of the daytime image to be detected;
s22, processing the convolution feature graph by using an RPN to obtain a target suggestion frame;
s23, extracting features of each suggestion box by utilizing RoI Pooling;
and S24, classifying by using the extracted features.
Further, in step S3, the nighttime image is processed using a convolutional neural network algorithm guided by regional covariance.
Further, step S3 specifically includes:
s31, extracting low-level features of the night image by taking pixels as units;
s32, constructing area covariance by taking the multi-dimensional feature vector as the basis;
s33, constructing a convolutional neural network model by taking the covariance matrix as a training sample;
s34, calculating the image saliency based on the local and global contrast principle;
and S35, framing a remarkable ship target and acquiring the position of the ship.
Further, the ship detection and identification method of the present invention further comprises:
s5, evaluating the image detection result with the AUC and MAE evaluation indexes, which are calculated respectively as:

AUC = ( Σ_{i∈positive} rank_i − M(M+1)/2 ) / (M × N)

where rank_i is the sequence number of the i-th sample, i.e. its position when the probability scores are sorted from small to large, M and N are the numbers of positive and negative samples respectively, and the summation runs over the sequence numbers of the positive samples only; and

MAE = (1 / (W × H)) Σ_{x=1}^{W} Σ_{y=1}^{H} |S(x, y) − G(x, y)|

where S denotes the saliency map, G denotes the reference map, and W and H denote the pixel width and height of the image, respectively.
The invention has the following beneficial effects:
according to the ship detection and identification method, the images are classified by their illumination intensity, and different processing strategies are applied to the classified daytime and nighttime images. As a result, most ships can be detected even in nighttime images of poor quality, and ships can still be detected when their scale changes. The method thus realizes detection and identification of ship targets in all-day scenes and has good robustness.
Drawings
Fig. 1 is a basic flow diagram of an embodiment of the ship detection and identification method of the present invention.
FIG. 2 is an example of day and night ship images in an embodiment of the invention.
FIG. 3 is a flowchart of the Faster R-CNN target detection algorithm used in the embodiments of the present invention.
Fig. 4 is a diagram of an implementation effect obtained by a ship model simulation real ship motion test on a lake in a campus according to the embodiment of the ship detection and identification method of the present invention, wherein: a is a far ship image, b is a near ship image, c is a multi-obstacle ship image, and d is a ship scale transformation image.
FIG. 5 is a block diagram of a convolutional neural network based on regional covariance steering used in an embodiment of the present invention.
FIG. 6 is a diagram of an implementation effect obtained by a ship model simulation real ship motion test of a lake in a campus at night by using the ship detection and identification method of the present invention.
Detailed Description
For a further understanding of the invention, reference will now be made to the preferred embodiments of the invention by way of example, and it is to be understood that the description is intended to further illustrate features and advantages of the invention, and not to limit the scope of the claims.
The embodiment of the invention provides a ship detection and identification method in a day and night image based on statistical learning and regional covariance. As shown in fig. 1, the process includes:
1. first, a light sensing element detects the video frame image acquired by the electro-optical pan-tilt head, realizing day and night image classification;
2. for the daytime image and the nighttime image, the size and position of the ship are detected using Faster R-CNN and a region-covariance-guided CNN respectively; the specific processes are shown in FIG. 3 and FIG. 5;
3. the detected ships are screened to determine their types and positions.
In step 1, an example of the day and night ship images is shown in fig. 2: the first image is a daytime image and the second a nighttime image. Day and night image classification with the light sensing element proceeds as follows: first, a large number of scene pictures from different time periods are collected, and the illumination range of each period is analyzed statistically; the detailed parameters are shown in Table 1, with daytime illumination ranges on the left and nighttime ranges on the right. An ordinary light sensing element then measures the illumination of the image transmitted by the camera, and the image type (daytime or nighttime) is judged by comparing the measurement against the reference values in the table.
TABLE 1 Reference illumination values under various natural conditions

| Natural condition | Illuminance (lx) | Natural condition | Illuminance (lx) |
| --- | --- | --- | --- |
| Direct sunlight | (1~1.3)×10^5 | Deep dusk | 1 |
| Full daylight | (1~2)×10^4 | Full moon | 10^-1 |
| Overcast day | 10^3 | Half moon | 10^-2 |
| Very dark daytime | 10^2 | Starlight | 10^-3 |
| Dusk (dawn) | 10 | Overcast night | 10^-4 |
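The day/night decision amounts to a threshold lookup against Table 1. A minimal sketch follows; the 10 lx boundary and the function name are illustrative assumptions (the table places dusk/dawn at about 10 lx, between the daytime and nighttime ranges), not values fixed by the patent:

```python
# Classify a frame as "day" or "night" from a light-sensor reading (lx).
# The 10 lx boundary is an illustrative assumption drawn from Table 1,
# where dusk/dawn (~10 lx) separates daytime from nighttime conditions.
DAY_NIGHT_THRESHOLD_LX = 10.0

def classify_frame(illuminance_lx: float) -> str:
    """Return 'day' or 'night' for the given illuminance measurement."""
    return "day" if illuminance_lx >= DAY_NIGHT_THRESHOLD_LX else "night"
```

In a deployment, the threshold would instead be read from the statistically derived reference comparison table of step S11.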
In step 2, the main steps of the Faster R-CNN (Faster Region-based Convolutional Neural Network) target detection algorithm for ship detection are as follows:
1) calculating the convolution feature map of the ship image;
2) processing the convolution feature map with the RPN (Region Proposal Network) to obtain target suggestion boxes;
3) extracting features of each suggestion box with RoI Pooling (Region of Interest Pooling);
4) classifying with the extracted features.
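The RoI Pooling of step 3 divides each suggestion box into a fixed grid and max-pools each cell, so boxes of different sizes yield fixed-length features. A minimal NumPy sketch, assuming a single-channel feature map and a 2×2 output grid for illustration (Faster R-CNN itself uses multi-channel maps and a larger grid):

```python
import numpy as np

def roi_pool(feature_map: np.ndarray, roi, output_size=(2, 2)) -> np.ndarray:
    """Max-pool the region roi = (x1, y1, x2, y2) of a 2-D feature map
    into a fixed output_size grid, as in RoI Pooling."""
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    out_h, out_w = output_size
    h, w = region.shape
    # Split the region into a fixed grid of roughly equal cells.
    y_edges = np.linspace(0, h, out_h + 1).astype(int)
    x_edges = np.linspace(0, w, out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Guard against empty cells when the region is smaller than the grid.
            cell = region[y_edges[i]:max(y_edges[i + 1], y_edges[i] + 1),
                          x_edges[j]:max(x_edges[j + 1], x_edges[j] + 1)]
            pooled[i, j] = cell.max()
    return pooled
```

For example, pooling the whole of a 4×4 map `np.arange(16).reshape(4, 4)` into a 2×2 grid yields the maxima of its four quadrants, `[[5, 7], [13, 15]]`.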
For detection and identification of ship targets in daytime scenes, the data sets selected in this embodiment are a data set of ship pictures on the Yangtze River actually photographed from the Nautilus bridge and a ship model data set; before implementation, part of the collected ship images is used as the training set and the remainder as the test set. The ships are divided into five types: passenger ships, cargo ships, beacon ships, warships and sailing ships. The pre-trained model selected in this embodiment is ResNet-50. The RPN is trained end-to-end during the training phase. The initial learning rate of the Faster R-CNN network is 0.0003 with 20000 iterations, and the specific training steps are as follows:
1) initializing the RPN parameters with a pre-trained network model, and fine-tuning them with stochastic gradient descent and back-propagation;
2) initializing the Fast R-CNN detection network parameters with a pre-trained network model, extracting candidate regions with the RPN from step 1, and training the detection network;
3) re-initializing and fine-tuning the RPN parameters with the detection network from step 2;
4) extracting candidate regions with the RPN from step 3 and fine-tuning the detection network parameters;
5) repeating steps 3 and 4 until the maximum number of iterations is reached or the network converges.
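The "stochastic gradient descent and back-propagation" used in each fine-tuning step can be illustrated on a toy classifier; everything here (the synthetic data, the sigmoid model, the 0.1 learning rate) is an illustrative stand-in, not the patent's actual network (the embodiment fine-tunes Faster R-CNN with a 0.0003 learning rate):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly-separable data standing in for candidate-region features.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)   # model parameters being fine-tuned
b = 0.0
lr = 0.1          # illustrative learning rate

for epoch in range(50):
    for i in rng.permutation(len(X)):    # stochastic: one sample at a time
        z = X[i] @ w + b
        p = 1.0 / (1.0 + np.exp(-z))     # sigmoid classification output
        grad = p - y[i]                  # dLoss/dz for cross-entropy loss
        w -= lr * grad * X[i]            # back-propagated parameter update
        b -= lr * grad

# Training accuracy after the SGD loop.
accuracy = np.mean(((X @ w + b) > 0) == (y > 0.5))
```

The alternating schedule above applies this same update mechanism, but switches which sub-network (RPN or detection head) supplies the gradients at each stage.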
The model performance was verified on the test set, and the resulting missed-detection and false-alarm rates are shown in Table 2.
TABLE 2 Faster R-CNN missed-detection and false-alarm rates

| Index | Cargo ship | Passenger ship | Beacon ship | Warship | Sailing boat |
| --- | --- | --- | --- | --- | --- |
| Missed-detection rate | 0.221 | 0.117 | 0.667 | 0.212 | 0.006 |
| False-alarm rate | 0.051 | 0.072 | 0.015 | 0.103 | 0.077 |
In the experimental results shown in fig. 4, the multi-ship detection result at the lower left, fig. 4(c), is described as follows: the resolution of the picture is 233 × 151, the white foam is an interfering water-surface obstacle, and the parameter values produced by the algorithm are given in Table 3 below. The other three pictures are similar to fig. 4(c).
TABLE 3 Ship position and type information

| Ship number | Coordinates (x, y) | Width × Height | Type | Confidence |
| --- | --- | --- | --- | --- |
| Boat 1 (left 1) | (25, 101) | 21 × 12 | Speedboat | 84% |
| Boat 2 (left 2) | (47, 82) | 21 × 13 | Speedboat | 76% |
| Boat 3 (left 3) | (121, 29) | 28 × 26 | Pump ship | 99% |
| Boat 4 (left 4) | (210, 77) | 12 × 12 | Speedboat | 89% |
In step 2, for nighttime video frame images, to address the problem of unbalanced training samples caused by limited visual information, the embodiment of the invention provides a convolutional neural network algorithm guided by regional covariance, used to detect salient targets in the night image. Salient object detection simulates the human visual attention mechanism and takes the region most interesting to the human eye as the detection object; a ship sailing on a water surface with a simple background is a salient object, and after detection the position of the ship is obtained by regressing the bounding box of the salient ship target. As shown in fig. 5, the algorithm mainly comprises the following steps:
1) extracting low-level features of the image by taking a pixel as a unit;
2) constructing a region covariance based on the multi-dimensional feature vector;
3) constructing a convolutional neural network model by taking the covariance matrix as a training sample;
4) calculating the image significance based on the local and global contrast principles;
5) and (5) framing out a remarkable ship target and acquiring the position of the ship.
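Steps 1 and 2 build, for each region, a covariance matrix over per-pixel low-level feature vectors. A minimal sketch using pixel coordinates, intensity and gradient magnitudes as the feature vector — this particular 5-dimensional feature choice is an illustrative assumption, as the patent does not enumerate the exact low-level features:

```python
import numpy as np

def region_covariance(image: np.ndarray) -> np.ndarray:
    """Covariance descriptor of a grayscale region: each pixel contributes
    a feature vector (x, y, intensity, |dI/dx|, |dI/dy|)."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = np.gradient(image.astype(float))
    # One column per pixel, one row per feature dimension.
    feats = np.stack([xs.ravel().astype(float),
                      ys.ravel().astype(float),
                      image.ravel().astype(float),
                      np.abs(dx).ravel(),
                      np.abs(dy).ravel()])
    return np.cov(feats)   # 5x5 symmetric covariance matrix
```

Each region thus yields a fixed-size symmetric matrix regardless of region size, which is what makes the covariance matrices usable as uniform training samples for the CNN in step 3.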
Model training for the night scene differs from the day scene, but the training and testing steps can follow the training scheme of the Faster R-CNN target detection algorithm in step 2 and the steps in FIG. 3. The data set used by this module consists of nighttime images of the same location as the day scene. Owing to the particularity of the night scene and the characteristics of the algorithm used, the evaluation criteria chosen for this module are the mainstream AUC and MAE indexes, with the running time per image measured in seconds. The specific index values are shown in Table 4.
The AUC and MAE calculation formulas are respectively:

AUC = ( Σ_{i∈positive} rank_i − M(M+1)/2 ) / (M × N)

where rank_i is the sequence number of the i-th sample, i.e. its position when the probability scores are sorted from small to large, M and N are the numbers of positive and negative samples respectively, and the summation runs over the sequence numbers of the positive samples only; and

MAE = (1 / (W × H)) Σ_{x=1}^{W} Σ_{y=1}^{H} |S(x, y) − G(x, y)|

where S denotes the saliency map, G denotes the reference map, and W and H denote the pixel width and height of the image, respectively.
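Both indexes can be computed directly from their definitions. A sketch (function names are illustrative; the rank computation assumes no tied scores):

```python
import numpy as np

def auc(scores_pos: np.ndarray, scores_neg: np.ndarray) -> float:
    """Rank-based AUC: rank_i is the 1-based position of each score
    when all scores are sorted from small to large (assumes no ties)."""
    all_scores = np.concatenate([scores_pos, scores_neg])
    ranks = np.argsort(np.argsort(all_scores)) + 1   # ascending ranks
    m, n = len(scores_pos), len(scores_neg)
    pos_rank_sum = ranks[:m].sum()                   # positive ranks only
    return float((pos_rank_sum - m * (m + 1) / 2) / (m * n))

def mae(saliency: np.ndarray, reference: np.ndarray) -> float:
    """Mean absolute error between saliency map S and reference map G."""
    w, h = saliency.shape
    return float(np.abs(saliency - reference).sum() / (w * h))
```

When every positive score exceeds every negative score the positive samples occupy the top M ranks, and the formula gives AUC = 1.0, matching the intuition that larger values indicate a better detector.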
TABLE 4 Evaluation indexes of the night ship detection algorithm

| Index | MAE | AUC | Time (s) |
| --- | --- | --- | --- |
| Value | 0.1329 | 0.8546 | 1.553 |
As can be seen from Table 4, in the nighttime period the lack of ship image information means the saliency processing cannot achieve real-time performance, but it still performs ship target detection well. MAE is the mean absolute error, and a smaller value indicates better algorithm performance. AUC is a probability value that intuitively evaluates classifier quality, and a larger value is better.
The results obtained by using ship models to simulate real ship motion on a campus lake at night are shown in fig. 6; the multi-ship result at the lower right is described as follows: the resolution of the picture is 233 × 155, and the ship information output by the algorithm is shown in Table 5. The other three pictures are similar.
TABLE 5 Ship position and type information

| Ship model number | Coordinates (x, y) | Width | Height | Confidence |
| --- | --- | --- | --- | --- |
| Boat 1 (left 1) | (75, 97) | 52 | 50 | 94% |
| Boat 2 (left 2) | (128, 41) | 37 | 25 | 86% |
As can be seen from the detection results, the ship detection model provided by the embodiment of the invention detects most ships even in night images of poor quality, and still detects ships whose scale changes. In conclusion, the method realizes detection and identification of ship targets in all-day scenes and has good robustness.
The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.