CN115439424B - Intelligent detection method for aerial video images of unmanned aerial vehicle - Google Patents
Intelligent detection method for aerial video images of unmanned aerial vehicle
- Publication number
- CN115439424B CN115439424B CN202211010524.XA CN202211010524A CN115439424B CN 115439424 B CN115439424 B CN 115439424B CN 202211010524 A CN202211010524 A CN 202211010524A CN 115439424 B CN115439424 B CN 115439424B
- Authority
- CN
- China
- Prior art keywords
- image
- video
- frame
- change
- detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/40—Image enhancement or restoration using histogram techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Image Processing (AREA)
Abstract
The invention relates to the field of video monitoring and discloses an intelligent detection method for aerial video images of an unmanned aerial vehicle. The invention is realized by the following technical scheme: a pre-training model is created to detect multiple aerial video image objects during video data preprocessing; feature points of the targets to be detected are extracted from the reference video; reference video key frames are stitched frame by frame using A-KAZE feature point matching, image affine transformation, and image fusion, converting the reference video into a reference panorama, which is assembled into a panoramic reference map according to the GIS data associated with each image; an arbitrary frame to be detected is extracted from the detection video; and the change information between the panoramic reference map and the aligned region of the frame to be detected is detected automatically: an initial change map is generated by low-pass filtering and refined by RGB-LBP feature, area, and gray-histogram comparison, yielding a final aerial video change image of higher accuracy.
Description
Technical Field
The invention relates to video monitoring fields such as security, intelligent transportation, and search and rescue, and in particular to an intelligent detection method for aerial video images of low-altitude small unmanned aerial vehicles.
Background
Unmanned aerial vehicles (UAVs) fly at low altitude, are light in weight, and can carry compact digital cameras and high-resolution imagers, capturing high-quality images for many kinds of analysis; they are therefore becoming increasingly popular in many fields. Aerial video from a UAV conveniently provides both static and dynamic information for grasping the situation of a scene. Low-speed UAVs serve aerial photography, agriculture, and entertainment, are applied in agriculture, construction, public safety, and security, and are being adopted rapidly elsewhere. With the popularization of UAVs, the frequent change of surface morphology, the need for real-time mapping, and the urgent demand of many industries for high-resolution remote sensing images, low-altitude UAV remote sensing has developed rapidly. It offers low operating cost, flexibility, the ability to fly below cloud cover, and high-resolution imagery, making it a powerful supplement to satellite remote sensing and manned-aircraft aerial remote sensing; it performs well in acquiring small-area high-resolution images, providing emergency mapping support for disasters, and imaging areas with difficult terrain. Low-altitude UAV aerial photogrammetry is a new aerial remote sensing technology developed after satellite remote sensing and large-aircraft remote sensing. In UAV aerial photography, image quality is critical, and many factors affect it, so a variety of aerial photography methods must be mastered to obtain better images. As UAV-related technology matures, UAV-based low-altitude aerial photography (low in cost, flexible in take-off and landing, little affected by weather, and able to work under cloud) has gradually become an important supplement to traditional aerial photography and satellite remote sensing for acquiring image information, and is widely applied to major national natural-disaster emergencies, geographic condition monitoring, land management, urban construction planning, and similar fields. In particular, applications such as target tracking, target searching, and ground-object state monitoring over a specific area usually require frequent, continuous, and accurate low-altitude monitoring; small-UAV video aerial photography can complete such monitoring video acquisition effectively thanks to its low shooting cost, low-altitude capability, and high imaging resolution for small targets.
A low-altitude UAV photogrammetry system generally consists of a ground system, a flight control system, a data processing system, and an aerial photographing system. The ground control system plans the target route at start-up; during flight, the ground station's control software displays the flight path, map, flight parameters, and so on, and the control system is used to adjust the planned route and waypoints. The flight control system uses GPS positioning signals to determine the UAV's speed, position, and altitude so that it flies along the preset route. The data processing system uses image processing software to stitch the aerial photographs, define coordinate information, and produce a digital terrain model. The aerial photographing system can, according to actual requirements, carry digital aerial imaging systems such as a digital aerial camera or a single-lens reflex digital camera, meeting aerial photography requirements of various precisions and types. However, for intelligent analysis of video collected by low-altitude UAVs, and in particular for extracting change information of ground features or phenomena in a monitored area by change detection, an effective method is currently lacking. In recent years the demand for remote sensing image data has grown in every field; satellite image collection often cannot meet it because of altitude, resolution, weather conditions, and revisit period, while conventional aerial photography is limited by airspace and weather, is hard to qualify for emergency tasks, and is costly. Several challenges must be overcome when automatically analyzing drone images. Top-down views and small object extent: current computer vision algorithms and datasets are designed and evaluated on photographs of close-up objects taken horizontally and artificially centered. In vertically photographed drone images the objects of interest are relatively small and carry fewer features, appearing mainly as planes and rectangles; for example, a building imaged from a UAV shows only its roof, whereas a ground-level image of the same building shows doors, windows, and walls. Even when a large number of images can be obtained, they still need to be labeled. This is manual annotation and interpretation work requiring accuracy and precision, since "inputting garbage means outputting garbage"; apart from manual completion there is no magical way to solve the labeling problem, and in any supervised machine learning process labeling images may be the most difficult and time-consuming step. UAV images are large, with resolutions in most cases exceeding 3000 px × 3000 px, which increases the computational complexity of processing them. Object overlap: when an image is tiled, the same object may appear in two different tiles, leading to duplicate detections and counting errors. Finally, the extraction of edge features is highly susceptible to noise.
Because a UAV in flight is affected by air currents and wind direction, its attitude angle and heading drift, so the rotation angle and overlap of the photographs are not stable enough; and since cost-effective non-metric cameras are used, the photographs exhibit nonlinear optical distortion at the edges (such as barrel or pincushion distortion), making post-processing difficult. Detection methods that rely heavily on the rationality of hand-designed features show poor robustness. In addition, objects close to one another may have overlapping bounding boxes during detection, or may occlude each other. Traditional image change detection must mainly address, in video change detection, the large volume of video data, the small field of view of a single video frame, the high overlap between coverage areas of adjacent frames, the complexity of small-scale change information, and complex backgrounds. Images are rich in information, and processing them implies a large amount of computation; the smallest objects are the hardest to detect because of their low resolution. Video consists of a series of spatially and temporally continuous frames, each essentially a still image; although still-image change detection has been studied intensively, real-time and accurate video image change detection methods are rare. The core problem of image change detection is to find ground-object or phenomenon change information from the difference between two images taken over the same region of interest at different moments (the earlier shot is defined as the reference image and the later shot as the current detection image); it is an effective means of intelligent analysis of both static images and dynamic video. During a reconnaissance task over an area, the ground background in the video image is often complex, the appearance of the target area varies over time, and targets are irregularly distributed in the image, all of which makes change-region detection difficult. Detection and classification of video image change regions is the technical process of determining changes of ground-object state from UAV video scene images covering the same area over multiple periods; it involves the change type, distribution, and change information, the ground-object categories and boundaries before and after the change, and analysis of the change attributes, whether comparing video taken at two moments within one aerial reconnaissance task or video from two different reconnaissance tasks.
The most typical application field of video image change-region detection is satellite remote sensing image analysis, which processes multi-temporal remote sensing images of the same surface area, together with auxiliary data, to identify and analyze state changes of targets or phenomena across time periods, determine changes of ground objects or phenomena within a time interval, and provide quantitative and qualitative analysis of their spatial distribution and change. Because the UAV's shooting times, shooting angles, and weather conditions differ between flights, the video color, brightness, and pixel distribution of the target area differ greatly; the image features of multi-source images therefore differ considerably, and a single feature can hardly register a visible-light image and an infrared image in the same coordinate system. A conventional affine or homography model cannot accurately fit the transformation over the whole image field, leaving many local registration errors in the result, so that the extracted change information contains excessive interference and the variability of similar targets cannot be distinguished accurately. According to the three levels of image data processing, change detection commonly used in remote sensing image analysis can be divided into pixel-level, feature-level, and target-level change detection. Pixel-level change detection compares gray or color (RGB) pixel values across phases at each position, on the basis of image registration/alignment, to judge whether a change has occurred and thereby detect change regions. It is easily affected by registration, radiometric correction, and similar factors, but it best preserves the original detail of the image, so it is currently the mainstream approach. Feature-level change detection first determines the object of interest and extracts its features (such as edges, shapes, contours, and textures), then compares and analyzes those features to obtain the object's change information. Because it associates features, it judges feature attributes with higher reliability and accuracy, but since it does not operate on the raw data, information is inevitably lost during feature extraction and fine change information is hard to provide. Target-level change detection detects change information of specific objects (such as roads and houses) on the basis of image understanding and recognition; as a high-level, model-based analysis method its greatest advantage is responding well to user needs, but target extraction is difficult. Applying deep neural networks in feature-level change detection places high demands on the floating-point computing capability of the platform, and the limited computing capability of onboard platforms constrains the detection effect to a great extent.
Disclosure of Invention
The invention aims to solve the performance and speed problems of UAV video monitoring in prior-art methods for detecting changes in low-altitude UAV aerial video images, and provides an intelligent UAV aerial video image detection method that is fast, accurate, and real-time.
The invention is realized by the following technical scheme: an intelligent detection method for aerial video images of an unmanned aerial vehicle comprises the following steps:
the visual receiving system defines coordinate information during video data acquisition and acquires the low-altitude aerial video image of the target area of the unmanned aerial vehicle;
the ground station, based on the sample-parameter-transfer online detection model, transmits the received multi-temporal aerial reference video to the flight control computer through wireless data transmission;
the flight control computer's image processor performs fuzzy linear discriminant data analysis on the visible-light and infrared reconnaissance images of the detection video sample feature set; a pre-training model is created to detect multiple aerial video image objects for unmanned aerial vehicle video data preprocessing; pixels in each connected domain of the preprocessed aerial video image are counted, optical flow is taken as the instantaneous motion field of gray pixel points on the image, and video key frames are extracted with Global Positioning System (GPS) interpolation of the key frames;
the pre-training model extracts feature points of the targets to be detected in the reference video and the overlapping image information between adjacent key frames; optical-flow points are selected at equal intervals in the target-area image and their motion vectors are calculated and classified; images of the same scene taken from different angles are compared and matched, and the angles between objects in each image are measured; reference video key frames are stitched frame by frame using A-KAZE feature point matching, image affine transformation, and image fusion, joining the frames in sequence into a panorama; after stitching and fusion are complete, the aerial shooting position and yaw are calculated from the correspondence between the single-channel image coordinate system and the world coordinate system, and a panoramic reference map covering the whole scene is generated by stitching according to the GIS data associated with each image. In this panorama generation process, key frames are extracted from the reference video by component histograms, A-KAZE feature points are extracted in the overlap area of adjacent key frames, feature points are matched and the image transformation matrix is computed by the feature matching algorithm, and each extracted key frame is transformed into the panoramic reference map space based on that matrix, thereby converting the reference video into the reference panorama;
in the registration of the detection video with the panoramic reference map, based on the sample-parameter-transfer online detection model, detection video frames are registered to the panoramic reference map: an arbitrary detection frame is extracted from the detection video; the region to be detected is selected manually or the frame of interest in the detection video is found by an automatic extraction method; coarse positioning of the detection frame within the reference panorama is achieved quickly using the GPS information of the frame to be detected and of the full reference map; the tracking window size is adjusted automatically based on the coarse positioning result; the extracted visible-light image and the multi-source and homologous images of the infrared change region are registered and fused in the same coordinate system; and precise registration of the detection frame to the panoramic reference map is performed using GPS initial positioning together with image registration based on A-KAZE feature point matching, achieving fast and accurate alignment of the frame to be detected with the panoramic reference map;
in the image change detection process, the online detection model monitors the aerial video image automatically and detects image change regions: the registered images are denoised and histogram-equalized to remove the influence of noise, illumination, and irrelevant changes; an initial change map is generated by low-pass filtering, which effectively removes the influence of parallax and registration error; the MeanShift algorithm is iterated, moving the center of the search window to the position of the iteration maximum and adjusting the window size; the sliding window is upsampled to search for small, dense objects; and based on image morphological processing, RGB-LBP (local binary pattern) feature comparison, and area and gray-histogram comparison, the changed positions in the change map are verified with deep learning software, the initial change map is corrected, a changed-target-region data set is constructed, the deep network weights are trained, and a final, more accurate aerial video change image is generated.
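The MeanShift window-adaptation step just described can be pictured with OpenCV's CamShift, which runs the MeanShift iterations and additionally adapts the window size. The following is a minimal sketch, assuming a change-probability map is available as the tracking input; it is an illustration, not code from the patent.

```python
import cv2
import numpy as np

def adapt_change_window(prob_map, init_window, max_iter=10, eps=1.0):
    """Shift a search window toward the density maximum of a change-probability
    map. CamShift runs MeanShift iterations and additionally adapts the
    window size, matching the behaviour described above.
    prob_map: single-channel likelihood map; init_window: (x, y, w, h)."""
    prob8 = cv2.normalize(prob_map, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, max_iter, eps)
    rot_rect, window = cv2.CamShift(prob8, init_window, criteria)
    return window, rot_rect  # refined (x, y, w, h) and oriented bounding box
```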
In order to better implement the invention, further, the method for preprocessing the unmanned aerial vehicle video data comprises the following steps:
shooting sensor calibration, video key frame extraction and GPS interpolation, and generating a panoramic reference picture by reference video stitching;
extracting a key frame from a reference video, performing image matching based on a key frame adjacent relation, screening out a position coordinate set of an aerial video image connected domain, and mapping the position coordinate set to a standard coordinate space to generate a panoramic image required by change detection;
registration or alignment of the detection aerial video frame with the panoramic reference map;
for video change detection, firstly, a frame to be processed in a detected video is found through a manual selection or automatic extraction method;
then, the same coverage area is found in the panoramic reference map and the image to be detected and the two images are registered: coarse positioning of the detection frame in the panoramic reference map is achieved quickly using GPS information; fusion of the infrared-image and visible-light-image target change areas is completed with an adaptive-weight target-area fusion algorithm; the detection frame is precisely registered to the reference map by image feature point matching; and once precise registration of the detection video frame with the reference video panorama is achieved, a change image is generated by low-pass filtering, removing the influence of parallax and registration errors;
Thirdly, calculating the change information of each aerial video frame position by using an RGB-LBP characteristic comparison method;
and finally, the change information at each position is verified using morphological operations and area and gray-histogram comparison, and the final change image is output, as shown in the sketch after this list.
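To make the comparison stage concrete, the following is a minimal, self-contained sketch for two already-registered frames, assuming OpenCV; the thresholds, kernel sizes, and the erode/dilate trick used to approximate the low-pass neighborhood difference are illustrative choices, not values from the patent.

```python
import cv2
import numpy as np

def change_map(ref_bgr, test_bgr, n=9, delta_diff=25, min_area=64):
    """Denoise, equalize, low-pass difference, then morphology and area
    filtering on two pre-registered frames."""
    to_gray = lambda im: cv2.equalizeHist(
        cv2.cvtColor(cv2.GaussianBlur(im, (5, 5), 0), cv2.COLOR_BGR2GRAY))
    ref, test = to_gray(ref_bgr).astype(np.int16), to_gray(test_bgr)
    # Neighborhood min/max of the test image give a conservative lower bound
    # on the pixel-to-neighborhood difference, so small parallax and residual
    # registration shifts produce zero difference.
    k = cv2.getStructuringElement(cv2.MORPH_RECT, (n, n))
    lo = cv2.erode(test, k).astype(np.int16)
    hi = cv2.dilate(test, k).astype(np.int16)
    diff = np.clip(np.maximum(ref - hi, lo - ref), 0, 255).astype(np.uint8)
    changed = (diff > delta_diff).astype(np.uint8) * 255
    changed = cv2.morphologyEx(changed, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    # Drop change regions smaller than the minimum plausible target area.
    n_lbl, labels, stats, _ = cv2.connectedComponentsWithStats(changed)
    for i in range(1, n_lbl):
        if stats[i, cv2.CC_STAT_AREA] < min_area:
            changed[labels == i] = 0
    return changed
```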
In order to better implement the present invention, the method for verifying the change position in the change map by using the deep learning software further includes:
the deep learning software, using manually set tie points (homonymous points), performs block aerial triangulation on the acquired calibration-field data and high-precision control-point data with a bundle block adjustment model based on least-squares adjustment theory, and solves the required geometric calibration parameters of the shooting sensor: the orientation elements of the photograph, the radial distortion coefficients, the tangential distortion coefficients, the charge-coupled device (CCD) non-square pixel scaling coefficient, and the CCD non-orthogonality distortion coefficient.
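Full bundle block adjustment is beyond a short example, but OpenCV's calibrateCamera solves the same family of unknowns (interior orientation plus radial and tangential distortion) by least squares from control-point correspondences; the sketch below is offered as a hedged analogue of this calibration step, not the patent's adjustment model.

```python
import cv2

def calibrate_sensor(object_points, image_points, image_size):
    """Analogue of the sensor calibration step: cv2.calibrateCamera solves
    interior orientation (fx, fy, cx, cy) plus radial (k1, k2, k3) and
    tangential (p1, p2) distortion by least squares.
    object_points/image_points: lists of matched 3D control points and their
    2D photo measurements, one array pair per calibration photo."""
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        object_points, image_points, image_size, None, None)
    return rms, K, dist  # reprojection error, intrinsics, distortion coeffs
```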
In order to better implement the invention, further, the computing method for video key frame extraction and GPS interpolation of the key frames comprises the following steps:
in video key frame extraction and GPS interpolation, based on the unmanned aerial vehicle track data, a formula for automatically extracting the key frame time interval t under a given overlap degree can be derived as follows:
wherein the formula for the lower frame-footprint width Xn is:
the formula for the upper frame-footprint width Xf is:
to guarantee the overlap in the x direction, the formula for the overlap degree Dx of the shooting sensor in the x direction after t seconds is:
the formula for the overlap degree Dy in the y direction is:
and the formula for the frame-footprint height Y is: Y = H·[cot(tan⁻¹(2h/f) + θ) + cot(tan⁻¹(2h/f) − θ)];
in the GPS interpolation of video key frames, the position information corresponding to each selected key frame is recorded, provided by the GPS navigator carried by the unmanned aerial vehicle; if the GPS information is discontinuous, GPS interpolation is performed by Newton's interpolation method so that the GPS records correspond one-to-one with the extracted key frames,
where H is the flight altitude of the unmanned aerial vehicle at a given moment t, v is its speed, ω is the width of the shooting sensor, h is its height, f is the focal length, θ is the angle between the oblique camera and the horizontal plane, and n is the number of sub-areas into which the corresponding area is divided.
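For the Newton interpolation of missing GPS records mentioned above, a small divided-difference implementation might look like the following; the timestamps and coordinates in the usage line are made-up illustrations.

```python
import numpy as np

def newton_interpolate(ts, values, t_query):
    """Newton divided-difference interpolation for filling gaps in key-frame
    GPS logs; run once per coordinate (latitude, longitude, altitude)."""
    ts = np.asarray(ts, dtype=float)
    coef = np.array(values, dtype=float)
    for j in range(1, len(ts)):           # build divided differences in place
        coef[j:] = (coef[j:] - coef[j - 1:-1]) / (ts[j:] - ts[:-j])
    result = coef[-1]
    for j in range(len(ts) - 2, -1, -1):  # Horner evaluation of the Newton form
        result = result * (t_query - ts[j]) + coef[j]
    return result

# Illustrative only: latitude of a key frame at t = 3 s with a missing GPS fix.
lat = newton_interpolate([0.0, 1.0, 2.0, 4.0],
                         [30.5101, 30.5103, 30.5106, 30.5111], 3.0)
```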
In order to better implement the present invention, further, the process of stitching the reference video images to generate the panoramic reference map includes:
the panoramic image generation by reference video stitching comprises extracting A-KAZE characteristic points, matching the characteristic points, stitching images and generating a panoramic image;
in the extraction of the A-KAZE characteristic points, extracting A-KAZE image characteristics from two overlapped adjacent key frames respectively;
Firstly, constructing an image pyramid by using nonlinear diffusion filtering and a fast explicit diffusion FED algorithm;
secondly, searching the extremum of the scale-normalized Hessian matrix determinant in a 3×3 neighborhood of the nonlinear scale space to obtain the image feature point coordinates;
thirdly, determining the main direction of the feature point based on first-order differential values of all adjacent points of the feature point circular area;
finally, the neighborhood image of each feature point is rotated to its main direction, and the modified local difference binary descriptor (M-LDB) is adopted to generate the image feature vector;
in the matching of the A-KAZE feature points, the feature points extracted from two overlapped key frames are matched;
firstly, defining the similarity between two A-KAZE feature descriptors by utilizing the Hamming distance;
then, searching initial matching points of the feature points by using a bidirectional k nearest neighbor classification KNN algorithm;
finally, screening the matching point pairs by adopting a random sampling consensus algorithm RANSAC to remove the mismatching pairs;
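Taken together, the three matching steps map naturally onto OpenCV's A-KAZE implementation. The sketch below uses Hamming-distance brute-force matching, a mutual (bidirectional) k-NN check, and RANSAC over an affine model; the 0.8 ratio test is an assumption, not a value from the patent.

```python
import cv2
import numpy as np

def match_akaze(img1, img2, ratio=0.8):
    """A-KAZE keypoints with M-LDB binary descriptors, Hamming-distance
    brute-force matching, mutual (bidirectional) k-NN filtering, and RANSAC
    over an affine model to reject remaining mismatches."""
    akaze = cv2.AKAZE_create()
    k1, d1 = akaze.detectAndCompute(img1, None)
    k2, d2 = akaze.detectAndCompute(img2, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    fwd = [p[0] for p in bf.knnMatch(d1, d2, k=2)
           if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    bwd = {(p[0].trainIdx, p[0].queryIdx) for p in bf.knnMatch(d2, d1, k=2)
           if len(p) == 2 and p[0].distance < ratio * p[1].distance}
    mutual = [m for m in fwd if (m.queryIdx, m.trainIdx) in bwd]
    src = np.float32([k1[m.queryIdx].pt for m in mutual])
    dst = np.float32([k2[m.trainIdx].pt for m in mutual])
    M, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)
    keep = inliers.ravel().astype(bool)
    return M, src[keep], dst[keep]  # 2x3 affine and the inlier point pairs
```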
when stitching the images and generating the panorama, the received reference video is preprocessed, video key frames are extracted based on random sampling, and the GPS information of each key frame is recorded; the reference video key frame set is {f1, f2, ..., fK}, where K is the total number of key frames extracted from the reference video and k is the current key frame index; key frame f1 is set as the panorama space, and video stitching transforms key frames f2 through fK to the panorama space one by one.
In order to better implement the present invention, further, the process of stitching the reference video images to generate the panoramic reference map further includes:
when stitching the images and generating the panorama, the received reference video is preprocessed, video key frames are extracted based on random sampling, and the GPS information of each key frame is recorded; the reference video key frame set is {f1, f2, ..., fK}; key frame f1 is set as the panorama space, and key frames f2 through fK are transformed to the panorama space one by one; an affine transformation model M, which accommodates translation, rotation, and scaling, is selected as the image coordinate transformation matrix, and the image coordinate transformation is expressed as:
x′ = m0·x + m1·y + m2, y′ = m3·x + m4·y + m5
where K is the total number of key frames extracted from the reference video, k is the current key frame index, (x, y) and (x′, y′) respectively denote the coordinates of a pixel in the panorama and in the image to be stitched, and m0 to m5 are the affine transformation parameters.
In order to better implement the present invention, further, the process of splicing the key frames includes:
first, all pixel nodes are divided into target-change regions and non-change regions, and the visible-light-image and infrared-image target change areas are extracted. For the key frame f2 to be stitched, the A-KAZE feature points of the overlap area of key frames f1 and f2 are extracted, the matching point sets match1 and match2 containing more than 3 pairs of matching points are computed, and the image transformation matrix M1,2 from frame f2 to the panorama space of frame f1 is obtained by least squares. Then, for each k greater than 2, the key frame fk to be stitched is processed in turn: the A-KAZE feature points of the overlap area of key frames fk-1 and fk are extracted and the matching point sets matchk-1 and matchk of the overlap area of frames k−1 and k are computed; the matching point set matchk-1 of key frame fk-1 is projected into the panorama space by the transformation matrix M1,k-1, and the transformation matrix M1,k from frame fk to the panorama space is then obtained by least squares from matchk and the projected points. Finally, key frame fk is transformed into the panorama image space using the image transformation matrix M1,k with bilinear interpolation, and stitching is completed with image fusion technology to produce the final panorama.
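A compact way to picture the chained stitching is to accumulate the pairwise affine transforms into the panorama (frame f1) space and warp each key frame with bilinear interpolation. This sketch assumes a pairwise estimator such as the A-KAZE matcher above, and replaces the patent's image-fusion step with naive overwriting:

```python
import cv2
import numpy as np

def to3x3(m2x3):
    return np.vstack([m2x3, [0.0, 0.0, 1.0]])

def stitch_keyframes(frames, pano_size, estimate_pairwise):
    """Accumulate pairwise affines into the panorama (frame f1) space,
    M_1k = M_1(k-1) @ M_(k-1)k, then warp each key frame with bilinear
    interpolation. estimate_pairwise(a, b) must return the 2x3 affine taking
    frame b's coordinates into frame a's (e.g. match_akaze above)."""
    identity = np.float32([[1, 0, 0], [0, 1, 0]])
    pano = cv2.warpAffine(frames[0], identity, pano_size)
    M_1k = np.eye(3)
    for prev, cur in zip(frames, frames[1:]):
        M_1k = M_1k @ to3x3(estimate_pairwise(prev, cur))
        warped = cv2.warpAffine(cur, M_1k[:2].astype(np.float32), pano_size,
                                flags=cv2.INTER_LINEAR)  # bilinear resampling
        pano = np.where(warped > 0, warped, pano)  # naive overwrite "fusion"
    return pano
```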
To better implement the present invention, further, the method for performing accurate registration of a detection frame with a panoramic reference map includes:
the method comprises GPS-based rapid coarse positioning and A-KAZE-based precise registration. In GPS-based rapid coarse positioning of the detection frame, the received detection video is preprocessed; the image frames on which change detection is to be performed, and their GPS information, are extracted; the detection frame's GPS information is compared with the GPS information of each key frame recorded in the panoramic reference map; and the 4 nearest adjacent key frame areas in the panorama are found and taken as the initial reference map region for change detection. In precise registration based on A-KAZE features, A-KAZE feature points are extracted and matched, and the detection image is transformed into the reference image space, completing precise registration of the detection image with the coarsely positioned reference map region. Then, an image area T and an image area R of the same position and size are extracted from the registered detection image and the panoramic reference map respectively, and targets whose confidence exceeds a preset confidence threshold are output from the detection result, with T and R serving respectively as the test image and reference image input to change detection.
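The GPS coarse positioning reduces to a nearest-neighbour query over the recorded key-frame fixes. A minimal sketch follows, assuming planar (equirectangular) distances are adequate at key-frame spacing scales:

```python
import numpy as np

def coarse_locate(frame_gps, keyframe_gps, k=4):
    """Rank panorama key frames by planar GPS distance to the detection frame
    and return the k nearest; their footprints bound the initial reference
    region. frame_gps: (lat, lon); keyframe_gps: sequence of (lat, lon)."""
    g = np.radians(np.asarray(keyframe_gps, dtype=float))
    p = np.radians(np.asarray(frame_gps, dtype=float))
    dx = (g[:, 1] - p[1]) * np.cos(p[0])  # longitude scaled by latitude
    dy = g[:, 0] - p[0]
    dist = 6371000.0 * np.hypot(dx, dy)   # metres on the Earth sphere
    return np.argsort(dist)[:k]           # indices of the 4 nearest key frames
```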
In order to better implement the present invention, further, in performing the image change detection process, the online detection model proceeds as follows:
Generating the initial change image: first, the reference image R and detection image T are converted from RGB to grayscale, giving the gray reference map Rgray and gray test map Tgray. Then a difference image is generated from the gray values of Rgray and Tgray at each position, and the difference values are used to decide, for each pixel position of the initial change map D, whether a change has occurred; if D(i, j) indicates change, the RGB-LBP features of the detection frame and the reference map at pixel position (i, j) are computed, where N denotes an N×N neighborhood window. Using low-pass filtering over the N×N neighborhood, the difference map DR from the gray reference map Rgray to the gray test map Tgray and the difference map DT from the gray test map Tgray to the gray reference map Rgray are computed as:
DR(i, j) = min{ |Rgray(i, j) − Tgray(i+Δi, j+Δj)| : |Δi|, |Δj| ≤ (N−1)/2 }
DT(i, j) = min{ |Tgray(i, j) − Rgray(i+Δi, j+Δj)| : |Δi|, |Δj| ≤ (N−1)/2 }
Based on the difference maps DR and DT and the division threshold δdiff, the initial change image D is calculated. Pixel positions with value 1 in the initial change image D are then verified by RGB-LBP feature comparison: each position (i, j) with value 1 in D is confirmed by RGB-LBP comparison, and is set as changed only if both detections indicate change. Specifically, 8-bit binary-coded LBP features centered on position (i, j) are computed in each of the 3 color channels of the reference image R and the detection image T, over the 15×15 adjacent points, and the LBP codes are concatenated by position and channel to form the LBP features SR(i, j) and ST(i, j) of the reference image R and test image T at (i, j). In the 3×3 neighborhood centered on each position, starting from the upper-left corner, the 8 neighboring positions are encoded 0/1 in sequence: if the gray value is lower than that of the center position the point is coded 0, otherwise 1. Then the Hamming distance dRT(i, j) between the LBP features SR and ST of the reference map R and test map T at position (i, j) is calculated. Finally, whether pixel position (i, j) has changed is judged from the Hamming distance: if dRT(i, j) satisfies dRT(i, j) > δh × |SR|, position (i, j) is changed and the value of D(i, j) in the initial change image is kept at 1; otherwise (i, j) is unchanged and D(i, j) is changed from 1 to 0. Here 0 means unchanged and 1 means changed; N ∈ {7, 9, 11}; Δi denotes the offset of position coordinate i within the N-neighborhood and Δj the offset of position coordinate j within the N-neighborhood; δdiff ∈ [0, 50] is valued according to the degree of illumination difference of the image; |SR| denotes the length of the binary string; and δh is the decision threshold.
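As a sketch of the RGB-LBP verification, the following computes the concatenated 8-bit LBP strings over a 15×15 neighbourhood in the three colour channels and compares them by normalised Hamming distance; the boundary handling and the threshold value are illustrative assumptions.

```python
import numpy as np

def lbp8(channel, i, j):
    """8-bit LBP code at (i, j): clockwise from the top-left neighbour,
    1 where the neighbour is not darker than the centre pixel."""
    c = channel[i, j]
    nbrs = [channel[i-1, j-1], channel[i-1, j], channel[i-1, j+1],
            channel[i, j+1], channel[i+1, j+1], channel[i+1, j],
            channel[i+1, j-1], channel[i, j-1]]
    return np.array([0 if v < c else 1 for v in nbrs], dtype=np.uint8)

def rgb_lbp_changed(ref_bgr, test_bgr, i, j, half=7, delta_h=0.25):
    """Concatenate LBP codes over a 15x15 neighbourhood (half=7) in the three
    colour channels of both images and compare the strings by normalised
    Hamming distance. Assumes (i, j) lies at least half+1 pixels from the
    border; delta_h is an illustrative threshold, not a patent value."""
    s_ref, s_test = [], []
    for c in range(3):
        for di in range(-half, half + 1):
            for dj in range(-half, half + 1):
                s_ref.append(lbp8(ref_bgr[:, :, c], i + di, j + dj))
                s_test.append(lbp8(test_bgr[:, :, c], i + di, j + dj))
    s_ref, s_test = np.concatenate(s_ref), np.concatenate(s_test)
    hamming = np.count_nonzero(s_ref != s_test)
    return hamming > delta_h * s_ref.size  # True: keep the pixel marked changed
```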
In order to better realize the invention, further, in change image post-processing, to eliminate false alarms effectively, the verified change image must be post-processed. The flow is as follows: first, gray processing, gray correction, and smoothing filtering are performed, step edge points and marking-line edge points are screened, connected false edges are removed, and marking information is extracted; isolated change positions are detected and removed by morphological opening, and the corresponding areas in the verified change map D are set as unchanged. The pixel area of each change region is then calculated; the minimum change-region area δa is determined from the image resolution and the minimum target size, and regions of the verified change image D whose area is smaller than δa are set as unchanged. After computing the RGB-LBP features of the detection frame and the reference map at (i, j), their feature similarity is calculated: if it is greater than the threshold, D(i, j) is kept unchanged in value; if it is smaller than the threshold, D(i, j) is corrected and the region is set as unchanged. In post-processing the change image, for each change region Ap in the verified change image D, its minimum bounding rectangle Bp is found; the image areas corresponding to Bp are extracted from the gray reference map Rgray and the gray test map Tgray, and their gray histogram features HR and HT are computed; the distance between the gray histogram features is calculated as
d(HR, HT) = Σq (HR(q) − H̄R)(HT(q) − H̄T) / sqrt( Σq (HR(q) − H̄R)² · Σq (HT(q) − H̄T)² )
where β is the gray-histogram feature dimension, H̄R and H̄T denote the feature means, and q indexes the q-th dimension of the histogram, q = 1, ..., β; when the distance is less than 0.35, the change region Ap in D is set as unchanged, and otherwise it remains marked as changed.
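The histogram distance above is a mean-centred normalised correlation, which is what OpenCV's HISTCMP_CORREL computes. A hedged sketch of the region check follows; the bin count, and whether the 0.35 rule applies to this correlation directly or to its complement, are assumptions.

```python
import cv2

def region_histogram_similarity(ref_gray, test_gray, rect, bins=64):
    """Compare gray histograms of a change region's bounding rectangle in the
    reference and test images; cv2.HISTCMP_CORREL computes exactly the
    mean-centred normalised correlation written above. rect: (x, y, w, h)."""
    x, y, w, h = rect
    h_ref = cv2.calcHist([ref_gray[y:y+h, x:x+w]], [0], None, [bins], [0, 256])
    h_tst = cv2.calcHist([test_gray[y:y+h, x:x+w]], [0], None, [bins], [0, 256])
    return cv2.compareHist(h_ref, h_tst, cv2.HISTCMP_CORREL)
```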
Compared with the prior art, the invention has the following advantages:
(1) High detection speed. The invention is based on aerial video image change detection: the low-altitude UAV aerial video image is obtained; during video data acquisition the ground station transmits the sample-parameter-based online detection model; a pre-training model is set up to detect multiple aerial video image objects for video data preprocessing; pixels in each connected domain of the preprocessed aerial video image are counted, optical flow is taken as the instantaneous motion field of gray pixels on the image, and video key frames are extracted with GPS interpolation of the key frames. The method is particularly suited to automatically finding ground-object or phenomenon changes between two near-nadir observation videos taken by a low-altitude UAV within a short interval (tens of minutes or hours): the disappearance, appearance, or partial damage of people, vehicles, buildings, public facilities, and the like. The change detection result can be returned directly to the user as image analysis data, or passed to higher-level tasks for semantic analysis such as scene understanding, target detection, and target tracking. By combining video key frame extraction with image stitching, and using a video change detection approach different from traditional frame-by-frame processing, the reference video is converted into a reference panorama; this addresses the small single-frame field of view and the high overlap between coverage areas of adjacent frames, and greatly reduces the number of image frames to be processed in video change detection without losing image information. In registering the detection frame to the panoramic reference map, GPS coarse positioning is combined with A-KAZE feature matching: the position range of the detection frame in the panorama is determined rapidly by the former, and precise registration of the detection frame to the panorama is then achieved; compared with traditional image registration based on Scale-Invariant Feature Transform (SIFT) feature matching, both registration speed and registration accuracy improve markedly;
(2) High detection accuracy. For the characteristics of low-altitude UAV aerial video data, images of the same scene are compared and matched from different angles; after the angles between objects in each image are measured, the frames are joined in sequence into a panorama; after stitching and fusion, the aerial shooting position and yaw are computed from the correspondence between the single-channel image coordinate system and the world coordinate system, and a panoramic reference map covering the whole scene is generated by stitching according to the GIS data associated with each image. During panorama generation, key frames are extracted from the reference video by component histograms, A-KAZE feature points are extracted in the overlap areas of adjacent key frames, the image transformation matrix is computed by feature point matching, and the extracted key frames are transformed into the panoramic reference map space based on that matrix, converting the reference video into the reference panorama. This effectively improves the registration accuracy between the frame to be detected and the reference map (error within 1 pixel) and reduces the influence of registration error on change detection by 90%. Through image denoising, image enhancement, low-pass-filtered difference image generation, change detection based on RGB-LBP feature comparison, and change verification based on morphological processing and gray-histogram feature comparison, a large amount of noise, false detections, and irrelevant change information (such as water ripples and swaying leaves) is removed, improving the accuracy of change detection;
(3) To address the large volume of video data, the small field of view of a single video frame, and the high overlap between coverage areas of adjacent frames, a panoramic reference map generation technique combining video key frame extraction and image stitching is used. During registration of the detection video with the panoramic reference map, the sample-parameter-based online detection model registers detection video frames to the panorama: the frame of interest in the detection video is found by manually selecting the region to be detected or by an automatic extraction method; coarse positioning of the detection frame in the reference panorama is achieved quickly using the GPS information of the frame to be detected and the full reference map; the tracking window size is adjusted automatically based on the coarse positioning result; precise registration of the detection frame to the panoramic reference map is performed with A-KAZE feature point matching, and the weights are updated. The conversion from video change detection to image change detection is thus realized rapidly, without losing video data information. To raise the detection rate of small-scale change information and reduce the influence of complex backgrounds on the results, an initial change image generation method based on low-pass filtering is used, and the change information is verified with two different feature comparison methods. In addition, applying image denoising, image enhancement, and morphological processing in image preprocessing and in post-processing of the detection results also greatly improves detection accuracy and robustness against complex backgrounds;
(4) During image change detection, the method automatically monitors the aerial video image and detects image change regions: the registered images are denoised and histogram-equalized to remove the influence of noise, illumination, and irrelevant changes; an initial change image is generated by low-pass filtering, effectively removing the influence of parallax and registration errors; the MeanShift algorithm is iterated, moving the search window center to the iteration maximum and adjusting the window size; the sliding window is upsampled to search for small, dense objects; and based on image morphological processing, RGB-LBP (RGB Local Binary Pattern) feature comparison, and area and gray-histogram comparison, the changed positions in the change map are verified with deep learning software, the initial change image is corrected, and a final, more accurate aerial video change image is generated. The method overcomes the shortcomings of existing change detection techniques when applied to aerial video data; it is particularly suited to finding, quickly and accurately, state changes of targets such as people, vehicles, buildings, and public facilities in two videos shot from a UAV platform within a certain time interval, and has broad application prospects in scene monitoring, target searching, and related fields.
Drawings
The invention is further described with reference to the following drawings and examples, and all inventive concepts of the invention are to be considered as being disclosed and claimed.
Fig. 1 is a flowchart of an intelligent detection method for aerial video images of an unmanned aerial vehicle.
Fig. 2 is a flowchart of generating a panoramic image with reference to the video in fig. 1.
Fig. 3 is a flow chart of the registration and change detection process between the detection video and the panoramic image in fig. 1.
Detailed Description
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, they are described below completely with reference to the accompanying drawings. It should be understood that the described embodiments are only some, not all, of the embodiments of the invention and should not be taken as limiting the scope of protection; all other embodiments obtained by a person of ordinary skill in the art without creative effort, based on these embodiments, fall within the protection scope of the present invention.
Example 1:
In the intelligent detection method for UAV aerial video images of this embodiment, as shown in fig. 1, the vision receiving system defines coordinate information during video data acquisition and acquires the low-altitude aerial video image of the UAV's target area; the ground station, based on the sample-parameter-transfer online detection model, transmits the received multi-temporal aerial reference video to the flight control computer through wireless data transmission; the flight control computer's image processor performs fuzzy linear discriminant data analysis on the visible-light and infrared reconnaissance images of the detection video sample feature set; a pre-training model is created to detect multiple aerial video image objects for video data preprocessing; pixels in each connected domain of the preprocessed aerial video image are counted, optical flow is taken as the instantaneous motion field of gray pixels on the image, and video key frames are extracted with GPS interpolation of the key frames;
Extracting overlapping image information between characteristic points of a target to be detected of a target reference video and adjacent key frames by a pre-training model, selecting light flow points at equal intervals in a target area type image, calculating motion vectors of the light flow points, classifying, comparing and matching the same scene image from different angles, measuring angles among objects in each image, splicing reference video key frames frame by utilizing image A-KAZE characteristic point matching, image affine transformation and image fusion, splicing frames together in sequence, splicing frames into a panoramic image, calculating aerial photographing positions and yaw by using a corresponding relation between a single-channel image coordinate system and a world coordinate system after completing image splicing and fusion, splicing and generating a panoramic reference image according to GIS data associated with each image, creating a whole landscape view, extracting key frames of component histograms in the reference video in the process of splicing and generating the panoramic reference image by the reference video image, extracting A-KAZE characteristic points in the overlapping area of the adjacent key frames, carrying out characteristic point matching and calculating an image transformation matrix by a characteristic matching algorithm, converting the extracted key frames into a panoramic reference image space by the panoramic image space based on the transformation matrix, and converting the extracted key frames into the reference image into the panoramic reference image;
In the registration of the detection video with the panoramic reference image based on the online detection model with transmitted sample parameters, the detection video frames are registered with the panoramic reference image. Any detection frame is extracted from the detection video, and the region to be detected is selected manually or the frames of interest in the detection video are found by an automatic extraction method. Coarse positioning of the detection frame in the panoramic reference image is quickly realized using the GPS information of the frame to be detected and of the panoramic reference image, and the size of the tracking window is automatically adjusted based on the coarse positioning result. The multi-source and homologous images of the extracted visible light and infrared change regions are registered and fused in the same coordinate system, and the detection frame is accurately registered with the panoramic reference image using GPS initial positioning and an image registration method based on A-KAZE feature point matching, so as to realize fast and accurate alignment of the frame to be detected with the panoramic reference image;
In the image change detection stage, the online detection model automatically monitors the aerial video images and detects image change areas. Denoising and histogram equalization are applied to the registered images to remove the influence of noise, illumination, and irrelevant changes; low-pass filtering effectively removes the influence of parallax and registration errors from the generated change images, producing an initial change image. A MeanShift algorithm then iterates, moving the center of the search window to the position of maximum density at each iteration and adjusting the window size; the sliding window is up-sampled to find small, dense objects. Based on image morphological processing, image local binary pattern (RGB-LBP) feature comparison, and area and gray-histogram comparison, the change positions in the change image are verified with deep learning software, the initial change image is corrected, a change target region data set is constructed, and the deep network weights are trained to generate a final aerial video change image of higher accuracy. KAZE is a multi-scale 2D feature detection and description algorithm that operates in a nonlinear scale space. A sketch of the MeanShift window refinement is given below.
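The MeanShift window refinement can be illustrated with OpenCV's built-in cv2.meanShift. This is a minimal sketch, not the patent's implementation; the hue-histogram back-projection used as the density image and the iteration count are assumptions for illustration.

```python
import cv2
import numpy as np

def refine_change_window(frame_bgr, window, n_iters=10):
    """Shift a search window toward the densest region of a back-projected
    histogram, as in MeanShift window refinement. `window` is (x, y, w, h);
    the hue-histogram back-projection is an illustrative assumption."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    x, y, w, h = window
    roi = hsv[y:y + h, x:x + w]
    hist = cv2.calcHist([roi], [0], None, [180], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    prob = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, n_iters, 1)
    _, window = cv2.meanShift(prob, window, criteria)  # move window to density peak
    return window
```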
Example 2:
This embodiment is further optimized on the basis of Embodiment 1. For real-time, accurate change detection of low-altitude UAV aerial video images, the same four steps are adopted: video data preprocessing, stitching the reference video into a panoramic image, registering the detection video frames with the panoramic reference image, and detecting change areas. In UAV video data preprocessing, reference video stitching to generate the panoramic reference image, and registration and change detection of detection video frames against the panoramic reference image, the following specific implementations are used:
UAV video data preprocessing mainly comprises camera calibration, video key frame extraction, and GPS interpolation; the panoramic reference image is generated by stitching the reference video. Because the video data volume is large, the image content of adjacent frames is highly repetitive, and the coverage of a single frame is limited, a single frame is unsuitable as a change detection reference image. Key frames are therefore extracted from the reference video, image matching is performed based on the adjacency of the key frames, the position coordinate set of the connected domains of the aerial video images is screened out, and the set is mapped to a standard coordinate space to generate the panoramic image required for change detection;
The premise of change detection on aerial video frames is accurate registration of two images containing the same area; the detection video frames are registered, or aligned, with the panoramic reference map. For video change detection, the frames to be processed in the detection video are first found by manual selection or an automatic extraction method. The same coverage area is then found in the panoramic reference image and the image to be detected, and the two images are registered; because the panoramic image covers a large area and direct registration cannot meet real-time requirements, GPS information is used to quickly achieve coarse positioning of the detection frame in the panoramic reference image. Finally, fusion of the infrared image target change area and the visible light image target change area is completed with an adaptive-weight target-region fusion algorithm, and the detection frame is accurately registered with the reference image based on an image feature point matching method;
After the aerial detection video frame is accurately registered with the reference video panorama, change detection can automatically find changes such as the disappearance, appearance, or damage of targets such as people, vehicles, buildings, and public facilities between two shots of the same coverage area. The change detection method first removes the influence of noise, illumination, and irrelevant changes by denoising, histogram equalization, and similar operations; second, it generates a change map by low-pass filtering to remove the influence of parallax and registration errors, and calculates the change information at each aerial video frame position by RGB-LBP feature comparison; finally, it verifies the change information at each position by morphological operations and area and gray-histogram comparison, and outputs the final change image. A minimal sketch of this chain is given below.
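A minimal sketch of the denoise, equalize, difference, and morphological-cleanup chain described above, assuming OpenCV; the kernel sizes and threshold are illustrative defaults, not values from the patent:

```python
import cv2
import numpy as np

def initial_change_map(ref_gray, test_gray, blur_ksize=5, diff_thresh=30):
    """Suppress noise and illumination, difference the registered pair,
    and clean the binary mask with a morphological opening."""
    r = cv2.GaussianBlur(ref_gray, (blur_ksize, blur_ksize), 0)   # low-pass: drop fine detail/noise
    t = cv2.GaussianBlur(test_gray, (blur_ksize, blur_ksize), 0)
    r = cv2.equalizeHist(r)                                       # reduce illumination difference
    t = cv2.equalizeHist(t)
    diff = cv2.absdiff(r, t)
    _, change = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    return cv2.morphologyEx(change, cv2.MORPH_OPEN, kernel)       # remove isolated change pixels
```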
Other portions of this embodiment are the same as those of embodiment 1, and thus will not be described in detail.
Example 3:
This embodiment is further optimized based on the above Embodiment 1 or 2. In UAV video data preprocessing, which comprises camera calibration, video key frame extraction, and GPS interpolation, the camera is calibrated first. Before calibration, it is confirmed that the mechanical structure of the camera is firm and stable, free of shake, and that its optical and electronic structures are also reliable and stable. Geometric calibration of the camera is then carried out in an outdoor calibration field as follows: with manually set homonymous points, the deep learning software performs block aerial triangulation on the acquired calibration field data and high-precision control point data using a bundle block adjustment model based on least-squares adjustment theory, and solves the required geometric calibration parameters of the camera, namely the interior orientation elements of the photo, the radial distortion coefficients, the tangential distortion coefficients, the non-square scale coefficient of the charge-coupled device (CCD), and the non-orthogonality distortion coefficient of the CCD. A related calibration sketch is given below.
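The outdoor-calibration-field bundle adjustment itself is beyond a short example, but the same camera parameters (intrinsics, radial k1-k3 and tangential p1, p2 distortion) can be estimated with OpenCV's planar-target calibration. This substitute technique, the checkerboard geometry, and the file glob below are assumptions, not the patent's procedure:

```python
import cv2
import numpy as np
import glob

def calibrate_camera(image_glob, board=(9, 6), square=0.025):
    """Estimate intrinsics and distortion from checkerboard views.
    OpenCV's distCoeffs = [k1, k2, p1, p2, k3]: radial and tangential
    terms analogous to the parameters solved in the text."""
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts, size = [], [], None
    for path in glob.glob(image_glob):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
            size = gray.shape[::-1]
    rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    return rms, K, dist
```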
Other portions of this embodiment are the same as those of embodiment 1 or 2 described above, and thus will not be described again.
Example 4:
This embodiment is further optimized based on any one of the above Embodiments 1 to 3. In video key frame extraction and GPS interpolation, based on the UAV track data, a formula for automatically extracting the key frame time interval t under a given overlapping degree can be derived. It is built from the near frame width X_n on the ground, the far frame width X_f, the overlap D_x of the camera in the x direction after t seconds, the overlap D_y in the y direction, and the frame footprint height
Y = H·[cot(arctan(2h/f) + θ) + cot(arctan(2h/f) - θ)].
If the GPS information is discontinuous, GPS interpolation can be performed by Newton's interpolation method so that the GPS information corresponds one-to-one with the extracted key frames.
In the formulas, H is the flight altitude of the UAV (unit: m) at a given moment, v is the speed (unit: m/s), ω is the sensor width of the camera, h is the sensor height, f is the focal length (unit: mm), θ is the angle between the camera and the horizontal plane, and n is the number of sub-regions the corresponding area is divided into. A numeric sketch of the footprint relation is given below.
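As a small numeric sketch of the footprint relation above. The interval rule t = (1 - D)·Y / v, i.e. traversing the non-overlapping fraction of the footprint between key frames, is an assumption; the source's own interval formula is not reproduced here.

```python
import math

def footprint_height(H, h, f, theta):
    """Ground footprint height Y = H*[cot(arctan(2h/f)+theta) +
    cot(arctan(2h/f)-theta)] as given in the text (angles in radians,
    H in metres, h and f in the same units as each other)."""
    alpha = math.atan(2.0 * h / f)
    cot = lambda a: 1.0 / math.tan(a)
    return H * (cot(alpha + theta) + cot(alpha - theta))

def keyframe_interval(H, h, f, theta, v, overlap=0.6):
    """Assumed rule: fly the non-overlapping share of the footprint
    between consecutive key frames."""
    return (1.0 - overlap) * footprint_height(H, h, f, theta) / v
```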
For the received video, key frame selection is central to video change detection because the data volume is large and the information of adjacent frames is highly repetitive. Considering the influence of oblique shooting (the camera makes an angle θ with the horizontal plane, so the ground footprint of a captured frame is narrow at the near edge and wide at the far edge), the position information corresponding to each selected key frame must be recorded; it is provided by the GPS navigator carried by the UAV. Therefore, in the GPS interpolation of video key frames, the position information corresponding to each selected key frame is recorded as provided by the UAV's GPS navigator, and if the GPS information is discontinuous, GPS interpolation is performed by Newton's interpolation method so that the GPS information corresponds one-to-one with the extracted key frames. A sketch of Newton's interpolation is given below.
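Newton's divided-difference interpolation for filling gaps in the GPS log can be sketched as follows. Interpolating each coordinate component separately against frame timestamps is an assumption, and a low polynomial degree is advisable to avoid oscillation:

```python
def newton_interpolate(ts, ys, t):
    """Evaluate the Newton divided-difference polynomial through
    (ts[i], ys[i]) at time t. ts: known timestamps; ys: one GPS
    coordinate component (e.g. latitude) at those timestamps."""
    n = len(ts)
    coef = list(ys)
    for j in range(1, n):                      # build divided differences in place
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (ts[i] - ts[i - j])
    result = coef[-1]
    for i in range(n - 2, -1, -1):             # Horner evaluation of the Newton form
        result = result * (t - ts[i]) + coef[i]
    return result

# usage: latitude at a key frame time missing from the GPS log
lat = newton_interpolate([0.0, 1.0, 2.0, 3.0], [30.10, 30.11, 30.13, 30.16], 1.5)
```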
Other portions of this embodiment are the same as any of embodiments 1 to 3 described above, and thus will not be described again.
Example 5:
This embodiment is further optimized based on any one of the above Embodiments 1 to 4. As shown in Fig. 2, generating the panorama by reference video stitching comprises A-KAZE feature point extraction, feature point matching, image stitching, and panorama generation. In A-KAZE feature point extraction, A-KAZE image features are extracted from each of two overlapping adjacent key frames. The main flow is as follows: first, an image pyramid is constructed using nonlinear diffusion filtering and the Fast Explicit Diffusion (FED) algorithm; second, extrema of the scale-normalized Hessian matrix determinant are searched in 3 × 3 neighborhoods in the nonlinear scale space to obtain the image feature point coordinates; third, the main direction of each feature point is determined from the first-order differential values of all points adjacent to the feature point's circular area; finally, the neighborhood image of each feature point is rotated to the main direction and the image feature vector is generated with the Modified Local Difference Binary (M-LDB) descriptor;
In A-KAZE feature point matching, the feature points extracted from the two overlapping key frames are matched. The main flow is as follows: first, the similarity between two A-KAZE feature descriptors is defined by the Hamming distance; then, initial matching points are found with a bidirectional k-nearest-neighbor (KNN) search; finally, the matched point pairs are screened with the random sample consensus algorithm (RANSAC) to remove mismatched pairs. A sketch of this extraction-and-matching chain is given below.
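A compact sketch of this chain with OpenCV's A-KAZE implementation (which uses FED and M-LDB internally), brute-force Hamming matching with a mutual bidirectional check, and RANSAC screening through an affine fit. The ratio-test value and reprojection threshold are illustrative assumptions:

```python
import cv2
import numpy as np

def match_keyframes(img1, img2, ratio=0.8):
    """A-KAZE extraction, bidirectional KNN matching on Hamming distance,
    and RANSAC screening via an affine fit."""
    akaze = cv2.AKAZE_create()                      # nonlinear scale space, M-LDB descriptors
    kp1, des1 = akaze.detectAndCompute(img1, None)
    kp2, des2 = akaze.detectAndCompute(img2, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)            # binary descriptors -> Hamming distance

    def knn_pairs(da, db):
        pairs = {}
        for ms in bf.knnMatch(da, db, k=2):
            if len(ms) == 2 and ms[0].distance < ratio * ms[1].distance:
                pairs[ms[0].queryIdx] = ms[0].trainIdx
        return pairs

    fwd, bwd = knn_pairs(des1, des2), knn_pairs(des2, des1)
    mutual = [(q, t) for q, t in fwd.items() if bwd.get(t) == q]  # bidirectional check
    if len(mutual) < 3:                             # an affine needs >= 3 point pairs
        return None, None
    src = np.float32([kp1[q].pt for q, _ in mutual])
    dst = np.float32([kp2[t].pt for _, t in mutual])
    M, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC,
                                      ransacReprojThreshold=3.0)
    keep = inliers.ravel().astype(bool)
    return M, (src[keep], dst[keep])
```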
When performing image stitching and panorama generation, the received reference video is preprocessed, the GPS information of each key frame is recorded, the key frame interval time is calculated, and video key frames are extracted based on random sampling to obtain the reference video key frame set {f_1, f_2, …, f_K}. Key frame f_1 is set as the panorama space, and key frames f_2 through f_K are transformed one by one into the panorama space. An affine transformation model M that accommodates translation, rotation, and scaling is selected as the image coordinate transformation matrix, and the image coordinate transformation is expressed as:
x = m_0·x′ + m_1·y′ + m_2, y = m_3·x′ + m_4·y′ + m_5,
where K is the total number of key frames extracted from the reference video, k is the current key frame number, (x, y) and (x′, y′) respectively represent pixel coordinates in the panoramic image and in the image to be stitched, and m_0 to m_5 are the affine transformation parameters.
Other portions of this embodiment are the same as any of embodiments 1 to 4 described above, and thus will not be described again.
Example 6:
This embodiment is further optimized based on any one of the above Embodiments 1 to 5; the specific key frame stitching procedure is as follows. First, all pixel nodes are divided into target change areas and non-change areas, and the visible light image target change area and the infrared image target change area are extracted. For the key frame f_2 to be stitched, the A-KAZE feature points of the overlapping area of key frames f_1 and f_2 are extracted, the matching point sets match_1 and match_2 (more than 3 pairs of matching points) are calculated, and the image transformation matrix M_{1,2} from frame f_2 to the panorama space of frame f_1 is obtained by the least squares method. Then, for k greater than 2, for each key frame f_k to be stitched, the feature point sets match_{k-1} and match_k of the overlapping area of frames k-1 and k are calculated: the A-KAZE feature points of the overlapping area of key frames f_k and f_{k-1} are extracted, the transformation matrix M_{k-1,k} from frame f_k to frame f_{k-1} is obtained by the least squares method, the matching point set match_{k-1} of key frame f_{k-1} is projected into the panorama space with the matrix M_{1,k-1}, and the transformation matrix M_{1,k} from frame f_k to the panorama space is obtained as M_{1,k} = M_{1,k-1}·M_{k-1,k}; otherwise, for k = 2, the feature point sets match_1 and match_2 of the overlapping area of the 1st and 2nd frames are calculated and the transformation matrix M_{1,2} from the 2nd frame to the 1st frame is computed directly. Finally, key frame f_k is transformed into the panoramic image space using the image transformation matrix M_{1,k} and bilinear interpolation, and stitching is performed with the image fusion technique to complete image stitching and generate the final panorama. A sketch of the transform chaining is given below.
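The chaining M_{1,k} = M_{1,k-1}·M_{k-1,k} and the bilinear warp into the panorama space might be sketched as follows; the canvas size and the overwrite-style blending are placeholders for the patent's image fusion step, not its implementation:

```python
import cv2
import numpy as np

def to3x3(M):
    """Lift a 2x3 affine to 3x3 homogeneous form so affines compose
    by matrix multiplication."""
    return np.vstack([M, [0.0, 0.0, 1.0]])

def stitch_keyframes(frames, pairwise_affines, canvas_size=(4000, 3000)):
    """frames[0] defines the panorama space; pairwise_affines[k] maps
    frame k+1 into frame k. Warping uses bilinear interpolation
    (cv2.INTER_LINEAR)."""
    pano = np.zeros((canvas_size[1], canvas_size[0], 3), np.uint8)
    pano[:frames[0].shape[0], :frames[0].shape[1]] = frames[0]
    M_1k = np.eye(3)
    for k, frame in enumerate(frames[1:]):
        M_1k = M_1k @ to3x3(pairwise_affines[k])      # M_{1,k} = M_{1,k-1} * M_{k-1,k}
        warped = cv2.warpAffine(frame, M_1k[:2], canvas_size,
                                flags=cv2.INTER_LINEAR)
        mask = warped.any(axis=2)
        pano[mask] = warped[mask]                     # crude fusion: overwrite overlap
    return pano
```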
Registration of the detection frame with the panoramic reference map comprises fast GPS-based coarse positioning and A-KAZE-based accurate registration. In fast GPS-based coarse positioning of the detection frame, the received detection video is preprocessed, the image frames on which change detection is to be performed are extracted together with their GPS information, the GPS information of the detection frame is compared with the GPS information of each key frame recorded in the panoramic reference map, the 4 nearest adjacent key frame areas are found in the panorama, and these areas serve as the initial reference map region for change detection. In A-KAZE-based accurate registration, A-KAZE feature points are extracted and matched, and the detection image is transformed into the reference image space to complete accurate registration of the detection image with the coarsely positioned reference map region. Then, an image area T and an image area R with the same position and size are extracted from the registered detection image and the panoramic reference image respectively, and the target images whose confidence exceeds a preset confidence threshold are output from the detection result; T and R serve respectively as the test image and reference image input to change detection. A sketch of the GPS coarse positioning is given below.
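The GPS coarse positioning step, finding the 4 key frames whose recorded GPS is closest to the detection frame's, reduces to a small nearest-neighbour search. The equirectangular degrees-to-metres conversion below is an assumption valid only over short distances:

```python
import numpy as np

def coarse_locate(frame_gps, keyframe_gps):
    """Return indices of the 4 panorama key frames nearest to the
    detection frame. GPS entries are (lat, lon) in degrees; a local
    equirectangular approximation converts degrees to metres."""
    kf = np.asarray(keyframe_gps, dtype=float)
    lat0 = np.radians(frame_gps[0])
    scale = np.array([111_320.0, 111_320.0 * np.cos(lat0)])  # m per degree of lat/lon
    d = np.linalg.norm((kf - np.asarray(frame_gps)) * scale, axis=1)
    return np.argsort(d)[:4]                                 # 4 nearest key frame areas
```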
Other portions of this embodiment are the same as any of embodiments 1 to 5 described above, and thus will not be described again.
Example 7:
This embodiment is further optimized based on any one of the above Embodiments 1 to 6. As shown in Fig. 3, image change detection mainly comprises image preprocessing and low-pass filtering. In image preprocessing, the panoramic image and the detection frame are input; the input test image T and reference image R are coarsely positioned using GPS information, A-KAZE features are extracted and the transformation matrix is calculated, and the detection frame is accurately registered with the reference image. The registered detection frame and reference image are then preprocessed and an initial change image D is generated by low-pass filtering: for RGB images, the three channel images of the reference image R and the detection image T are each blurred with a 2-dimensional Gaussian convolution kernel to remove the influence of details (such as water-surface ripples and waving leaves) and noise on the change detection result; then histogram equalization is used to increase the contrast of the reference image R and detection image T and reduce the influence of illumination differences between the two images; change verification based on RGB-LBP features and post-processing of the detection result follow;
To overcome illumination, noise, parallax, and registration errors, the online detection model generates an initial change image during image change detection. First, the reference image R and the detection image T are converted from RGB to gray images, yielding the corresponding gray reference map R_gray and gray test map T_gray. Then a difference image is generated from the gray values at corresponding positions of R_gray and T_gray, and each pixel position of the initial change map D is judged by the difference values, D(i, j) = 1 denoting change. Using an N × N neighborhood window (N is the side length of the window), the difference map D_R from R_gray to T_gray and the difference map D_T from T_gray to R_gray are calculated in a low-pass filtering manner, taking at each pixel the minimum absolute gray difference over the offsets (Δi, Δj) within the N-neighborhood.
Based on the difference maps D_R and D_T and the division threshold δ_diff, the initial change image D is calculated, and the pixel positions with value 1 in D are verified by RGB-LBP feature comparison: each position (i, j) with value 1 in D is confirmed, and is set as changed only if change is detected by both tests. For the comparison, 8-bit binary-coded LBP features centered at position (i, j) are calculated in the 3 color channels of the reference image R and the detection image T respectively, and the final LBP features are concatenated by position and channel to form the LBP features S_R(i, j) and S_T(i, j) of the reference image R and the test image T at position (i, j): in the 3 × 3 neighborhood centered at the position, the 8 neighboring positions are 0/1 coded in order starting from the upper-left corner, a point being coded 0 if its gray value is lower than that of the center position and 1 otherwise. Then the Hamming distance d_RT(i, j) between the LBP features S_R and S_T at position (i, j) is calculated. Finally, whether pixel position (i, j) has changed is judged by the Hamming distance: if d_RT(i, j) > δ_h × |S_R(i, j)|, position (i, j) is changed and the value of D(i, j) remains 1; otherwise position (i, j) is unchanged and D(i, j) is changed from 1 to 0.
Here 0 denotes unchanged and 1 denotes changed; N ∈ {7, 9, 11}; Δi and Δj denote the offsets of position coordinates i and j within the N-neighborhood; δ_diff ∈ [0, 50] is chosen according to the illumination difference of the images; |S_R| denotes the length of the binary string; and δ_h is the decision threshold, generally 0.3. A sketch of this differencing and verification stage is given below.
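A sketch of this stage under stated assumptions: the directed difference maps are taken as the minimum absolute gray difference over the N × N neighbourhood, and a pixel is initially marked changed only when both directed maps exceed δ_diff; neither rule is reproduced verbatim from the source. The LBP/Hamming verification follows the description above:

```python
import numpy as np

def neighborhood_min_diff(a, b, N=7):
    """Assumed low-pass difference: per pixel, the smallest |a - b| over
    all offsets in the N x N neighbourhood, tolerant to small parallax
    and registration shifts."""
    h, w = a.shape
    r = N // 2
    pad = np.pad(b.astype(np.int16), r, mode='edge')
    d = np.full((h, w), 255, np.int16)
    for di in range(-r, r + 1):
        for dj in range(-r, r + 1):
            shifted = pad[r + di:r + di + h, r + dj:r + dj + w]
            d = np.minimum(d, np.abs(a.astype(np.int16) - shifted))
    return d

def initial_change(r_gray, t_gray, N=7, delta_diff=25):
    """Assumed combination rule: initially 'changed' only where both
    directed difference maps exceed the division threshold."""
    d_r = neighborhood_min_diff(r_gray, t_gray, N)
    d_t = neighborhood_min_diff(t_gray, r_gray, N)
    return ((d_r > delta_diff) & (d_t > delta_diff)).astype(np.uint8)

def lbp8(channel, i, j):
    """8-bit LBP, clockwise from the upper-left corner of the 3x3
    neighbourhood: 0 if the neighbour is darker than the centre, else 1."""
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    c = channel[i, j]
    return [0 if channel[i + di, j + dj] < c else 1 for di, dj in offs]

def rgb_lbp_changed(ref_rgb, test_rgb, i, j, delta_h=0.3):
    """Concatenate per-channel LBP codes and test the Hamming distance
    against delta_h * |S_R|, as described in the text."""
    s_r = sum((lbp8(ref_rgb[..., c], i, j) for c in range(3)), [])
    s_t = sum((lbp8(test_rgb[..., c], i, j) for c in range(3)), [])
    d_rt = sum(x != y for x, y in zip(s_r, s_t))
    return d_rt > delta_h * len(s_r)
```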
Other portions of this embodiment are the same as any of embodiments 1 to 6 described above, and thus will not be described again.
Example 8:
This embodiment is further optimized based on any one of the above Embodiments 1 to 7. In change image post-processing, in order to effectively eliminate false alarms, the verified change image must be post-processed. The operation flow is as follows: first, gray processing, gray correction, and smoothing filtering are performed; step edge points and marking-line edge points are screened, edge points are connected, false edges are removed, and marking-line information is extracted; isolated change positions are detected and removed by the morphological opening operation, and the corresponding areas in the verified change map D are set as unchanged. Then, the pixel area of each change region is calculated, the minimum change-region area δ_a is determined from the image resolution and the minimum target size, and the regions of the verified change image D whose area is smaller than δ_a are set as unchanged. After the RGB-LBP features of the detection frame and the reference map at position (i, j) are calculated, the feature similarity is computed: if the feature similarity is greater than the threshold, D(i, j) remains unchanged; if it is smaller than the threshold, D(i, j) is corrected and set as unchanged. In the change image post-processing, for each change region A_p in the verified change image D, its minimum circumscribed rectangular area B_p is found, the image areas corresponding to B_p are extracted from the gray reference map R_gray and the gray test map T_gray, and their gray histogram features are calculated; the distance between the two histogram features is then computed, where β is the feature dimension of the gray histogram, the means of the two feature vectors are used, and q indexes the q-th dimension of the histogram. When the distance is less than 0.35, the change region A_p in D is set as unchanged; otherwise it is kept as changed. A sketch of this histogram check is given below.
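A sketch of the histogram check under a stated assumption: since the source's distance expression is not reproduced here, a mean-centred correlation turned into a distance (1 minus the correlation) is used, so a small distance means similar histograms and the region is reset to unchanged at the 0.35 threshold:

```python
import numpy as np

def hist_distance(region_ref, region_test, beta=64):
    """Assumed distance: 1 - Pearson correlation of beta-bin gray
    histograms (mean-centred, matching the variables in the text)."""
    g_r, _ = np.histogram(region_ref, bins=beta, range=(0, 256), density=True)
    g_t, _ = np.histogram(region_test, bins=beta, range=(0, 256), density=True)
    a, b = g_r - g_r.mean(), g_t - g_t.mean()
    corr = (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum() + 1e-12)
    return 1.0 - corr

# a change region is discarded as a false alarm when the histograms agree:
# keep_change = hist_distance(R_gray[y0:y1, x0:x1], T_gray[y0:y1, x0:x1]) >= 0.35
```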
For convenience of display, unchanged position pixels in the change map D are set to 0 (black) and changed position pixels to 255 (white). At this point, UAV video change detection is complete.
Other portions of this embodiment are the same as any of embodiments 1 to 7 described above, and thus will not be described again.
The foregoing description is only a preferred embodiment of the present invention and is not intended to limit the present invention in any way; any simple modification or equivalent variation of the above embodiments according to the technical substance of the present invention falls within the protection scope of the present invention.
Claims (10)
1. An intelligent detection method for aerial video images of an unmanned aerial vehicle is characterized by comprising the following steps:
the visual receiving system defines coordinate information in the process of video data acquisition and acquires low-altitude aerial video images of the target area of the unmanned aerial vehicle;
the ground station, based on the online detection model with transmitted sample parameters, transmits the received multi-temporal aerial reference video to the flight control computer through wireless data transmission;
the image processor of the flight control computer performs fuzzy linear discriminant data analysis on the visible light reconnaissance image and the infrared reconnaissance image of the detection video sample feature set, creates a pre-training model to detect a plurality of aerial video image objects and perform unmanned aerial vehicle video data preprocessing, calculates the pixels in all connected domains in the preprocessed aerial video image, uses optical flow as the instantaneous motion field of gray pixel points on the image, and performs calibration of the video key frames and Global Positioning System (GPS) interpolation of the key frames;
the pre-training model extracts the feature points of the target to be detected in the reference video and the overlapping image information between adjacent key frames, selects optical flow points at equal intervals in the target area image and calculates their motion vectors, and classifies, compares and matches images of the same scene taken from different angles, measuring the angles between objects in each image; reference video key frames are stitched frame by frame using image A-KAZE feature point matching, image affine transformation and image fusion, the frames being joined in sequence into a panoramic image; after image stitching and fusion are completed, the aerial photographing position and yaw are calculated from the correspondence between the single-channel image coordinate system and the world coordinate system, and a panoramic reference image is generated according to the GIS data associated with each image, creating a whole landscape view; in the process of generating the panoramic reference image from the reference video images, key frames of component histograms are extracted from the reference video, A-KAZE feature points are extracted in the overlapping areas of adjacent key frames, feature point matching is performed and an image transformation matrix is calculated by a feature matching algorithm, and the extracted key frames are transformed into the panoramic image space based on the transformation matrix and fused into the panoramic reference image;
in the registration of the detection video with the panoramic reference image based on the online detection model with transmitted sample parameters, the detection video frames are registered with the panoramic reference image: any detection frame is extracted from the detection video, the region to be detected is selected manually or the frames of interest in the detection video are found by an automatic extraction method, coarse positioning of the detection frame in the panoramic reference image is quickly realized using the GPS information of the frame to be detected and of the panoramic reference image, the size of the tracking window is automatically adjusted based on the coarse positioning result, the multi-source and homologous images of the extracted visible light and infrared change regions are registered and fused in the same coordinate system, and the detection frame is accurately registered with the panoramic reference image using GPS initial positioning and an image registration method based on A-KAZE feature point matching, so as to realize fast and accurate alignment of the frame to be detected with the panoramic reference image;
in the image change detection stage, the online detection model automatically monitors the aerial video images and detects image change areas: denoising and histogram equalization are applied to the registered images to remove the influence of noise, illumination and irrelevant changes; low-pass filtering effectively removes the influence of parallax and registration errors from the generated change images, producing an initial change image; a MeanShift algorithm then iterates, moving the center of the search window to the position of maximum density at each iteration and adjusting the window size, the sliding window being up-sampled to find small, dense objects; based on image morphological processing, image local binary pattern (RGB-LBP) feature comparison, and area and gray-histogram comparison, the change positions in the change image are verified with deep learning software, the initial change image is corrected, a change target region data set is constructed, and the deep network weights are trained to generate a final aerial video change image of higher accuracy.
2. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein the method for preprocessing the video data of the unmanned aerial vehicle comprises the following steps:
shooting sensor calibration, video key frame extraction and GPS interpolation, and generating a panoramic reference picture by reference video stitching;
extracting a key frame from a reference video, performing image matching based on a key frame adjacent relation, screening out a position coordinate set of an aerial video image connected domain, and mapping the position coordinate set to a standard coordinate space to generate a panoramic image required by change detection;
registering, or aligning, the detection aerial video frames with the panoramic reference map;
for video change detection, firstly, a frame to be processed in a detected video is found through a manual selection or automatic extraction method;
then, finding out the same coverage area in the panoramic reference image and the image to be detected and registering the two images; because the coverage area of the panoramic image is large and direct registration cannot meet the real-time requirement, utilizing GPS information to quickly realize coarse positioning of the detection frame in the panoramic reference image; completing the fusion of the infrared image target change area and the visible light image target change area by an adaptive-weight target-region fusion algorithm, and accurately registering the detection frame with the reference image based on an image feature point matching method; after the accurate registration of the detection video frame with the reference video panoramic image, generating a change image by low-pass filtering and removing the influence of parallax and registration errors;
Thirdly, calculating the change information of each aerial video frame position by using an RGB-LBP characteristic comparison method;
and finally, verifying the change information of each position by morphological operations and area and gray-histogram comparison, and outputting a final change image.
3. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein the method for verifying the change positions in the change map by using deep learning software comprises the following steps:
the deep learning software performs block aerial triangulation on the acquired calibration field data and the high-precision control point data by using a bundle block adjustment model based on the least squares adjustment theory, with manually set homonymous points, and solves the required geometric calibration parameters of the camera, namely the interior orientation elements of the photo, the radial distortion coefficients, the tangential distortion coefficients, the non-square scale coefficient of the charge-coupled device (CCD), and the CCD non-orthogonality distortion coefficient.
4. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein the method for calibration of the video key frames and Global Positioning System (GPS) interpolation of the key frames comprises:
in video key frame extraction and GPS interpolation, based on the unmanned aerial vehicle track data, a formula for automatically extracting the key frame time interval t under a given overlapping degree is derived from the near frame width X_n on the ground, the far frame width X_f, the overlap D_x of the camera in the x direction after t seconds, the overlap D_y in the y direction, and the frame footprint height Y = H·[cot(arctan(2h/f) + θ) + cot(arctan(2h/f) - θ)];
in the GPS interpolation of the video key frames, for the selected key frames, the position information corresponding to each key frame is recorded, the position information being provided by the GPS navigator carried by the unmanned aerial vehicle; if the GPS information is discontinuous, GPS interpolation is performed by the Newton interpolation method so that the GPS information corresponds one-to-one with the extracted key frames,
wherein H is the flight altitude of the unmanned aerial vehicle at a given moment t, v is the speed, ω is the width of the camera sensor, h is its height, f is the focal length, θ is the angle between the oblique camera and the horizontal plane, and n is the number of sub-regions the corresponding area is divided into.
5. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein the process of stitching the reference video images to generate the panoramic reference map comprises the following steps:
The panoramic image generation by reference video stitching comprises extracting A-KAZE characteristic points, matching the characteristic points, stitching images and generating a panoramic image; in the extraction of the A-KAZE characteristic points, extracting A-KAZE image characteristics from two overlapped adjacent key frames respectively;
firstly, constructing an image pyramid by using nonlinear diffusion filtering and a fast explicit diffusion FED algorithm;
secondly, searching a 3 multiplied by 3 neighborhood Hessian matrix determinant extremum after scale normalization in a nonlinear scale space to obtain an image feature point coordinate;
thirdly, determining the main direction of the feature point based on first-order differential values of all adjacent points of the feature point circular area;
finally, the neighborhood image of the feature points is rotated to a main direction, and an improved local differential binary descriptor M-LDB is adopted to generate an image feature vector;
in the matching of the A-KAZE feature points, the feature points extracted from two overlapped key frames are matched;
firstly, defining the similarity between two A-KAZE feature descriptors by utilizing the Hamming distance;
then, searching initial matching points of the feature points by using a bidirectional k nearest neighbor classification KNN algorithm;
finally, screening the matching point pairs by adopting a random sampling consensus algorithm RANSAC to remove the mismatching pairs;
when performing image stitching and panorama generation, preprocessing the received reference video, extracting video key frames based on random sampling, recording the GPS information of each key frame, and obtaining the reference video key frame set {f_1, f_2, …, f_K}, wherein K is the total number of key frames extracted from the reference video and k is the current key frame number; setting key frame f_1 as the panorama space, and transforming key frames f_2 through f_K one by one into the panorama space.
6. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein the process of stitching the reference video images to generate the panoramic reference map further comprises:
when performing image stitching and panorama generation, preprocessing the received reference video, extracting video key frames based on random sampling, recording the GPS information of each key frame, and obtaining the reference video key frame set {f_1, f_2, …, f_K}; setting key frame f_1 as the panorama space, transforming key frames f_2 through f_K one by one into the panorama space, and selecting an affine transformation model M that accommodates translation, rotation and scaling as the image coordinate transformation matrix, the image coordinate transformation being expressed as:
x = m_0·x′ + m_1·y′ + m_2, y = m_3·x′ + m_4·y′ + m_5,
wherein K is the total number of key frames extracted from the reference video, k is the current key frame number, (x, y) and (x′, y′) respectively represent pixel coordinates in the panoramic image and in the image to be stitched, and m_0 to m_5 are the affine transformation parameters.
7. The unmanned aerial vehicle aerial video image intelligent detection method according to claim 1, wherein the key frame splicing process comprises the following steps:
firstly, dividing all pixel nodes into target change areas and non-change areas, and extracting the visible light image target change area and the infrared image target change area; for the key frame f_2 to be stitched, extracting the A-KAZE feature points of the overlapping area of key frames f_1 and f_2, calculating the matching point sets match_1 and match_2 of more than 3 pairs of matching points, and obtaining the image transformation matrix M_{1,2} from frame f_2 to the panorama space of frame f_1 by the least squares method; then, for k greater than 2, for each key frame f_k to be stitched, calculating the feature point sets match_{k-1} and match_k of the overlapping area of frames k-1 and k: extracting the A-KAZE feature points of the overlapping area of key frames f_k and f_{k-1}, obtaining the transformation matrix M_{k-1,k} from frame f_k to frame f_{k-1} by the least squares method, projecting the matching point set match_{k-1} of key frame f_{k-1} into the panorama space with the matrix M_{1,k-1}, and obtaining the transformation matrix M_{1,k} from frame f_k to the panorama space as M_{1,k} = M_{1,k-1}·M_{k-1,k}; otherwise, for k = 2, calculating the feature point sets match_1 and match_2 of the overlapping area of the 1st and 2nd frames and then the transformation matrix M_{1,2} from the 2nd frame to the 1st frame; finally, transforming key frame f_k into the panoramic image space using the image transformation matrix M_{1,k} and bilinear interpolation, and performing stitching with the image fusion technique to complete image stitching and generate the final panoramic image.
8. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein the method for accurately registering the detection frame with the panoramic reference map comprises the following steps:
the method comprises GPS-based rapid coarse positioning and A-KAZE-feature-based accurate registration; when performing rapid coarse positioning of the detection frame based on GPS, preprocessing the received detection video, extracting the image frames on which change detection is performed and their GPS information, comparing the GPS information of the detection frame with the GPS information of each key frame recorded in the panoramic reference map, finding the 4 nearest adjacent key frame areas in the panorama, and using these areas as the initial reference map region for change detection; when performing accurate registration based on A-KAZE features, extracting A-KAZE feature points, matching the A-KAZE feature points, and transforming the detection image into the reference image space to complete accurate registration of the detection image with the coarsely positioned reference map region; then, extracting an image area T and an image area R with the same position and size from the registered detection image and the panoramic reference image respectively, and outputting from the detection result the target images whose confidence is greater than a preset confidence threshold, T and R serving respectively as the test image and the reference image input to change detection.
9. The intelligent detection method for aerial video images of an unmanned aerial vehicle according to claim 1, wherein, in performing image change detection, the online detection model performs the following:
in the generation of the initial change image, the reference image R and the detection image T are first converted from RGB to gray images to obtain the corresponding gray reference map R_gray and gray test map T_gray; then a difference image is generated from the gray values at corresponding positions of R_gray and T_gray, and each pixel position of the initial change map D is judged by the difference values, D(i, j) = 1 denoting change; using an N × N neighborhood window, N being the side length of the window, the difference map D_R from R_gray to T_gray and the difference map D_T from T_gray to R_gray are calculated in a low-pass filtering manner, taking at each pixel the minimum absolute gray difference over the offsets (Δi, Δj) within the N-neighborhood;
based on the difference maps D_R and D_T and the division threshold δ_diff, the initial change image D is calculated, and the pixel positions with value 1 in D are verified by RGB-LBP feature comparison: each position (i, j) with value 1 in D is confirmed, and is set as changed only if change is detected by both tests; for the comparison, 8-bit binary-coded LBP features centered at position (i, j) are calculated in the 3 color channels of the reference image R and the detection image T respectively, taking the 15 × 15 adjacent points, and the final LBP features are concatenated by position and channel to form the LBP features S_R(i, j) and S_T(i, j) of the reference image R and the test image T at position (i, j); in the 3 × 3 neighborhood centered at the position, the 8 neighboring positions are 0/1 coded in order starting from the upper-left corner, a point being coded 0 if its gray value is lower than that of the center position and 1 otherwise; then the Hamming distance d_RT(i, j) between the LBP features S_R and S_T at position (i, j) is calculated; finally, whether pixel position (i, j) has changed is judged by the Hamming distance: if d_RT(i, j) > δ_h × |S_R(i, j)|, position (i, j) is changed and the value of D(i, j) in the initial change image remains 1; otherwise position (i, j) is unchanged and D(i, j) is changed from 1 to 0,
wherein 0 denotes unchanged and 1 denotes changed, N ∈ {7, 9, 11}, Δi and Δj denote the offsets of position coordinates i and j within the N-neighborhood, δ_diff ∈ [0, 50] is chosen according to the illumination difference of the images, |S_R| denotes the length of the binary string, and δ_h is the decision threshold.
10. The unmanned aerial vehicle aerial video image intelligent detection method according to any one of claims 1 to 9, comprising:
in the post-processing of the change image, in order to effectively eliminate false alarms, post-processing is performed on the verified change image with the following operation flow: firstly, performing gray processing, gray correction and smoothing filtering, screening step edge points and marking-line edge points, connecting edge points, removing false edges and extracting marking-line information; detecting and removing isolated change positions by the morphological opening operation, and setting the corresponding areas in the verified change map D as unchanged; then, calculating the pixel area of each change region, determining the minimum change-region area δ_a according to the image resolution and the minimum target size, and setting the regions of the verified change image D whose area is smaller than δ_a as unchanged; after calculating the RGB-LBP features of the detection frame and the reference map at position (i, j), calculating the feature similarity: if the feature similarity is greater than the threshold, D(i, j) remains unchanged, and if it is smaller than the threshold, D(i, j) is corrected and set as unchanged; in the change image post-processing, for each change region A_p in the verified change image D, finding its minimum circumscribed rectangular area B_p, extracting the image areas corresponding to B_p from the gray reference map R_gray and the gray test map T_gray, calculating their gray histogram features, and computing the distance between the two histogram features,
wherein β is the gray histogram feature dimension, the means of the two feature vectors are used, and q represents the value of the q-th dimension of the histogram; when the distance is less than 0.35, the change region A_p in D is set as unchanged, and otherwise it is kept as changed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211010524.XA CN115439424B (en) | 2022-08-23 | 2022-08-23 | Intelligent detection method for aerial video images of unmanned aerial vehicle |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115439424A CN115439424A (en) | 2022-12-06 |
CN115439424B true CN115439424B (en) | 2023-09-29 |
Family
ID=84244528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211010524.XA Active CN115439424B (en) | 2022-08-23 | 2022-08-23 | Intelligent detection method for aerial video images of unmanned aerial vehicle |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115439424B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102201115A (en) * | 2011-04-07 | 2011-09-28 | 湖南天幕智能科技有限公司 | Real-time panoramic image stitching method of aerial videos shot by unmanned plane |
CN111080529A (en) * | 2019-12-23 | 2020-04-28 | 大连理工大学 | A Robust UAV Aerial Image Mosaic Method |
WO2022104678A1 (en) * | 2020-11-20 | 2022-05-27 | 深圳市大疆创新科技有限公司 | Video encoding and decoding methods and apparatuses, mobile platform and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN115439424A (en) | 2022-12-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |