
CN112733686A - Target object identification method and device used in image of cloud federation - Google Patents


Info

Publication number
CN112733686A
CN112733686A (application number CN202011641087.2A)
Authority
CN
China
Prior art keywords
region
interest
image
target object
original image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011641087.2A
Other languages
Chinese (zh)
Inventor
程家明
孔繁东
周志祥
彭杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Xingtu Xinke Electronic Co ltd
Original Assignee
Wuhan Xingtu Xinke Electronic Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Xingtu Xinke Electronic Co ltd filed Critical Wuhan Xingtu Xinke Electronic Co ltd
Priority to CN202011641087.2A priority Critical patent/CN112733686A/en
Publication of CN112733686A publication Critical patent/CN112733686A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/176Urban or other man-made structures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a target object identification method and a target object identification device for images of the cloud federation, wherein the method comprises the following steps: performing Random-Batch images processing on the original image, fusing the processed image with the original image, and inputting the fused image into a ResNet network for feature extraction to obtain a feature map; inputting the feature map into a bidirectional feature map pyramid network for deep feature map fusion to obtain a feature map with stronger semantic expression capability; inputting that feature map into a region generation network to generate a plurality of candidate boxes; inputting the candidate boxes into an ROI Align network layer to screen out regions of interest, and mapping the regions of interest onto the feature map to obtain the feature information of the regions of interest; and classifying the regions of interest and performing frame regression and mask network processing on them through the fully connected layer according to the feature information, obtaining a semantic classification result of the original image so as to identify the target object. The method improves the model during training, so that it performs better on fine-grained detection and identification of targets in images.

Description

Target object identification method and device used in image of cloud federation
Technical Field
The invention relates to the technical field of image recognition, in particular to a method and a device for recognizing a target object in an image of cloud federation.
Background
Compared with common target detection tasks, military wharf target detection in aerial images is more difficult. First, the images themselves are blurred because the shooting distance is large and the resolution is low. Moreover, a single image may contain bridges and playgrounds larger than 100 × 100 pixels alongside containers and ships smaller than 50 × 50 pixels; the ship targets are dense and partially overlap one another, with aircraft and docks between them. The higher image complexity therefore places higher multi-scale and precision requirements on the target identification method.
In order to identify each target in the aerial image of the military wharf, firstly, semantic segmentation is carried out, entities of different classes are identified, secondly, example segmentation is carried out on the entities of the same class, and finally the attribute of each target is detected.
At the semantic segmentation level, R-CNN, Fast-RCNN, Faster-RCNN and the like are currently the mainstream. The R-CNN network first extracts Proposals (candidate boxes) from the image, then inputs each Proposal into a CNN (convolutional neural network) to extract features, classifies the features with an SVM (support vector machine), and finally performs Bbox reg (bounding-box regression).
In order to solve the problem of slow R-CNN speed, the Fast-RCNN algorithm was proposed. In Fast-RCNN, the input becomes the whole image, and feature selection is performed through the ROI. Bbox reg (bounding-box regression) and region classification are added to the network, making it multi-task. Fast-RCNN removes the R-CNN drawback that each candidate box must be fed into the CNN separately, thereby improving speed. However, although Fast-RCNN greatly increases speed, it still spends a great deal of time screening candidate boxes.
In order to further speed up Proposal (candidate box) selection, Faster-RCNN, an improved algorithm based on Fast-RCNN, was proposed. Faster-RCNN first introduced an algorithm for quickly extracting Proposals, namely the RPN (Region Proposal Network), and integrated the RPN cleanly into Fast-RCNN. For semantic segmentation, Faster-RCNN and similar networks achieve good results, but Faster-RCNN cannot perform instance segmentation and therefore cannot meet the requirements of target identification.
For better target identification, the improved instance segmentation algorithm Mask-RCNN was proposed on the basis of Faster-RCNN. First, Mask-RCNN improves the ROI (Region of Interest) handling in Faster-RCNN, replacing the original ROI Pooling with ROI Align, which greatly reduces the quantization error introduced when processing Proposals (candidate boxes). Second, the FPN (feature pyramid network) in Mask-RCNN extends the backbone network and can better represent targets across multiple scales. Most importantly, Mask-RCNN adds a parallel Mask (mask network) branch for predicting the target mask alongside the existing bounding-box recognition branch, thereby achieving instance segmentation.
However, for fine-grained target identification in wharf remote sensing images, the robustness of Mask-RCNN is still insufficient and its fine-grained identification accuracy is not high. The insufficient robustness and low accuracy of Mask-RCNN in fine-grained target identification are therefore urgent technical problems to be solved.
Disclosure of Invention
The invention provides a target object identification method and device used in an image of a cloud federation, and aims to solve the technical problems of insufficient robustness and low accuracy of fine-grained target identification of the traditional Mask-RCNN.
In order to achieve the above object, the present invention provides a method for identifying an object in an image of the cloud federation, including the steps of:
carrying out Random-Batch images processing on the original image to obtain a processed image;
fusing the processed image and the original image, inputting the fused image into a ResNet network, and performing feature extraction to obtain a feature map;
inputting the feature map into a bidirectional feature map pyramid network for deep feature map fusion to obtain a feature map with stronger semantic expression capability;
inputting the feature map with stronger semantic expression capability into a region generation network to generate a plurality of candidate frames;
inputting the candidate frames into an ROI Align network layer, and screening out an interested region;
mapping the region of interest to the feature map with stronger semantic expression capability to obtain the features of the region of interest;
and the full connection layer classifies the region of interest, performs frame regression and mask network processing on the region of interest according to the characteristics of the region of interest to obtain a semantic classification result of the original image so as to identify a target object in the original image.
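The patent does not give code for the ROI Align step named above. As a minimal sketch of the general technique (not the patent's exact implementation), the following assumed `roi_align` function samples a region of interest from a single-channel feature map with bilinear interpolation and no coordinate quantization, which is what distinguishes ROI Align from ROI Pooling:

```python
import numpy as np

def roi_align(feature_map, box, out_size=2, samples=2):
    """Minimal ROI Align sketch: for each output cell, average samples^2
    bilinearly interpolated points inside the cell (no quantization).
    feature_map: (H, W) array; box: (y0, x0, y1, x1) in feature-map coords."""
    y0, x0, y1, x1 = box
    H, W = feature_map.shape
    bin_h = (y1 - y0) / out_size
    bin_w = (x1 - x0) / out_size

    def bilinear(y, x):
        # Clamp to the map, then interpolate between the 4 nearest cells.
        y = min(max(y, 0.0), H - 1.0); x = min(max(x, 0.0), W - 1.0)
        y_lo, x_lo = int(np.floor(y)), int(np.floor(x))
        y_hi, x_hi = min(y_lo + 1, H - 1), min(x_lo + 1, W - 1)
        ly, lx = y - y_lo, x - x_lo
        return (feature_map[y_lo, x_lo] * (1 - ly) * (1 - lx)
                + feature_map[y_lo, x_hi] * (1 - ly) * lx
                + feature_map[y_hi, x_lo] * ly * (1 - lx)
                + feature_map[y_hi, x_hi] * ly * lx)

    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            vals = [bilinear(y0 + (i + (si + 0.5) / samples) * bin_h,
                             x0 + (j + (sj + 0.5) / samples) * bin_w)
                    for si in range(samples) for sj in range(samples)]
            out[i, j] = np.mean(vals)
    return out
```

Because sub-pixel box coordinates are kept throughout, the candidate box is never snapped to the feature-map grid, which is the error source ROI Align removes.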
Preferably, the Random-Batch images processing the original image to obtain a processed image includes:
for each image to be input, randomly cropping the target object from the 1280 × 1280 original image with a 640 × 640 screenshot frame, so that each image yields a 640 × 640 screenshot;
randomly selecting 4 screenshots each time and randomly splicing them to obtain a combined image;
mixing the combined image with the original images to serve as the subsequent input.
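The three steps above can be sketched as follows. This is an assumed reading of the Random-Batch images step — in particular, the 2 × 2 arrangement of the four 640 × 640 screenshots into one 1280 × 1280 combined image is an assumption the patent does not spell out, and `random_batch_mosaic` is a hypothetical name:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_batch_mosaic(images, crop=640):
    """Assumed sketch of Random-Batch images processing: take one random
    crop-sized screenshot from each 1280x1280 image, then randomly pick 4
    screenshots and splice them into one combined image (2x2 layout assumed)."""
    shots = []
    for img in images:
        h, w = img.shape[:2]
        y = rng.integers(0, h - crop + 1)   # random top-left corner
        x = rng.integers(0, w - crop + 1)
        shots.append(img[y:y + crop, x:x + crop])
    picked = rng.choice(len(shots), size=4, replace=True)
    a, b, c, d = (shots[i] for i in picked)
    top = np.concatenate([a, b], axis=1)        # 640 x 1280 strip
    bottom = np.concatenate([c, d], axis=1)
    return np.concatenate([top, bottom], axis=0)  # 1280 x 1280 combined image
```

The combined images would then be mixed with the originals to form the training input, as the step above states.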
Preferably, before the step of classifying, performing frame regression and mask network processing on the region of interest by the full-link layer according to the features of the region of interest to obtain a semantic classification result of the original image, so as to identify the target object in the original image, the method further includes:
the method has the advantages that the channel attention mechanism is added to the mask network in the full connection layer, attention can be improved for the target which is needed but not easy to recognize, and accuracy of model recognition is improved.
Preferably, the fully-connected layer performs classification, frame regression and mask network processing on the region of interest according to the features of the region of interest to obtain a semantic classification result of the original image so as to identify the target object in the original image, and the method includes:
inputting the region of interest into a full-connection layer, and classifying the region of interest according to the characteristics of the region of interest to obtain two outputs;
predicting the target object represented by each interested area through one of the outputs so as to classify different targets and obtain a target object prediction result;
performing frame regression on the target object represented by each region of interest through another output to obtain a candidate frame matching the size and the position of the target object, so that the model can identify the target object more accurately;
and according to the target object prediction result and the candidate frame, obtaining a semantic classification result of the original image by utilizing the mask network processing so as to identify the target object in the original image.
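The two fully connected outputs described above — a class prediction and a bounding-box regression — can be illustrated with a minimal sketch. The softmax and the (dx, dy, dw, dh) box parameterization are the standard R-CNN-family forms, assumed here rather than taken from the patent text:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over class scores."""
    e = np.exp(z - z.max())
    return e / e.sum()

def apply_bbox_deltas(box, deltas):
    """Standard box-regression parameterization (dx, dy, dw, dh) applied to
    an (x, y, w, h) candidate box, shifting its center and rescaling it."""
    x, y, w, h = box
    dx, dy, dw, dh = deltas
    return (x + dx * w, y + dy * h, w * np.exp(dw), h * np.exp(dh))
```

One output head would emit class scores passed through `softmax`; the other would emit per-class deltas passed through `apply_bbox_deltas` to fit the candidate box to the target's size and position.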
Preferably, after the step of classifying, performing frame regression and mask network processing on the region of interest by the fully-connected layer according to the features of the region of interest to obtain a semantic classification result of the original image, so as to identify the target object in the original image, the method further includes:
fine-tuning hyper-parameters of the ResNet network based on the accuracy of the semantic classification result, wherein the hyper-parameters include the activation function, the learning rate, and the optimizer. Because different networks suit different activation functions, learning rates, and optimizers, the most appropriate hyper-parameters are found through repeated debugging, and the optimal semantic classification result of the original image is finally output on a test set.
In addition, in order to achieve the above object, the present invention further provides a target recognition device for an image of a cloud federation, which includes a memory, a processor, and a target recognition program for an image of a cloud federation, stored on the memory and executable on the processor, wherein the target recognition program for an image of a cloud federation is executed by the processor to implement the steps of the target recognition method for an image of a cloud federation.
In addition, in order to achieve the above object, the present invention further provides a storage medium, on which an object recognition program in an image for cloud federation is stored, wherein the object recognition program in an image for cloud federation, when executed by a processor, implements the steps of the object recognition method in an image for cloud federation.
In order to achieve the above object, the present invention also provides an object recognition apparatus for use in an image of the cloud federation, including:
the image processing module is used for carrying out Random-Batch images processing on the original image to obtain a processed image;
the feature extraction module is used for fusing the processed image and the original image and inputting the fused image into a Resnet network for feature extraction to obtain a feature map;
the characteristic fusion module is used for inputting the characteristic graph into a bidirectional characteristic graph pyramid network to perform deep characteristic graph fusion to obtain a characteristic graph with stronger semantic expression capability;
the interesting region selection module is used for inputting the feature map with stronger semantic expression capability into a region generation network to generate a plurality of candidate frames, inputting the candidate frames into an ROI Align network layer and screening an interesting region;
the correlation establishing module is used for mapping the region of interest to the feature map with stronger semantic expression capability, acquiring the features of the region of interest and establishing correlation information between the region of interest and the corresponding features;
and the classification module is used for classifying the region of interest, performing frame regression and mask network processing on the region of interest through a full connection layer according to the associated information to obtain a semantic classification result of the original image so as to identify a target object in the original image.
The invention has the beneficial effects that:
(1) An innovative Random-Batch images processing step is added before the original image is input; the spliced images are mixed with the original images as input for subsequent training, which improves the recognition of single small targets and raises model accuracy to a certain extent.
(2) The traditional FPN is replaced with Bi-FPN; the image features undergo complex bidirectional fusion, yielding a feature map that better expresses semantic features and achieving a better effect in fine-grained feature extraction.
(3) A channel attention mechanism is added; it computes the correlation between each channel and the important features, channels with higher correlation receive more attention, and the accuracy of pixel-level classification is improved.
Drawings
FIG. 1 is a flow chart of a military target identification method for aerial images of the cloud federation in accordance with an embodiment of the present invention;
FIG. 2 is a flowchart of the Random-Batch images process according to an embodiment of the present invention;
FIG. 3 is a structural diagram of Bi-FPN according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a flowchart of a military target identification method for an aerial image of cloud federation according to an embodiment of the present invention, and the embodiment of the present invention provides a military target identification method for an aerial image of cloud federation, including the following steps:
s1, carrying out Random-Batch images processing on the original aerial images to obtain processed images;
s2, fusing the processed image and the original aerial image, inputting the fused image into a ResNet50/101 network, and performing feature extraction to obtain a feature map;
s3, inputting the Feature Map into a bidirectional Feature Map pyramid network (Bi-FPN) for deep Feature Map fusion to obtain a Feature Map (Feature Map) with stronger semantic expression capability;
s4, inputting the feature map with stronger semantic expression ability into a region generation network, generating a plurality of candidate frames (Propusals), and selecting a region of interest (ROI) from the candidate frames;
s4, inputting the feature map with stronger semantic expression ability into a region generation network (RPN), generating a plurality of candidate frames (Propusals), inputting the candidate frames into an ROI Align network layer, and screening out a region of interest (ROI);
s5, mapping the region of interest (ROI) to the feature map with stronger semantic expression capability, obtaining the features of the region of interest (ROI), and establishing the associated information between the region of interest (ROI) and the corresponding features;
s6, the full connection layer carries out classification prediction (Cls _ prob), frame regression (Bbox Reg) and Mask network (Mask) processing on the region of interest according to the correlation information to obtain a semantic classification result of the original aerial image so as to identify a military target object in the original image;
the specific steps of S6 are: inputting the region of interest into a full-link layer, and classifying the region of interest (ROI) according to the characteristics corresponding to the region of interest to obtain two outputs;
applying normalization (Softmax) and classification prediction (Cls_prob) to the target object represented by each region of interest through one of the outputs, so as to classify the different targets and obtain a target object prediction result;
performing frame regression (Bbox Reg) on the target object represented by each region of interest through another output to obtain a candidate frame matching the size and the position of the target object, so that the model can identify the target object more accurately;
according to the target object prediction result and the candidate frame, the mask network is utilized for processing, wherein a channel Attention mechanism (Attention) is added to the mask network in the full connection layer, the Attention of a target which is needed but not easy to be identified can be improved by adding the channel Attention mechanism, the accuracy of model identification is improved, and finally the semantic classification result of the original image is obtained so as to identify the target object in the original image.
S7, fine-tuning hyper-parameters of the ResNet network based on the accuracy of the semantic classification result, wherein the hyper-parameters include the activation function, the learning rate, and the optimizer; the hyper-parameters are repeatedly debugged to find the most appropriate values, and the optimal semantic classification result of the original image is finally output on the test set.
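The "repeated debugging" in S7 is not specified further; one common realization is an exhaustive search over the named hyper-parameters. The sketch below assumes a hypothetical `train_and_eval` callback that trains with one setting and returns test-set accuracy:

```python
import itertools

def grid_search(train_and_eval, learning_rates, activations, optimizers):
    """Hedged sketch of the repeated-debugging step: try every combination
    of the hyper-parameters named in S7 and keep the best-scoring one.
    train_and_eval(lr, activation, optimizer) -> accuracy is hypothetical."""
    best_setting, best_acc = None, -1.0
    for lr, act, opt in itertools.product(learning_rates, activations, optimizers):
        acc = train_and_eval(lr, act, opt)
        if acc > best_acc:
            best_setting, best_acc = (lr, act, opt), acc
    return best_setting, best_acc
```

In practice the patent's authors may have tuned by hand rather than by grid; this block only makes the search loop concrete.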
In the aerial images of military docks, compared with large ships, bridges, playgrounds and the like, a plurality of small targets such as containers, small ships and the like are difficult to identify through a traditional Mask-RCNN algorithm, and therefore innovative Random-Batch images are added to a data set in the process of inputting the images.
Referring to FIG. 2, FIG. 2 is a flowchart of the Random-Batch images process according to an embodiment of the present invention. The Random-Batch images processing of the original image specifically comprises the following steps:
For each aerial image to be input, a 640 × 640 screenshot frame is used to intercept the target object in the 1280 × 1280 original aerial image, so each image yields a 640 × 640 screenshot. Each time, 4 screenshots are randomly selected and randomly spliced into a combined complete image. The combined complete images are mixed with the original aerial images as the subsequent input, and the mixed images are fed into the ResNet50/101 network for training, which improves the recognition of single small targets and raises model accuracy to a certain extent.
In order to identify small targets at finer granularity: in the Proposal (candidate box) generation process, the Mask-RCNN network uses the FPN (feature pyramid network), which is insufficient for fine-grained feature identification, so the present method uses the Bi-FPN (bidirectional feature pyramid network) instead.
Referring to FIG. 3, FIG. 3 is a structural diagram of the Bi-FPN according to an embodiment of the present invention. The Bi-FPN performs complex bidirectional fusion on the basis of the FPN. The feature maps contain both shallow and deep information of a picture, but the FPN (feature pyramid network) simply outputs the information of each layer; the Bi-FPN (bidirectional feature map pyramid network) instead fuses information from different layers through convolutional neural networks with different convolution kernels, and to strengthen the fusion of each layer's information, the Bi-FPN network stacks the repeated Block (repeating unit) 3 times. The output of each layer therefore fuses information from different layers of the picture, yielding feature maps with stronger semantic expression capability and a better effect for fine-grained feature extraction.
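The core fusion operation inside a Bi-FPN block can be made concrete. The sketch below shows the fast normalized weighted fusion used in the published BiFPN design — an assumption here, since the patent does not state which fusion rule its Bi-FPN uses — combining already-resized feature maps with learnable non-negative weights:

```python
import numpy as np

def fast_normalized_fusion(feats, weights, eps=1e-4):
    """BiFPN-style weighted fusion of same-shape feature maps:
    out = sum_i(w_i * f_i) / (sum_j w_j + eps), with w_i kept >= 0.
    feats: list of equally shaped arrays; weights: one scalar per input."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU on weights
    w = w / (w.sum() + eps)                                # normalize
    return sum(wi * f for wi, f in zip(w, feats))
```

In a full block, each pyramid level would fuse its top-down and bottom-up neighbors this way, and the block would be stacked 3 times as described above.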
In addition, a channel attention mechanism is added to the mask network in the fully connected layer; adding it raises attention on targets that need to be identified but are difficult to recognize, improving the accuracy of model recognition.
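The patent names the channel attention mechanism but not its exact form. A minimal sketch of one common realization (an SE-style squeeze-and-excite, assumed rather than taken from the patent) is:

```python
import numpy as np

def channel_attention(x, w1, w2):
    """SE-style channel attention sketch: squeeze each channel to a scalar
    by global average pooling, pass through two small fully connected
    layers (ReLU then sigmoid), and rescale the channels by the result.
    x: (C, H, W); w1: (C//r, C); w2: (C, C//r) for reduction ratio r."""
    c = x.shape[0]
    squeeze = x.reshape(c, -1).mean(axis=1)          # (C,) per-channel mean
    hidden = np.maximum(w1 @ squeeze, 0.0)           # ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # sigmoid gate, (C,)
    return x * scale[:, None, None]                  # rescale channels
```

Channels whose gate value is driven toward 1 pass through almost unchanged, while low-correlation channels are suppressed — the "more attention to higher-correlation channels" behavior the beneficial effects describe.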
The invention is mainly based on fine-grained detection of aerial military wharf targets in the AI rocket military competition project. Under ordinary conditions the detection targets are obvious, relatively large, and relatively few, so a Mask-RCNN model can achieve good results. In wharf remote sensing images, however, the images are not very clear, the targets are blurred and of varying sizes, and more targets need to be detected, so the traditional Mask-RCNN performs poorly, while the accuracy of the improved Mask-RCNN model is significantly higher on the fine-grained wharf remote-sensing detection task. As shown in Table 1, the traditional Mask-RCNN model achieves an mAP of only 54.765 on the competition test data; after adding Random-Batch images the mAP reaches 58.652; after further changing the FPN to Bi-FPN the mAP reaches 64.157; and after also adding the channel attention mechanism the final mAP reaches 68.227, placing the result in the top 20% of all competing teams.
In addition, the embodiment of the invention also provides military target identification equipment for the aerial image of the cloud federation, which comprises a memory, a processor and a military target identification program for the aerial image of the cloud federation, wherein the military target identification program is stored in the memory and can be operated on the processor, and the military target identification program for the aerial image of the cloud federation realizes the steps of the military target identification method for the aerial image of the cloud federation when being executed by the processor.
In addition, the specific embodiment of the invention also provides a storage medium, wherein a military target identification method program for the aerial image of the cloud federation is stored on the storage medium, and the military target identification method program for the aerial image of the cloud federation realizes the steps of the military target identification method for the aerial image of the cloud federation when being executed by a processor.
In addition, a military target recognition device for aerial images of cloud federation is further provided in the specific embodiment of the present invention, and the military target recognition device for aerial images of cloud federation includes:
the image processing module is used for carrying out Random-Batch images processing on the original image to obtain a processed image;
the feature extraction module is used for fusing the processed image and the original image and inputting the fused image into a Resnet network for feature extraction to obtain a feature map;
the characteristic fusion module is used for inputting the characteristic graph into a bidirectional characteristic graph pyramid network to perform deep characteristic graph fusion to obtain a characteristic graph with stronger semantic expression capability;
the interesting region selection module is used for inputting the feature map with stronger semantic expression capability into a region generation network to generate a plurality of candidate frames, inputting the candidate frames into an ROI Align network layer and screening an interesting region;
the correlation establishing module is used for mapping the region of interest to the feature map with stronger semantic expression capability, acquiring the features of the region of interest and establishing correlation information between the region of interest and the corresponding features;
and the classification module is used for classifying the region of interest, performing frame regression and mask network processing on the region of interest through a full connection layer according to the associated information to obtain a semantic classification result of the original image so as to identify a target object in the original image.
The beneficial effects brought by the specific embodiment of the invention are as follows:
(1) An innovative Random-Batch images processing step is added before the original aerial images are input; the spliced images are mixed with the original images as input for subsequent training, which improves the recognition of single small targets and raises model accuracy to a certain extent.
(2) The traditional FPN is replaced with Bi-FPN; the image features undergo complex bidirectional fusion, yielding a feature map that better expresses semantic features and achieving a better effect in fine-grained feature extraction.
(3) A channel attention mechanism is added; it computes the correlation between each channel and the important features, channels with higher correlation receive more attention, and the accuracy of pixel-level classification is improved.
TABLE 1 Comparison of recognition results for various models

Model                                                  mAP
Mask-RCNN (traditional)                                54.765
+ Random-Batch images                                  58.652
+ Random-Batch images + Bi-FPN                         64.157
+ Random-Batch images + Bi-FPN + channel attention     68.227
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A target object recognition method used in an image of a cloud federation is characterized by comprising the following steps:
carrying out Random-Batch images processing on the original image to obtain a processed image;
fusing the processed image and the original image, inputting the fused image into a ResNet network, and performing feature extraction to obtain a feature map;
inputting the feature map into a bidirectional feature map pyramid network for deep feature map fusion to obtain a feature map with stronger semantic expression capability;
inputting the feature map with stronger semantic expression capability into a region generation network to generate a plurality of candidate frames;
inputting the candidate boxes into an ROI Align network layer, and screening out a region of interest;
mapping the region of interest to the feature map with stronger semantic expression capability to obtain the features of the region of interest;
and the full connection layer classifies the region of interest, performs frame regression and mask network processing on the region of interest according to the characteristics of the region of interest to obtain a semantic classification result of the original image so as to identify a target object in the original image.
2. The method for identifying the target object in the image for the cloud federation of claim 1, wherein the performing Random-Batch images processing on the original image to obtain a processed image comprises:
and randomly intercepting 1/4 images containing the target object for each image in the original images, and randomly splicing four 1/4 images containing the target object to obtain processed images.
3. The method for identifying a target object in an image of a cloud federation according to claim 1, wherein before the step of classifying the region of interest through the fully connected layer and performing bounding box regression and mask network processing on the region of interest according to the features of the region of interest to obtain the semantic classification result of the original image so as to identify the target object in the original image, the method further comprises:
adding a channel attention mechanism to the mask network in the fully connected layer.
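Claim 3 does not fix the form of the channel attention; a common choice is a squeeze-and-excitation style gate, sketched below in numpy with hypothetical learned weights `w1` and `w2`: global-average-pool each channel, pass the result through a small two-layer bottleneck, and rescale the channels by the resulting sigmoid gate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feature, w1, w2):
    """Squeeze-and-excitation style channel attention over a (C, H, W) feature map.
    w1 has shape (C // r, C) and w2 has shape (C, C // r) for a reduction ratio r."""
    squeeze = feature.mean(axis=(1, 2))                    # global average pool -> (C,)
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))   # ReLU bottleneck, sigmoid gate
    return feature * excite[:, None, None]                 # reweight each channel
```

With zero-initialised weights every gate is sigmoid(0) = 0.5, i.e. all channels are scaled equally; training is what differentiates them.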
4. The method for identifying a target object in an image of a cloud federation according to claim 1, wherein classifying the region of interest through the fully connected layer and performing bounding box regression and mask network processing on the region of interest according to the features of the region of interest to obtain a semantic classification result of the original image so as to identify the target object in the original image comprises:
inputting the features of the region of interest into a fully connected layer, and classifying the region of interest according to the features of the region of interest to obtain two outputs;
predicting the target object represented by each region of interest through one of the outputs to obtain a target object prediction result;
performing bounding box regression on the target object represented by each region of interest through the other output to obtain a candidate box matching the size and position of the target object;
and obtaining, according to the target object prediction result and the candidate box, a semantic classification result of the original image through the mask network processing so as to identify the target object in the original image.
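The bounding box regression output of claim 4 is typically parameterised as centre offsets plus log scale factors applied to the candidate box (the standard R-CNN parameterisation; the patent does not say which form it uses). A numpy sketch of decoding such deltas:

```python
import numpy as np

def apply_box_deltas(box, deltas):
    """Refine a candidate box (y1, x1, y2, x2) with regression deltas
    (dy, dx, log_dh, log_dw): shift the centre by a fraction of the box
    size, then rescale height and width exponentially."""
    y1, x1, y2, x2 = box
    h, w = y2 - y1, x2 - x1
    cy, cx = y1 + 0.5 * h, x1 + 0.5 * w
    dy, dx, dh, dw = deltas
    cy, cx = cy + dy * h, cx + dx * w
    h, w = h * np.exp(dh), w * np.exp(dw)
    return (cy - 0.5 * h, cx - 0.5 * w, cy + 0.5 * h, cx + 0.5 * w)
```

Zero deltas leave the candidate box unchanged, which is why the regression head is usually initialised near zero.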
5. The method for identifying a target object in an image of a cloud federation according to claim 1, wherein after the step of classifying the region of interest through the fully connected layer and performing bounding box regression and mask network processing on the region of interest according to the features of the region of interest to obtain the semantic classification result of the original image so as to identify the target object in the original image, the method further comprises:
fine-tuning the hyperparameters of the ResNet network based on the accuracy of the semantic classification result, and outputting an optimal semantic classification result of the original image.
6. The method for identifying a target object in an image of a cloud federation according to claim 5, wherein the hyperparameters comprise at least one of a learning rate, an activation function, and an optimizer.
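The fine-tuning of claims 5 and 6 can be illustrated by a simple grid search over the listed hyperparameters; `toy_evaluate` below is a stand-in for training and validating the model under a configuration and is not from the patent.

```python
import itertools

def fine_tune(evaluate, grid):
    """Try every hyperparameter combination in `grid` (a dict of
    name -> candidate values) and return the setting with the best accuracy."""
    names = list(grid)
    best_cfg, best_acc = None, float("-inf")
    for values in itertools.product(*(grid[n] for n in names)):
        cfg = dict(zip(names, values))
        acc = evaluate(cfg)  # in practice: train and validate the model under cfg
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc

def toy_evaluate(cfg):
    """Stand-in accuracy function for illustration only."""
    return {"relu": 0.9, "gelu": 0.92}[cfg["activation"]] - abs(cfg["lr"] - 1e-3)
```

Exhaustive search is only feasible for small grids; random search or Bayesian optimisation are common replacements when the grid grows.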
7. A target object identification device for use in an image of a cloud federation, the device comprising:
an image processing module, configured to perform Random-Batch image processing on an original image to obtain a processed image;
a feature extraction module, configured to fuse the processed image and the original image and input the fused image into a ResNet network for feature extraction to obtain a feature map;
a feature fusion module, configured to input the feature map into a bidirectional feature pyramid network for deep feature fusion to obtain a feature map with stronger semantic expression capability;
a region-of-interest selection module, configured to input the feature map with stronger semantic expression capability into a region generation network to generate a plurality of candidate boxes, input the candidate boxes into a ROIAlign network layer, and screen out a region of interest;
a correlation establishing module, configured to map the region of interest to the feature map with stronger semantic expression capability, acquire the features of the region of interest, and establish correlation information between the region of interest and the corresponding features;
and a classification module, configured to classify the region of interest and perform bounding box regression and mask network processing on the region of interest through a fully connected layer according to the correlation information, to obtain a semantic classification result of the original image so as to identify a target object in the original image.
CN202011641087.2A 2020-12-31 2020-12-31 Target object identification method and device used in image of cloud federation Pending CN112733686A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011641087.2A CN112733686A (en) 2020-12-31 2020-12-31 Target object identification method and device used in image of cloud federation


Publications (1)

Publication Number Publication Date
CN112733686A true CN112733686A (en) 2021-04-30

Family

ID=75609051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011641087.2A Pending CN112733686A (en) 2020-12-31 2020-12-31 Target object identification method and device used in image of cloud federation

Country Status (1)

Country Link
CN (1) CN112733686A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909642A (en) * 2019-11-13 2020-03-24 南京理工大学 Remote sensing image target detection method based on multi-scale semantic feature fusion
CN111696077A (en) * 2020-05-11 2020-09-22 余姚市浙江大学机器人研究中心 Wafer defect detection method based on wafer Det network
CN111985316A (en) * 2020-07-10 2020-11-24 上海富洁科技有限公司 A pavement garbage perception method for road intelligent cleaning
CN112016510A (en) * 2020-09-07 2020-12-01 平安国际智慧城市科技股份有限公司 Signal lamp identification method and device based on deep learning, equipment and storage medium


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989632A (en) * 2021-09-13 2022-01-28 西安电子科技大学 Bridge detection method and device for remote sensing image, electronic equipment and storage medium
CN114202693A (en) * 2021-12-10 2022-03-18 深圳市旗扬特种装备技术工程有限公司 Illumination intensity identification method and system, electronic equipment and medium
CN114399643A (en) * 2021-12-13 2022-04-26 阿里巴巴(中国)有限公司 Image processing method, storage medium, and computer terminal
CN114399643B (en) * 2021-12-13 2025-09-02 阿里巴巴(中国)有限公司 Image processing method, storage medium and computer terminal
CN115100486A (en) * 2022-06-07 2022-09-23 南京邮电大学 Chinese herbal medicine identification method based on time-related feature mining in smart medical scene
CN115512248A (en) * 2022-09-29 2022-12-23 曲阜师范大学 Unmanned aerial vehicle aerial photography water surface floater identification method based on neural network
CN115512248B (en) * 2022-09-29 2025-08-22 曲阜师范大学 A method for identifying floating objects on the water surface from UAV aerial photography based on neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210430)