
CN109977970A - Character recognition method under water conservancy project complex scene based on saliency detection - Google Patents

Character recognition method under water conservancy project complex scene based on saliency detection

Info

Publication number
CN109977970A
Authority
CN
China
Prior art keywords
water conservancy project
character recognition
saliency
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910240747.7A
Other languages
Chinese (zh)
Inventor
孙丰 (Sun Feng)
卢克 (Lu Ke)
马艳娜 (Ma Yanna)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Water Resources and Electric Power
Original Assignee
Zhejiang University of Water Resources and Electric Power
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Water Resources and Electric Power
Priority to CN201910240747.7A
Publication of CN109977970A
Status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/25 Fusion techniques
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a character recognition method under a water conservancy project complex scene based on saliency detection. Addressing the partial loss of regional semantic information that occurs in some saliency (person) detection models, a new strongly supervised saliency detection method is proposed. The model used is divided into two layers. The first layer mainly uses a multi-level fully convolutional neural network to grab, at the pixel level, the global semantic information and local feature information of the salient person, and marks out a coarse salient person. The second layer introduces, through short connections, the shallow-level saliency features generated during the first layer's operation and fuses them with the coarse saliency map, so as to recover the lost feature information and reinforce the boundary features of the salient object. For the data input, the highlighted version and the de-highlighted version of the original picture are both chosen as inputs. The designed model performs excellently on a dataset of person pictures under water conservancy project complex scenes randomly collected from the network.

Description

Character recognition method under water conservancy project complex scene based on saliency detection
Technical field
The present invention relates to the field of computer vision, and is primarily directed to the saliency detection of persons under water conservancy project complex scenes.
Background art
Against the increasingly important historical background of flood-control informatization, in weather such as typhoons and spring tides, how to prevent, and promptly discover, personnel illegally entering water engineering facilities (seawalls, reservoirs, dykes, etc.) is an especially urgent supervision problem. Manually checking video monitoring is widely practised, but its efficiency and cost leave room for improvement. If an automatic discovery approach can be adopted so that machines assist the staff, target detection of persons under water conservancy project complex scenes becomes very important. Such a demand can be met with person saliency detection technology. Although some technologies on the market can detect the persons in some scenes, they have the following shortcomings: (1) the scene cannot be overly complex; once the scene contains many complex elements, non-salient objects with high contrast and a large area in the picture easily cause detection to fail; (2) the contour of the detected person is unclear, sometimes very blurry, losing part of the global semantic information; (3) some existing saliency detection models cannot effectively exclude elements in the complex background of a scene, such as the glistening water surface or high-contrast hills; (4) existing saliency detection models cannot effectively remove or identify low-contrast impurity elements close to the person in images shot in water conservancy scenes; (5) when existing saliency detection models face actual conditions, the results obtained deviate from reality.
After observing a large number of person pictures under water conservancy project complex scenes, it is found that these pictures can be divided into the following three classes: (1) small objects, where the salient person occupies less than 10% of the full picture; such pictures account for 80% of all detection pictures; (2) large objects, where the person occupies more than 50% of the full picture; such pictures are relatively few; (3) complex backgrounds, where the shot image contains not only the person subject but also salient objects such as riverside dykes, distant hills, and river-bank junctions. How to detect the persons in an image, and especially the difficult problem of small-object detection, is a problem in urgent need of a solution.
Summary of the invention
In order to overcome the above deficiencies of the prior art, the present invention provides a character recognition method under a water conservancy project complex scene based on saliency detection. Addressing the facts that most current person detection methods cannot effectively fuse global semantic information with local feature information, and cannot effectively detect the noise present in pictures, a deep semantic-information fusion model based on multi-level short connections is proposed. It can effectively use local feature information to reduce the salient-object detection failures caused by partial loss of global semantic information during detection, while enhancing the marking of salient objects; noise and large-volume non-salient objects are effectively removed.
The technical scheme adopted by the invention is: acquire the global semantic information and the local feature information of the picture, and use a multi-level short-connection model to deeply fuse the two kinds of information so that they complement each other, thereby reducing information loss during the detection process.
Compared with the prior art, the beneficial effect of the invention is that global semantic information and local feature information are combined, reducing the loss of global semantic information during saliency detection and thus its influence on the final saliency result map.
Description of the drawings
Fig. 1 is the artificial neural network structure diagram used by the present invention;
Fig. 2 is a schematic flow diagram of the pixel-level neural network;
Fig. 3 is an auxiliary diagram of the pixel-level neural network specifications;
Fig. 4 is a schematic contrast of the original image, the de-highlighted image and the highlighted image.
Specific embodiment
The present invention will be further explained below with reference to the attached drawings.
The model of the invention is built on the Caffe deep learning framework.
The first step is to design an end-to-end multi-level neural network that maps an input picture directly to the required pixel-level saliency detection map. With that in mind: (1) the model must first produce multi-level saliency maps in order to grab global semantics or local features at different levels; (2) the model needs sufficient depth to capture the specific information of the picture and its hidden contextual contrast information. As shown in Fig. 1, a pixel-level model is designed. The initial picture (size 256 × 256 pixels) undergoes the highlight and de-highlight operations and is then input into a deep fully convolutional neural network, which generates a result map at each of several convolutional layers. The sizes of these convolutions and the depth of the network follow the pixel-level stream of DCL (Deep Contrast Learning), and VGG-16 is selected as the backbone of the model, as shown in Fig. 2.
Before the pictures are input, highlight and highlight-removal operations must also be applied to them. In order to grab the complementary semantic information in the image, the original image (RGB) is converted into a highlighted image and a de-highlighted image. The conversion formula takes the original image X, the pixel average M of the image dataset, and a hyperparameter K, commonly set to 1. From the formula it can be seen that the converted images X_O and X̄ are complementary to each other. X_O and X̄ are used as the two input sources of the model. The specific effect is shown in Fig. 4.
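The conversion formula itself is not reproduced legibly in this text. Purely as a minimal sketch, the following Python code assumes one construction consistent with the description (original image X, dataset pixel mean M, hyperparameter K, and a complementary output pair): a reflection of the image about the dataset mean. The function name and the formula are assumptions, not the patent's verbatim method.

```python
import numpy as np

def highlight_pair(x: np.ndarray, m: np.ndarray, k: float = 1.0):
    """Produce a highlighted / de-highlighted image pair from RGB image x.

    x : H x W x 3 float array (original image X)
    m : per-channel pixel mean of the dataset (M)
    k : hyperparameter K, commonly 1 per the description

    NOTE: the exact formula is not recoverable from the translated text;
    reflecting the image about the dataset mean is one plausible,
    complementary construction, assumed here for illustration only.
    """
    x_hi = np.clip(x + k * (x - m), 0, 255)  # highlight-enhanced X_O (assumed)
    x_de = np.clip(x - k * (x - m), 0, 255)  # de-highlighted X-bar (assumed)
    return x_hi, x_de
```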
The pixel-level model mainly has 5 pooling layers (max pooling) and 10 convolutional blocks (each block containing 2 to 4 independent convolutional layers). Data is transmitted downward from the first convolutional block to the fifth; during this convolutional transmission, each block extracts the features it is able to recognize. For example, the first convolutional block focuses on local feature information; at this stage it can eliminate locally non-salient noise and impurities while retaining the boundary of the salient object well, so that the picture still remains at a high pixel resolution. After the first block, the pooling layers act to shrink the transmitted data while also extracting the relatively salient information around each pixel. Each subsequent block extracts new information from the data passed down by the block above, and because of the pooling layers the cross-sectional size of the transmitted data shrinks level by level. The fifth block extracts the global information of the whole image, focusing on grabbing the position of the salient object while ignoring more of the local feature values. After all down-sampling operations are complete, since the images generated by the convolutional blocks must later be fused, all pictures must be up-sampled (i.e. deconvolved) to the same size; in this example, the saliency maps generated at all levels must be up-sampled to 225 × 225 pixels. The specification of each convolutional block is shown in Fig. 3. The series of processes described above can be summarized with the mathematical function:
f_s(X; W, b) = Pooling(σ(W *_s X + b))
In the above formula, X is the original input picture; W and b respectively represent the convolution kernel and the convolution bias; *_s represents the convolution operation with stride s; σ represents the rectified linear unit (Rectified Linear Unit, ReLU); Pooling represents the pooling operation, referring here specifically to max pooling (Max Pooling). The result f_s(X; W, b) is what the original data yields after the down-sampling operation carried out according to the parameter s.
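The patent implements the network in Caffe; purely to make the formula above concrete, the following NumPy sketch evaluates one f_s block (strided valid convolution, ReLU, then 2 × 2 max pooling) on a single-channel image. The shapes and parameter values are illustrative assumptions.

```python
import numpy as np

def conv2d(x, w, b, stride=1):
    """Valid strided 2-D convolution of single-channel x with kernel w, plus bias b."""
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * w) + b
    return out

def relu(x):
    return np.maximum(x, 0.0)  # sigma: rectified linear unit

def max_pool(x, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = x.shape[0] // size * size, x.shape[1] // size * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

def f_s(x, w, b, stride=1):
    """f_s(X; W, b) = Pooling(sigma(W *_s X + b)) for one feature channel."""
    return max_pool(relu(conv2d(x, w, b, stride)))

# Illustrative use on a random 8x8 "image" with a 3x3 kernel.
x = np.random.rand(8, 8)
w = np.random.randn(3, 3)
print(f_s(x, w, b=0.1).shape)  # (3, 3): 8-3+1 = 6 after conv, 3 after 2x2 pooling
```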
The corresponding up-sampling step can be written as:

f̂(X; θ, θ̂) = Deconv_ŝ(f_s(X; θ); θ̂)

In this formula, X still represents the original input picture; f_s(X; θ) represents the feature map generated under the action of stride s and parameters θ; f̂(X; θ, θ̂) represents the feature map generated by deconvolution with up-sampling stride ŝ and parameters θ̂, whose size is guaranteed to be identical to that of X. Unlike a conventional bilinear interpolation, however, the up-sampling operation here participates in the supervised learning process and must be continuously refined over the iterations.
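The patent does not state how the learnable deconvolution filters are initialized. One common practice in FCN-style models, shown here only as a hedged illustration rather than as this patent's method, is to initialize them with a bilinear interpolation kernel and let supervised training refine it:

```python
import numpy as np

def bilinear_kernel(size: int) -> np.ndarray:
    """Bilinear up-sampling kernel of shape (size, size): the standard
    initialization for a learnable deconvolution filter in FCN-style nets."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

print(bilinear_kernel(4))  # 4x4 kernel for 2x up-sampling (stride-2 deconv)
```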
As described above, the saliency map generated at the pool1 level has a clear salient-object contour but at the same time carries more impurities and noise. The maps generated by the network after pool5 and the other deep levels have grabbed the global saliency information, but in some pictures part of the global saliency information may be lost. Therefore, in order to better integrate the different multi-level features, and also to make up for the losses in the pixel-level saliency maps, the result maps generated at all the different levels are added together numerically and averaged, fusing them into one saliency detection result map; this is the FUSE operation in Fig. 2. The operation is described by the formula:

S_fuse = (1/N) Σ_{i=1..N} S_i

where N is the number of saliency maps obtained from the different convolution-pooling levels of the first layer, S_i is the saliency map obtained from the i-th convolution-pooling level, and S_fuse is the fused result.
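As a minimal sketch of the FUSE averaging (array names are assumed):

```python
import numpy as np

def fuse(saliency_maps):
    """Average fusion of N same-sized saliency maps:
    S_fuse = (1/N) * sum(S_i), i.e. the FUSE operation of Fig. 2."""
    return np.mean(np.stack(saliency_maps, axis=0), axis=0)

# e.g. fusing six 225 x 225 level maps into one 225 x 225 map
level_maps = [np.random.rand(225, 225) for _ in range(6)]
print(fuse(level_maps).shape)  # (225, 225)
```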
This fused saliency map has a clearer boundary than the saliency map S_5 obtained by up-sampling after the fifth pooling, but it still contains more noise. It therefore enters the second layer, which denoises it and reinforces the salient object. Observation shows that the saliency map S_1, up-sampled after the first pooling, has clear boundaries close to the pixel level of the original image. The natural idea is to hope that S_fuse can better extract the clear boundaries of S_1 to remedy S_fuse's blurred-boundary problem. Therefore, using an end connection, S_1 and S_fuse are stacked longitudinally, and the stacked image then undergoes three max-pooling operations and three convolution operations. The purpose of this is that the pooling is expected to remove the unnecessary impurities in S_1 and reinforce the marking of the salient object; and because the degree of the three poolings is small, the clarity of S_1's boundary is not affected. Each of the three poolings has stride 2 and a pooling window of 2 × 2 pixels. The three operations respectively generate the saliency maps S_fuse11, S_fuse12 and S_fuse13. The same additive average fusion described above is then applied to these three maps, giving the saliency map S_fuse2. The iteration continues on S_fuse2, as shown in Fig. 1, finally producing the saliency result map.
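The following NumPy sketch schematically mirrors this second-layer step under stated assumptions: the "longitudinal stacking" is approximated by averaging the two maps, simple smoothing kernels stand in for the learned convolutions, and nearest-neighbour enlargement is assumed for bringing the three intermediate maps back to a common size (a detail the text does not specify):

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling with stride 2 (window 2x2, stride 2 per the description)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def smooth_conv(x):
    """Stand-in 3x3 averaging convolution; the real model uses learned filters."""
    k = np.ones((3, 3)) / 9.0
    pad = np.pad(x, 1, mode='edge')
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(pad[i:i + 3, j:j + 3] * k)
    return out

def refine(s1, s_fuse):
    """Second-layer sketch: combine S_1 with S_fuse, run three pool+conv stages
    (S_fuse11..13), then average-fuse the stages into S_fuse2."""
    x = np.stack([s1, s_fuse]).mean(axis=0)  # stand-in for longitudinal stacking
    stages = []
    for _ in range(3):
        x = smooth_conv(max_pool2(x))
        stages.append(x)                     # S_fuse11, S_fuse12, S_fuse13
    # Enlarge (nearest-neighbour repeat, assumed) to a common size and average.
    th, tw = stages[0].shape
    resized = [np.kron(s, np.ones((th // s.shape[0], tw // s.shape[1])))
               for s in stages]
    return np.mean(np.stack(resized), axis=0)  # S_fuse2
```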
Finally, the saliency maps are evaluated against the ground truth with a cross-entropy loss function (Cross-Entropy Function), in which the weighting balances salient and non-salient pixels; expressed mathematically:

L(W) = −β_i Σ_{i∈I+} log Pr(y_i = 1 | X; W) − (1 − β_i) Σ_{i∈I−} log Pr(y_i = 0 | X; W)

where G is the ground-truth map (Ground Truth, GT), which supplies the pixel labels y_i; W represents the set of network parameters; β_i is the balance weight parameter; |I| represents the set of all pixel points in the picture; |I|− is the set of non-salient pixels; |I|+ is the set of salient pixels; and β_i = |I|− / |I|.
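Under the loss reconstructed above (class-balanced cross-entropy with β = |I|−/|I|, matching the cited Hou et al. formulation; the exact formula is an assumption where this text is garbled), a minimal NumPy sketch:

```python
import numpy as np

def balanced_cross_entropy(pred, gt, eps=1e-7):
    """Class-balanced cross-entropy between a predicted saliency map `pred`
    (values in (0,1)) and a binary ground-truth map `gt` (G),
    with beta = |I-| / |I| as in the description."""
    pred = np.clip(pred, eps, 1 - eps)
    beta = (gt == 0).mean()                                   # fraction of non-salient pixels
    pos = -beta * np.sum(gt * np.log(pred))                   # salient pixels I+
    neg = -(1 - beta) * np.sum((1 - gt) * np.log(1 - pred))   # non-salient pixels I-
    return pos + neg

gt = (np.random.rand(225, 225) > 0.9).astype(float)  # sparse salient mask
pred = np.random.rand(225, 225)
print(balanced_cross_entropy(pred, gt))
```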
The model is trained on a collected dataset of person pictures whose size is unified to 225 × 225 pixels, using a batch size of 1 and stochastic gradient descent with a learning rate of 1e-8. The 200,000 training iterations took more than 24 hours. The method is implemented in Python on the Caffe framework. The GPU used is a Tesla M40 (12 GB).
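The solver configuration itself is not reproduced in the text. As a sketch under the stated hyperparameters (batch size 1, learning rate 1e-8, 200,000 iterations of stochastic gradient descent), a minimal Caffe training driver might look as follows; the file name solver.prototxt is an assumption.

```python
import caffe

caffe.set_device(0)   # single GPU (a Tesla M40 in the description)
caffe.set_mode_gpu()

# solver.prototxt is assumed to contain, per the description:
#   base_lr: 1e-8      # learning rate
#   max_iter: 200000   # 200,000 iterations
#   type: "SGD"        # stochastic gradient descent
# with the net's data layer using batch_size: 1 and 225x225 inputs.
solver = caffe.SGDSolver('solver.prototxt')
solver.solve()
```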
Through the repeated use of the fusion function realized above by pooling and convolution, the local feature information of the input original person picture is gradually merged on the basis of the global semantic information. In this process, the fusion of local feature information and global semantic information is not accomplished in one stroke, but gradually supplements the semantic information lost during pooling and convolution, such as the global semantic information of the person's body frame or other defects of the person subject. The model performs well on 6 existing SOD datasets (DUT-OMRON, ECSSD, HKU-IS, PASCAL-S, SED1, SED2).

Claims (5)

1. A character recognition method under a water conservancy project complex scene based on saliency detection, characterized in that a multi-level fully convolutional neural network is used to perform salient-object detection on pictures, and that in the detection process, specifically for the small-object case, global semantic features are acquired and local information is acquired as a supplement.
2. The character recognition method under a water conservancy project complex scene based on saliency detection according to claim 1, characterized in that: the acquired global semantic feature, i.e. the main position of the small-object person, serves as the basic information for judging the position of the salient object, and short connections are used to take the local information output by the shallow convolutional layers, i.e. the action details of the figure, as supplemental information to supplement the global semantics.
3. The character recognition method under a water conservancy project complex scene based on saliency detection according to claim 1, characterized in that: the model fuses the local information and the global semantic features using short connections.
4. The character recognition method under a water conservancy project complex scene based on saliency detection according to claim 2 or 3, characterized in that: the method detects the salient object in a picture with high sensitivity even when the proportion of the small-object person is low.
5. The character recognition method under a water conservancy project complex scene based on saliency detection according to claim 2 or 3, characterized in that the method has a high-intensity filtering capability for the complex backgrounds present in the scene.
CN201910240747.7A 2019-03-27 2019-03-27 Character recognition method under water conservancy project complex scene based on saliency detection Pending CN109977970A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910240747.7A CN109977970A (en) 2019-03-27 2019-03-27 Character recognition method under water conservancy project complex scene based on saliency detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910240747.7A CN109977970A (en) 2019-03-27 2019-03-27 Character recognition method under water conservancy project complex scene based on saliency detection

Publications (1)

Publication Number Publication Date
CN109977970A true CN109977970A (en) 2019-07-05

Family

ID=67081032

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910240747.7A Pending CN109977970A (en) 2019-03-27 2019-03-27 Character recognition method under water conservancy project complex scene based on saliency detection

Country Status (1)

Country Link
CN (1) CN109977970A (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106447658A (en) * 2016-09-26 2017-02-22 西北工业大学 Significant target detection method based on FCN (fully convolutional network) and CNN (convolutional neural network)
CN108509880A (en) * 2018-03-21 2018-09-07 南京邮电大学 A kind of video personage behavior method for recognizing semantics

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Qibin Hou et al.: "Deeply Supervised Salient Object Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112223288A (en) * 2020-10-09 2021-01-15 南开大学 Visual fusion service robot control method
CN112223288B (en) * 2020-10-09 2021-09-14 南开大学 Visual fusion service robot control method

Similar Documents

Publication Publication Date Title
CN116052016B (en) Fine segmentation and detection method of clouds and cloud shadows in remote sensing images based on deep learning
CN108171698B (en) Method for automatically detecting human heart coronary calcified plaque
CN111275696B (en) Medical image processing method, image processing method and device
CN111862143B (en) Automatic monitoring method for river dike collapse
CN109086824A (en) A kind of sediment sonar image classification method based on convolutional neural networks
CN109816012A (en) A multi-scale object detection method fused with context information
CN110232394A (en) A kind of multi-scale image semantic segmentation method
CN110349087B (en) RGB-D image high-quality grid generation method based on adaptive convolution
CN114663439A (en) Remote sensing image land and sea segmentation method
CN109886221A (en) Sand dredger recognition methods based on saliency detection
Pham et al. A new deep learning approach based on bilateral semantic segmentation models for sustainable estuarine wetland ecosystem management
CN112288776B (en) A Target Tracking Method Based on Multi-Time Step Pyramid Codec
CN113312993B (en) A PSPNet-based Land Cover Classification Method for Remote Sensing Data
CN111476723B (en) Remote sensing image lost pixel recovery method for failure of Landsat-7 scanning line corrector
CN112785629A (en) Aurora motion characterization method based on unsupervised deep optical flow network
CN114943893B (en) Feature enhancement method for land coverage classification
CN118072001B (en) Camouflaged target detection method based on scale feature perception and wide-range perception convolution
Ji et al. Domain adaptive and interactive differential attention network for remote sensing image change detection
CN109741340A (en) Refinement method of ice layer segmentation in ice sheet radar image based on FCN-ASPP network
CN111696033A (en) Real image super-resolution model and method for learning cascaded hourglass network structure based on angular point guide
CN117726954B (en) A method and system for segmenting land and sea in remote sensing images
CN114943894A (en) ConvCRF-based high-resolution remote sensing image building extraction optimization method
CN112906645B (en) Sea ice target extraction method with SAR data and multispectral data fused
Yuan et al. Capturing small objects and edges information for cross-sensor and cross-region land cover semantic segmentation in arid areas
CN117011713A (en) Method for extracting field information based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190705)