CN109977970A - Person recognition method for complex hydraulic-engineering scenes based on image saliency detection - Google Patents
Person recognition method for complex hydraulic-engineering scenes based on image saliency detection
- Publication number
- CN109977970A (application CN201910240747.7A)
- Authority
- CN
- China
- Prior art keywords
- hydraulic engineering
- person recognition
- saliency
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a person recognition method for complex hydraulic-engineering scenes based on image saliency detection. Addressing the loss of regional semantic information that occurs in some saliency (person) detection models, a new strongly supervised saliency detection method is proposed. The model has two layers. The first layer mainly uses a multi-level fully convolutional neural network to capture, at the pixel level, the global semantic information and local feature information of salient persons, and marks out a coarse salient person. The second layer uses short connections to bring in the shallow-level saliency features generated during the first layer's operation and fuses them with the coarse saliency map, so as to recover the lost feature information and reinforce the boundary features of salient objects. For data input, a highlighted version and a de-highlighted version of the original image are chosen as simultaneous inputs. The designed model performs excellently on a dataset of person images from complex hydraulic-engineering scenes collected at random from the web.
Description
Technical field
The present invention relates to the field of computer vision, and in particular to saliency detection of persons in complex hydraulic-engineering scenes.
Background technique
Against the historical background of increasingly important flood-control informatization, during weather events such as typhoons and spring tides, preventing people from illegally entering hydraulic-engineering facilities (seawalls, reservoirs, dykes, etc.) and discovering such intrusions in time is an especially urgent supervision task. Manual inspection of video monitoring is widely deployed, but its efficiency and cost leave room for improvement. If an automatic detection approach lets machines assist the staff, person-oriented object detection in complex hydraulic-engineering scenes becomes very important, and this demand can be met with person saliency detection techniques. Although some techniques on the market can detect persons in certain scenes, they have the following shortcomings: (1) the scene cannot be too complex; once a scene contains many complex elements, large non-salient objects with high contrast easily cause detection to fail; (2) the contours of detected persons are unclear and sometimes very blurred, losing part of the global semantic information; (3) existing saliency detection models cannot effectively exclude elements of the complex background such as glints on the water surface or high-contrast hills; (4) existing saliency detection models cannot effectively remove or identify low-contrast impurity elements close to the person in images shot in hydraulic-engineering scenes; (5) when existing saliency detection models face real conditions, their results deviate from reality.
Observation of a large number of person images from complex hydraulic-engineering scenes shows that these images fall into three classes: (1) small objects, i.e. the salient person occupies less than 10% of the full image area; such images account for 80% of all detection images; (2) large objects, i.e. the person occupies more than 50% of the full image; such images are few; (3) complex backgrounds, i.e. the shot contains not only the person subject but also salient objects such as riverside dykes, distant hills and riverbank junctions. How to detect persons in images, and in particular the difficult problem of small-object detection, is a problem in urgent need of a solution.
Summary of the invention
In order to overcome the above deficiencies of the prior art, the present invention provides a person recognition method for complex hydraulic-engineering scenes based on image saliency detection. Addressing the inability of most current person detection methods to effectively fuse global semantic information with local feature information, and their inability to effectively detect noise present in images, a deep semantic-information fusion model based on multi-level short connections is proposed. It can effectively use local feature information to reduce salient-object detection failures caused by the loss of global semantic information during detection, while enhancing the marking of salient objects and effectively removing noise belonging to non-salient objects and large non-salient objects.
The technical solution adopted by the invention is: acquire the global semantic information and local feature information of the image, and use a multi-level short-connection model to deeply fuse the two kinds of information so that they complement each other, reducing information loss during the detection process.
Compared with the prior art, the beneficial effect of the invention is that combining global semantic information with local feature information reduces the loss of global semantic information during saliency detection, which would otherwise degrade the final saliency result map.
Detailed description of the invention
Fig. 1 is the neural network architecture used by the invention;
Fig. 2 is a schematic flow diagram of the pixel-level network;
Fig. 3 is an auxiliary figure giving the pixel-level network layer specifications;
Fig. 4 is a schematic comparison of the original image, the de-highlighted image and the highlighted image.
Specific embodiment
The present invention will be further explained below with reference to the attached drawings.
The model of the invention is based on the Caffe deep learning framework.
The first step is to design an end-to-end multi-level neural network that maps the input image directly to the required pixel-level saliency detection map. With that in mind: (1) the model first produces multi-level saliency maps to capture the global semantics or local features at different levels; (2) the model needs enough depth to reach the detailed information of the image and the hidden contextual contrast information. As shown in Fig. 1, a pixel-level model is designed. The initial image (256 × 256 pixels) undergoes the highlight and de-highlight operations and is then fed into a deep fully convolutional neural network, which generates a result map at each of several convolutional layers. The sizes of these convolutions and the depth of the network follow the pixel-level stream of DCL (Deep Contrast Learning), and VGG-16 is selected as the backbone of the model, as shown in Fig. 2.
Before input, the image also needs the highlight and highlight-removal operations. In order to capture the complementary semantic information in the image, the original RGB image X is converted into a highlighted image X_O and a de-highlighted image X̄_O, where M is the pixel mean of the image dataset and K is a hyper-parameter, commonly set to 1. The converted images X_O and X̄_O are mutually complementary, and both serve as the two input sources of the model. The concrete effect is shown in Fig. 4.
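A minimal Python sketch of this preprocessing step follows. The exact conversion formula is not reproduced in this text, so the symmetric reflection about the dataset mean used below is only one plausible reading of the complementary pair and should be treated as an assumption.

```python
import numpy as np

def highlight_pair(x: np.ndarray, m: float, k: float = 1.0):
    """Produce the highlighted image X_O and the complementary de-highlighted
    image X_O_bar from an original image x. ASSUMPTION: the text names x,
    m (dataset pixel mean) and k (hyper-parameter, commonly 1) but does not
    reproduce the formula; the symmetric reflection about the dataset mean
    used here is purely illustrative (X_O + X_O_bar = 2*m before clipping).
    """
    x = x.astype(np.float32)
    x_hi = np.clip(m + k * (x - m), 0.0, 255.0)  # highlighted image X_O
    x_lo = np.clip(m - k * (x - m), 0.0, 255.0)  # de-highlighted image X_O_bar
    return x_hi, x_lo

# Both converted images serve as the model's two input sources.
img = np.random.randint(0, 256, (256, 256, 3)).astype(np.uint8)  # stand-in picture
x_o, x_o_bar = highlight_pair(img, m=117.0)  # 117 is an assumed dataset mean
```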
The pixel-level model mainly has 5 max-pooling layers and 10 whole convolutional blocks (each whole block contains 2 to 4 independent convolutional layers of unequal size). Data are passed downward from the first convolutional block to the fifth, and during this convolutional transmission each block extracts the features recognizable at its level. The first convolutional block focuses on local feature information; at this stage it can eliminate locally insignificant noise and impurities while retaining the boundaries of salient objects well, so the image remains at a high pixel fidelity. After the first block, the pooling layer shrinks the transmitted data while also extracting the relatively salient information around each pixel. Every subsequent block then extracts new information from the data passed down from the block above, and because of the pooling layers the cross-sectional dimensions of the transmitted data shrink level by level. The fifth block extracts the global information of the whole image, with emphasis on capturing the positions of salient objects while ignoring more of the local feature values. After all down-sampling operations are complete, the images generated by the individual convolutional blocks must be up-sampled (i.e. deconvolved) to the same size so that they can be fused later; in this example the saliency maps generated at all levels are up-sampled to 225 × 225 pixels. The specification of each convolutional layer is shown in Fig. 3. The series of processes described above can be summarized with the following function:
f_s(X; W, b) = Pooling(σ(W *_s X + b))
In the above formula, X is the original input image; W and b represent the convolution kernel and the convolution bias respectively; *_s denotes the convolution operation with stride s; σ is the rectified linear unit (Rectified Linear Unit, ReLU); Pooling denotes the pooling operation, here specifically max pooling. The result f_s(X; W, b) is therefore the down-sampling of the original data carried out according to the parameter s.
In the corresponding up-sampling formula, X still represents the original input image; f_s(X; θ) represents the feature map generated under the stride s and the parameters θ; and f̂_s(·; θ̂) represents the feature map generated by deconvolution with up-sampling stride s and parameters θ̂, guaranteed to have the same size as X. Unlike a conventional bilinear interpolation, however, this up-sampling operation participates in the supervised learning process and is continuously refined over the iterations.
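To make the down-sampling/up-sampling pair concrete, here is a minimal PyTorch sketch of one such stage. The patent itself implements the network in Caffe on a VGG-16 backbone; the channel counts and kernel sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Stage(nn.Module):
    """One down-sampling stage, f_s(X; W, b) = Pooling(sigma(W *_s X + b))."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)  # W, b
        self.relu = nn.ReLU(inplace=True)                               # sigma
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)               # max pooling

    def forward(self, x):
        return self.pool(self.relu(self.conv(x)))

# Learned up-sampling back toward the input size: unlike a fixed bilinear
# interpolation, these deconvolution weights take part in supervised training.
upsample = nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1)

x = torch.randn(1, 3, 256, 256)         # highlighted / de-highlighted input
feat = Stage(3, 64)(x)                  # -> 1 x 64 x 128 x 128 feature map
coarse = torch.sigmoid(upsample(feat))  # -> 1 x 1 x 256 x 256 coarse saliency map
```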
As described above, the saliency map generated at the pool1 level has clear salient-object contours but at the same time carries more impurities and noise, while the maps generated by the convolutional network after pool5 and beyond capture the global saliency information yet may lose part of it in some images. Therefore, in order to better integrate the different features of the multiple levels, and also to compensate for the losses in the pixel-level saliency maps, the result maps generated at all levels are summed and averaged into a single saliency detection result map, i.e. the FUSE operation in Fig. 2. This operation is described by the formula

S_fuse = (1/N) Σ_{i=1}^{N} S_i

where N is the number of saliency maps obtained from the different convolution-pooling levels of the first layer, S_i is the saliency map obtained from the i-th convolution-pooling level, and S_fuse is the fused saliency map.
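The FUSE step is a plain average of the up-sampled level maps; a minimal Python sketch, assuming the maps are already aligned to a common size:

```python
import torch

def fuse(maps):
    """Average fusion (FUSE): S_fuse = (1/N) * sum_i S_i, where `maps` is a
    list of N level-wise saliency maps up-sampled to a common size."""
    return torch.stack(maps, dim=0).mean(dim=0)
```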
The fused saliency map has clearer boundaries than the saliency map S_5 obtained by up-sampling after the fifth pooling level, but it also contains more noise. It therefore enters the second layer, which removes noise and reinforces the salient objects. Observation shows that the saliency map S_1 up-sampled after the first pooling level has clear boundaries close to the pixel level of the original image. The natural idea is then to let S_fuse extract the clear boundaries of S_1 to remedy the blurred-boundary problem of S_fuse. Therefore, using an end connection, S_1 and S_fuse are stacked longitudinally, and the stacked image then undergoes three max-pooling operations and three convolution operations. The purpose of this processing is to remove the unnecessary impurities in S_1 through pooling and to reinforce the marking of the salient objects; because the degree of the three poolings is small, the boundary clarity of S_1 is not affected. Each of the three poolings has stride 2 and a pooling window of 2 × 2 pixels. The three operations respectively generate the saliency maps S_fuse11, S_fuse12 and S_fuse13. The same additive averaging as above is then applied to the three maps, giving the saliency map S_fuse2. The iterative operation continues on S_fuse2, as shown in Fig. 1, and finally the saliency result map is obtained.
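A sketch of this second-layer refinement under the same assumptions: PyTorch, illustrative layer widths, and bilinear resizing of the intermediate maps back to a common 225 × 225 size before averaging, a detail the text does not specify.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)   # mild 2x2, stride-2 pooling
convs = nn.ModuleList(
    nn.Conv2d(2 if i == 0 else 1, 1, kernel_size=3, padding=1) for i in range(3)
)
up = nn.Upsample(size=(225, 225), mode='bilinear', align_corners=False)

def refine(s1: torch.Tensor, s_fuse: torch.Tensor) -> torch.Tensor:
    """Stack S_1 and S_fuse along the channel axis, apply three pool+conv
    steps, and average the three intermediate maps into S_fuse2."""
    x = torch.cat([s1, s_fuse], dim=1)       # longitudinal stack: 2 channels
    maps = []
    for conv in convs:
        x = conv(pool(x))                    # one pooling and one convolution
        maps.append(up(x))                   # S_fuse11 / S_fuse12 / S_fuse13
    return torch.stack(maps).mean(dim=0)     # additive average: S_fuse2

s_fuse2 = refine(torch.rand(1, 1, 225, 225), torch.rand(1, 1, 225, 225))
```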
Finally, the saliency map is supervised in the forward pass with a cross-entropy loss function (Cross Entropy Function), whose weights involve the original image and the saliency map. Expressed mathematically:

L(W) = − β_i Σ_{i ∈ |I|+} log Pr(y_i = 1 | X; W) − (1 − β_i) Σ_{i ∈ |I|−} log Pr(y_i = 0 | X; W)

where G is the ground-truth map (Ground Truth, GT); W represents the set of network parameters; β_i is the weight-balancing parameter; |I| represents the set of all pixels in the image; |I|− is the set of non-salient pixels; |I|+ is the set of salient pixels; and β_i = |I|− / |I|.
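A minimal Python sketch of this balanced loss, under the assumption that β_i = |I|− / |I| as in the cited deeply supervised salient object detection work:

```python
import torch

def balanced_bce(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Class-balanced cross-entropy following the definitions above:
    beta = |I-| / |I| (share of non-salient pixels), so the rarer salient
    pixels receive the larger weight. pred holds probabilities in (0, 1),
    gt is the binary ground-truth map G."""
    beta = (gt < 0.5).float().mean()                                  # beta_i = |I-| / |I|
    pos = -(beta * gt * torch.log(pred + eps)).sum()                  # salient pixels |I|+
    neg = -((1 - beta) * (1 - gt) * torch.log(1 - pred + eps)).sum()  # non-salient |I|-
    return pos + neg
```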
The model is trained on the collected person image dataset, with images uniformly resized to 225 × 225 pixels, a batch size of 1, and stochastic gradient descent with a learning rate of 1e-8. The 200,000 training iterations take more than 24 hours. The method is implemented in Python on top of the Caffe framework. The GPU used is a Tesla M40 (12 GB).
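A self-contained sketch of this training configuration, shown with PyTorch, a one-layer placeholder model and random data rather than the authors' Caffe setup; the standard binary cross-entropy stands in for the balanced loss above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder one-layer model; the real model is the two-layer fusion
# network described above, implemented by the authors in Caffe.
model = nn.Sequential(nn.Conv2d(3, 1, kernel_size=3, padding=1), nn.Sigmoid())
optimizer = torch.optim.SGD(model.parameters(), lr=1e-8)   # reported learning rate

for step in range(200_000):                                # reported iteration count
    img = torch.rand(1, 3, 225, 225)                       # batch size 1, 225x225 input
    gt = (torch.rand(1, 1, 225, 225) > 0.9).float()        # stand-in ground-truth map
    loss = F.binary_cross_entropy(model(img), gt)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```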
Through the repeated use of the fusion function realized above with pooling and convolution operations, the local feature information of the input person image is gradually merged on the basis of the global semantic information. In this process, the fusion of local feature information and global semantic information is not accomplished in one step; instead, it gradually supplements the semantic information lost during pooling and convolution, such as the global semantic body-frame information of the person or other defects of the person subject. The model performs well on six existing SOD datasets (DUT-OMRON, ECSSD, HKU-IS, PASCAL-S, SED1, SED2).
Claims (5)
1. A person recognition method for complex hydraulic-engineering scenes based on image saliency detection, characterized in that a multi-level fully convolutional neural network performs salient-object detection on the image, and that during detection, specifically for the small-object case, global semantic features are acquired and local information is acquired as a supplement.
2. The person recognition method for complex hydraulic-engineering scenes based on image saliency detection according to claim 1, characterized in that: the acquired global semantic feature, i.e. the main position of the small-object person, serves as the basic information for judging the position of the salient object, and short connections are used so that the local information output by the shallow convolutional layers, i.e. the action details of the person, supplements the global semantics as supplemental information.
3. The person recognition method for complex hydraulic-engineering scenes based on image saliency detection according to claim 1, characterized in that: the model fuses the local information and the global semantic features using short connections.
4. The person recognition method for complex hydraulic-engineering scenes based on image saliency detection according to claim 2 or 3, characterized in that: the method retains highly sensitive detectability when the salient object in the image is a small-object person with a low area share.
5. The person recognition method for complex hydraulic-engineering scenes based on image saliency detection according to claim 2 or 3, characterized in that: the method has a high-intensity filtering capacity for the complex backgrounds present in the scene.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910240747.7A CN109977970A (en) | 2019-03-27 | 2019-03-27 | Person recognition method for complex hydraulic-engineering scenes based on image saliency detection |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910240747.7A CN109977970A (en) | 2019-03-27 | 2019-03-27 | Person recognition method for complex hydraulic-engineering scenes based on image saliency detection |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN109977970A true CN109977970A (en) | 2019-07-05 |
Family
ID=67081032
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910240747.7A Pending CN109977970A (en) | 2019-03-27 | 2019-03-27 | Person recognition method for complex hydraulic-engineering scenes based on image saliency detection |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109977970A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112223288A (en) * | 2020-10-09 | 2021-01-15 | 南开大学 | Visual fusion service robot control method |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106447658A (en) * | 2016-09-26 | 2017-02-22 | 西北工业大学 | Significant target detection method based on FCN (fully convolutional network) and CNN (convolutional neural network) |
| CN108509880A (en) * | 2018-03-21 | 2018-09-07 | 南京邮电大学 | A kind of video personage behavior method for recognizing semantics |
- 2019-03-27: Application CN201910240747.7A filed in China; publication CN109977970A (en), status Pending.
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106447658A (en) * | 2016-09-26 | 2017-02-22 | 西北工业大学 | Significant target detection method based on FCN (fully convolutional network) and CNN (convolutional neural network) |
| CN108509880A (en) * | 2018-03-21 | 2018-09-07 | 南京邮电大学 | A kind of video personage behavior method for recognizing semantics |
Non-Patent Citations (1)
| Title |
|---|
| QIBIN HOU et al.: "Deeply Supervised Salient Object Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112223288A (en) * | 2020-10-09 | 2021-01-15 | 南开大学 | Visual fusion service robot control method |
| CN112223288B (en) * | 2020-10-09 | 2021-09-14 | 南开大学 | Visual fusion service robot control method |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN116052016B (en) | Fine segmentation and detection method of clouds and cloud shadows in remote sensing images based on deep learning | |
| CN108171698B (en) | Method for automatically detecting human heart coronary calcified plaque | |
| CN111275696B (en) | Medical image processing method, image processing method and device | |
| CN111862143B (en) | Automatic monitoring method for river dike collapse | |
| CN109086824A (en) | A kind of sediment sonar image classification method based on convolutional neural networks | |
| CN109816012A (en) | A multi-scale object detection method fused with context information | |
| CN110232394A (en) | A kind of multi-scale image semantic segmentation method | |
| CN110349087B (en) | RGB-D image high-quality grid generation method based on adaptive convolution | |
| CN114663439A (en) | Remote sensing image land and sea segmentation method | |
| CN109886221A (en) | Sand dredger recognition methods based on saliency detection | |
| Pham et al. | A new deep learning approach based on bilateral semantic segmentation models for sustainable estuarine wetland ecosystem management | |
| CN112288776B (en) | A Target Tracking Method Based on Multi-Time Step Pyramid Codec | |
| CN113312993B (en) | A PSPNet-based Land Cover Classification Method for Remote Sensing Data | |
| CN111476723B (en) | Remote sensing image lost pixel recovery method for failure of Landsat-7 scanning line corrector | |
| CN112785629A (en) | Aurora motion characterization method based on unsupervised deep optical flow network | |
| CN114943893B (en) | Feature enhancement method for land coverage classification | |
| CN118072001B (en) | Camouflaged target detection method based on scale feature perception and wide-range perception convolution | |
| Ji et al. | Domain adaptive and interactive differential attention network for remote sensing image change detection | |
| CN109741340A (en) | Refinement method of ice layer segmentation in ice sheet radar image based on FCN-ASPP network | |
| CN111696033A (en) | Real image super-resolution model and method for learning cascaded hourglass network structure based on angular point guide | |
| CN117726954B (en) | A method and system for segmenting land and sea in remote sensing images | |
| CN114943894A (en) | ConvCRF-based high-resolution remote sensing image building extraction optimization method | |
| CN112906645B (en) | Sea ice target extraction method with SAR data and multispectral data fused | |
| Yuan et al. | Capturing small objects and edges information for cross-sensor and cross-region land cover semantic segmentation in arid areas | |
| CN117011713A (en) | Method for extracting field information based on convolutional neural network |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190705 |