
CN106874868A - Face detection method and system based on a three-level convolutional neural network - Google Patents

Info

Publication number
CN106874868A
Authority
CN
China
Prior art keywords
face
network
feature vector
training
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710078431.3A
Other languages
Chinese (zh)
Other versions
CN106874868B (en
Inventor
王鲁许
白洪亮
董远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Faceall Co
Original Assignee
Beijing Faceall Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Faceall Co
Priority to CN201710078431.3A
Publication of CN106874868A
Application granted
Publication of CN106874868B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face detection method and system based on a three-level convolutional neural network. The method has the following advantages. During training, the training results of the preceding n levels are added to the input of the next level, which compensates for missing training data, improves the accuracy and recall of face detection, and improves the performance of the overall network. Facial landmarks are added to the training samples; they improve both face classification and the localization accuracy of the face rectangle, so that the network approaches its upper performance limit and the recall and accuracy of face detection improve further. Only the classification offset within the computed first (second) offset is used for the regression correction of picture classification, which ensures that correctly classified parts are not corrected again, so that face detection is faster and network performance is exploited further. The system has the same beneficial effects as the detection method.

Description

Face detection method and system based on a three-level convolutional neural network
Technical field
The present invention relates to the field of face detection technology, and in particular to a face detection method and system based on a three-level convolutional neural network.
Background
Since the start of the 21st century, computer technology has flourished and been widely applied in many fields. With its development, face detection technology emerged and has been continuously iterated and updated. Face detection means searching any given set of images with some strategy to determine which images contain faces.
Face detection is a key link in an automatic face recognition system. Early face recognition research mainly addressed face images under strong constraints (for example, images without background) and often assumed that the face position was known or easy to obtain, so the face detection problem received little attention.
With the development of applications such as e-commerce, face recognition has become the most promising means of biometric identity verification. This application background requires an automatic face recognition system to have some ability to recognize general images, and the resulting series of problems led researchers to treat face detection as an independent problem. Today the applications of face detection extend far beyond face recognition systems; it has important value in content-based retrieval, digital video processing, video detection, face modeling and face tracking.
The search strategies commonly used in face detection include decision trees, logistic regression, naive Bayes and three-level convolutional neural networks. Among them, face detection methods and systems based on three-level convolutional neural networks offer fast detection and high recognition accuracy, and are being iterated and updated rapidly. A prior-art face detection method based on a three-level convolutional neural network: 1) trains networks of progressively stronger performance level by level, passing the candidate boxes that the previous level judges to be faces to the next level as training samples; 2) makes decisions at each level through the face classification network and the face bounding-box regression network; 3) if the classification is correct, directly back-propagates all of the corrected data.
The shortcomings of the prior art are as follows. Because the earlier-level network performs poorly, some faces cannot be judged correctly, so face candidate boxes are lost before reaching the next level and overall performance is poor. Face classification and face bounding-box regression alone cannot reach the upper limit of network performance, so there is still room for improvement. All data is back-propagated, the depth of network learning is insufficient, and network performance cannot be fully exploited.
Summary of the invention
The object of the present invention is to provide a face detection method and system based on a three-level convolutional neural network, to solve the following problems: overall performance is poor; correction by face classification and face bounding box alone cannot reach the upper limit of network performance; and correctly classified parts still undergo regression correction.
To achieve this object, the present invention provides the following technical solution:
A face detection method based on a three-level convolutional neural network, comprising the following steps:
Obtaining a training sample and a detection picture; the training sample includes at least face pictures annotated with a face bounding box and facial landmarks;
Inputting the training sample into the three-level convolutional neural network and training it level by level, where the training process is:
Performing prediction followed by dimensionality reduction on the training sample and the training results of the preceding n levels to obtain the corresponding two-dimensional feature vector, and computing the first offset from it;
Performing regression correction on the two-dimensional feature vector using the first offset to obtain the corresponding training result;
Inputting the detection picture into the trained three-level convolutional neural network and performing face detection level by level to obtain the face rectangle.
In the above face detection method based on a three-level convolutional neural network, the face pictures in the training sample also carry a picture classification label and a uniquely determined face bounding box.
In the above face detection method based on a three-level convolutional neural network, obtaining the two-dimensional feature vector comprises the following steps:
Obtaining an m-dimensional feature vector from the training sample and the training results of the preceding n levels;
Reducing the dimensionality of the m-dimensional feature vector through a fully convolutional layer/fully connected layer to obtain the two-dimensional feature vector.
In the above face detection method based on a three-level convolutional neural network, the third-level network includes a first branch, a second branch and a third branch; the second-level network includes the first branch and the second branch; and the first branch is identical to the first-level network.
In the above face detection method based on a three-level convolutional neural network, in the third-level network, obtaining the m-dimensional feature vector comprises the following steps:
Inputting the training sample and the training result of the previous level into the first branch to obtain the first feature vector, into the second branch to obtain the second feature vector, and into the third branch to obtain the third feature vector;
Concatenating the first feature vector, the second feature vector and the third feature vector to obtain the m-dimensional feature vector.
In the above face detection method based on a three-level convolutional neural network, obtaining the first offset comprises the following steps:
Inputting the two-dimensional feature vector into the SoftmaxWithLoss layer and computing the classification offset;
Inputting the two-dimensional feature vector into the Euclidean Loss layer and computing the face bounding-box offset and the facial landmark offset.
In the above face detection method based on a three-level convolutional neural network, computing the classification offset comprises the following steps:
Defining the two-dimensional feature vector as Z = {z1, z2}, where [formula missing in source];
Classifying with the softmax function, which for two classes specializes to: [formula missing in source]
Computing, with a loss function, the difference between the predicted two-dimensional feature vector and the training sample;
The loss function is: [formula missing in source]
where [formula missing in source]; compute [formula missing in source];
The correction is: [formula missing in source], where α is a coefficient.
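The formulas for these steps did not survive in this text. A standard formulation that fits the surrounding description (a two-class softmax, a loss measuring the gap between prediction and label, and a correction step scaled by the coefficient α) might look as follows. The input vector x, the one-hot label y and the cross-entropy form are assumptions made for illustration; only Z = {z1, z2}, W, b and α appear in the description itself.

```latex
\begin{aligned}
z_i &= W_i^{\top} x + b_i, \qquad i \in \{1, 2\} \\
p_i &= \frac{e^{z_i}}{e^{z_1} + e^{z_2}} \\
L &= -\sum_{i=1}^{2} y_i \log p_i \\
\frac{\partial L}{\partial z_i} &= p_i - y_i \\
W &\leftarrow W - \alpha \frac{\partial L}{\partial W}, \qquad
b \leftarrow b - \alpha \frac{\partial L}{\partial b}
\end{aligned}
```

Under these assumptions the classification offset corresponds to the gradient p_i - y_i, which is zero when the prediction already matches the label.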
In the above face detection method based on a three-level convolutional neural network, obtaining the face rectangle comprises the following steps:
Inputting the detection picture into the first-level network, which screens, regression-corrects and merges it to obtain the first face candidate boxes;
Inputting the first face candidate boxes into the second-level network, which screens, regression-corrects and merges them to obtain the second face candidate boxes;
Inputting the second face candidate boxes into the third-level network, which screens, regression-corrects and merges them to obtain the face rectangle.
In the above face detection method based on a three-level convolutional neural network, the screening, regression correction and merging comprise the following steps:
From the detection picture/first face candidate boxes/second face candidate boxes and the corresponding face probabilities, selecting the face candidate boxes whose probability exceeds a set probability threshold;
Computing the second offset from the face candidate boxes remaining after screening, and performing regression correction on them using the second offset;
Merging the corrected face candidate boxes with a non-maximum suppression algorithm to obtain the first face candidate boxes/second face candidate boxes/face rectangle.
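The screening and merging steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the (x1, y1, x2, y2) box layout and both default thresholds are assumptions, and the regression correction that sits between screening and merging is omitted.

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union of one box with an array of boxes (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def screen_and_merge(boxes, probs, prob_threshold=0.5, iou_threshold=0.5):
    """Keep boxes above the probability threshold, then merge with greedy NMS."""
    keep = probs > prob_threshold              # screening step
    boxes, probs = boxes[keep], probs[keep]
    order = np.argsort(-probs)                 # highest face probability first
    selected = []
    while order.size > 0:
        best = order[0]
        selected.append(best)
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]  # drop boxes overlapping the winner
    selected = np.array(selected, dtype=int)
    return boxes[selected], probs[selected]
```

Greedy suppression keeps the most probable box in each overlapping cluster, which matches the idea of merging identical or similar candidate boxes into one.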
The face detection method based on a three-level convolutional neural network provided by the present invention has the following beneficial effects:
1) During training, the training results of the preceding n levels are added to the input of the next level, which compensates for missing training data, thereby improving the accuracy and recall of face detection and the performance of the overall network;
2) Facial landmarks are added to the training sample; they improve face classification and the localization accuracy of the face rectangle, so that the network approaches its upper performance limit and the recall and accuracy of face detection improve further;
3) Only the classification offset within the computed first (second) offset is used for the regression correction of picture classification, which ensures that correctly classified parts no longer undergo regression correction, so that face detection is faster and network performance is exploited further.
A face detection system based on a three-level convolutional neural network, including a three-level convolutional neural network, which includes:
An acquiring unit, for obtaining a training sample and a detection picture; the training sample includes at least face pictures annotated with facial landmarks;
A network training unit, for inputting the training sample into the three-level convolutional neural network and training it level by level;
It includes a feature vector module and a regression correction module.
The feature vector module performs prediction followed by dimensionality reduction on the training sample and the training results of the preceding n levels, obtains the corresponding two-dimensional feature vector, and computes the first offset from it;
The regression correction module performs regression correction on the two-dimensional feature vector using the first offset and obtains the corresponding training result;
A face detection unit, for inputting the detection picture into the trained three-level convolutional neural network and performing face detection level by level to obtain the face rectangle.
The face detection system based on a three-level convolutional neural network provided by the present invention has the following beneficial effects:
1) The second-level and third-level networks in network training unit 2 (or face detection unit 3) in turn compensate for the poor performance of the first-level network, improving the accuracy of picture classification and thereby the recall and accuracy of face detection and the performance of the overall network;
2) Facial landmarks are added to the face pictures in the training sample of acquiring unit 1; they improve face classification and the localization accuracy of the face rectangle, so that the network approaches its upper performance limit and the recall and accuracy of face detection improve further;
3) Only the classification offset obtained through the cooperation of feature vector module 21 and regression correction module 22 is used for the regression correction of picture classification, which ensures that correctly classified parts need no correction, so that face detection is faster and network performance is exploited further.
Brief description of the drawings
To illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a structural block diagram of the face detection method based on a three-level convolutional neural network provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the face detection method based on a three-level convolutional neural network provided by one embodiment of the present invention;
Fig. 3 is a flow diagram of the face detection method based on a three-level convolutional neural network provided by one embodiment of the present invention;
Fig. 4 is a flow diagram of the face detection method based on a three-level convolutional neural network provided by one embodiment of the present invention;
Fig. 5 is a flow diagram of the face detection method based on a three-level convolutional neural network provided by one embodiment of the present invention;
Fig. 6 is a flow diagram of the face detection method based on a three-level convolutional neural network provided by one embodiment of the present invention;
Fig. 7 is a flow diagram of the face detection method based on a three-level convolutional neural network provided by one embodiment of the present invention;
Fig. 8 is a structural diagram of the face detection system based on a three-level convolutional neural network provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of the first-level network provided by one embodiment of the present invention;
Fig. 10 is a structural diagram of the second-level network provided by one embodiment of the present invention;
Fig. 11 is a structural diagram of the third-level network provided by one embodiment of the present invention.
Description of reference numerals:
1. acquiring unit; 2. network training unit; 21. feature vector module; 22. regression correction module; 3. face detection unit.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solution, the present invention is described in further detail below with reference to the drawings.
As shown in Figs. 1-7 and 9-11, the face detection method based on a three-level convolutional neural network provided by an embodiment of the present invention comprises the following steps:
S101, obtaining a training sample and a detection picture; the training sample includes at least face pictures annotated with a face bounding box and facial landmarks;
As shown in Figs. 9-11, the three-level convolutional neural network further includes a first-level network, a second-level network and a third-level network. The third-level network includes a first branch, a second branch and a third branch; the second-level network includes the first branch and the second branch; and the first branch is identical to the first-level network, that is, the network structure of the first branch is the same as that of the first-level network. For ease of distinction, the figures denote the first-level network as 12-net, the second-level network as 24-net and the third-level network as 48-net. That is, 24-net includes a 12-net branch and a 24-net branch; 48-net includes a 12-net branch, a 24-net branch and a 48-net branch; and 12-net, 24-net and 48-net are connected level by level. In this way training samples can be selected level by level, excluding pictures without faces and obtaining accurate face pictures with correspondingly more accurate face bounding boxes (which determine the face position).
Further, the face pictures in the training sample also carry a picture classification label. Specifically, the training sample consists of face pictures that carry a classification label, a uniquely determined face bounding box and annotated facial landmark information, together with other pictures. The classification label allows picture classification training, which divides the training sample into two classes: the labeled face picture set and the other picture set. The face bounding box determines the rectangular region the face occupies in the face picture, so enclosing that region determines the face position. Facial landmarks (landmark points) are prominent parts such as the nose, eyes, mouth, forehead and facial contour, and make it easy to tell faces apart. Determining the face position from the face bounding box alone leaves some error, and facial landmarks allow the face to be located accurately: the face bounding box is enlarged or reduced so that the facial landmarks fall within it, which improves its localization accuracy. The detection pictures are a collection of face pictures, environment pictures and any other images; once training is complete, face detection can be performed on them. The training sample can be obtained by retrieving an existing face database, or by obtaining face pictures through means such as 3D printing, adding the classification label, the uniquely determined face bounding box and the annotated facial landmarks, and then mixing in other pictures.
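The idea of adjusting the face bounding box so that the facial landmarks fall inside it can be illustrated with a short Python sketch. The function name and the (x1, y1, x2, y2) box layout are hypothetical, and only the enlarging case is shown; the description also allows the box to be reduced, which this sketch does not attempt.

```python
def expand_box_to_landmarks(box, landmarks):
    """Grow an (x1, y1, x2, y2) box just enough to contain every (x, y) landmark."""
    x1, y1, x2, y2 = box
    xs = [point[0] for point in landmarks]
    ys = [point[1] for point in landmarks]
    return (
        min(x1, min(xs)),  # move the left edge out if a landmark lies beyond it
        min(y1, min(ys)),  # move the top edge out
        max(x2, max(xs)),  # move the right edge out
        max(y2, max(ys)),  # move the bottom edge out
    )
```

For example, a box (10, 10, 50, 50) with landmarks at (5, 20) and (30, 60) becomes (5, 10, 50, 60); a box that already contains all landmarks is returned unchanged.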
S102, inputting the training sample into the three-level convolutional neural network and training it level by level;
Level-by-level training means training the three-level convolutional neural network in the order first-level network, second-level network, third-level network. The three-level convolutional neural network is capable of learning: after training it has learned how to classify pictures and can enclose the corresponding position in a picture with a rectangle, and the rectangle position can even be corrected further by introducing facial landmarks. Thus, when a large number of different pictures are input, the trained three-level convolutional neural network performs face classification and localization.
In step S102, the training further comprises the following steps:
S1021, performing prediction followed by dimensionality reduction on the training sample and the training results of the preceding n levels to obtain the corresponding two-dimensional feature vector, and computing the first offset from it;
A training result is the result obtained after prediction, dimensionality reduction and regression correction by a given level of the three-level convolutional neural network. When the training sample is input to the first-level network, the training results of the preceding n levels are empty; when it is input to the second-level network, they are the training result of the first-level network; when it is input to the third-level network, they are the training results of the first-level and second-level networks. Prediction and dimensionality reduction mean that, during training, the input training sample is classified and the face position predicted, and the result is converted into a two-dimensional feature vector for ease of computation. The first offset is the difference, during training, between the two-dimensional feature vector obtained after prediction and dimensionality reduction and the training sample (mainly the differences between the predicted values and the sample's classification label, uniquely determined face bounding box and annotated facial landmarks), that is, the difference between the values after and before prediction. Preferably, it is computed with a loss function. The next-level network, which receives both the training result of the previous-level network and the training sample, compensates for the poor performance of the previous-level network, improving the accuracy of picture classification and thereby the recall and accuracy of face detection and the performance of the overall network.
In step S1021, obtaining the two-dimensional feature vector comprises the following steps:
S201, obtaining an m-dimensional feature vector from the training sample and the training results of the preceding n levels;
A prediction structure precedes the fully convolutional layer/fully connected layer, and the m-dimensional feature vector of each network is obtained by that structure's prediction. Because the structures of the first-level, second-level and third-level networks differ, and the pictures input to them for training also differ, the m-dimensional feature vectors obtained differ as well. The second-level network corrects the errors made by the first-level network's prediction; likewise, the third-level network corrects the second-level network. The correction mainly targets two situations that may occur in the prediction results of the first-level/second-level network: labeled pictures that were not classified into the face picture set, and unlabeled pictures that were classified into the face picture set. The second-level/third-level network greatly reduces the probability of these situations, giving the three-level convolutional neural network the ability to purify itself.
In the third-level network, in step S201, obtaining the m-dimensional feature vector comprises the following steps:
S301, inputting the training sample and the training result of the previous level into the first branch to obtain the first feature vector, into the second branch to obtain the second feature vector, and into the third branch to obtain the third feature vector;
S302, concatenating the first feature vector, the second feature vector and the third feature vector to obtain the m-dimensional feature vector.
The prediction structure in each level of network has a concatenation function. In the third-level network, the branches run separately and produce different feature vectors; the dimensions of these feature vectors (the first, second and third feature vectors) differ, and concatenating them gives the m-dimensional feature vector. The second-level network concatenates in the same way but has one branch fewer, so there is no third feature vector. The first-level network has only one branch, so the concatenation result is simply the result of that branch. This prepares for the conversion to a two-dimensional feature vector; representing faces in vector form makes computation more convenient. Specifically, the corresponding training data is input to each of the three branches. The first branch is identical to 12-net; before the full convolution it yields an m-dimensional (for example, 16-dimensional) feature vector. The second branch yields an n-dimensional (for example, 128-dimensional) face feature vector after the layer preceding the 24-net fully connected layer. The third branch yields a p-dimensional (for example, 256-dimensional) face feature vector after the layer preceding the 48-net fully connected layer. The three feature vectors are then concatenated. Suppose X1 is the feature vector of 12-net, X2 is the feature vector of 24-net and X3 is the feature vector of 48-net; concatenating the three vectors gives the 400-dimensional ((m+n+p)-dimensional) vector X4, which is then passed through the fully connected layer.
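The concatenation of the three branch outputs can be illustrated with NumPy. This is only a sketch of the vector arithmetic: the function name is hypothetical and the random vectors stand in for real branch features, using the example sizes 16, 128 and 256 given above.

```python
import numpy as np

def splice_features(*branch_features):
    """Concatenate per-branch feature vectors into one vector, in branch order."""
    return np.concatenate(branch_features)

rng = np.random.default_rng(0)
x1 = rng.standard_normal(16)    # stand-in for the 12-net branch feature
x2 = rng.standard_normal(128)   # stand-in for the 24-net branch feature
x3 = rng.standard_normal(256)   # stand-in for the 48-net branch feature

x4 = splice_features(x1, x2, x3)
print(x4.shape)  # (400,)
```

The same helper covers the second-level network (two branches) and the first-level network (one branch), since concatenating a single vector returns it unchanged.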
S202, reducing the dimensionality of the m-dimensional feature vector through the fully convolutional layer/fully connected layer to obtain the two-dimensional feature vector.
A prediction structure precedes the fully convolutional layer. It divides the training sample into two classes: the pictures it considers to be faces form the face picture set, and the remaining pictures form the other picture set. It also obtains the predicted face bounding box and predicted facial landmarks for the face picture set, and expresses the result as an m-dimensional feature vector. The fully convolutional layer reduces a multi-dimensional feature vector to two dimensions, so passing the m-dimensional feature vector through it yields the two-dimensional feature vector, which makes it convenient to compute the offset between the predicted values and the training sample.
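The reduction from a multi-dimensional feature vector to two dimensions can be pictured as a single linear layer with two outputs. The sketch below assumes a fully connected layer computing W x + b; the random weights, the 400-dimensional input and the function name are illustrative only and are not taken from the patent.

```python
import numpy as np

def fully_connected(x, weights, bias):
    """Linear layer: maps a feature vector to len(bias) outputs."""
    return weights @ x + bias

rng = np.random.default_rng(0)
x4 = rng.standard_normal(400)                    # stand-in for the spliced feature vector
weights = rng.standard_normal((2, 400)) * 0.01   # one row per output class
bias = np.zeros(2)

z = fully_connected(x4, weights, bias)
print(z.shape)  # (2,)
```

The two outputs play the role of the face and non-face scores that are later passed to the loss layers.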
S1022, regression correction is performed on the two-dimensional feature vector using the first offset to obtain the corresponding training result;
In each level of the network, a feedback structure follows the fully convolutional/fully connected layer, and regression correction of the predicted values is performed through this structure. Regression correction means compensating the predicted values with the first offset, correcting the offset arising from classification, the offset of the face frame and the offset of the facial feature points, so that face classification and face localization are more accurate and the finally obtained face frame is also more accurate. The part that the network has already classified correctly does not undergo regression correction of classification, which further exploits the performance of the network and preserves detection speed.
In step S1022, obtaining the first offset comprises the following steps:
S401, the two-dimensional feature vector is input into the SoftmaxWithLoss layer, and the classification offset is calculated;
After the two-dimensional feature vector is obtained, the classification offset is calculated by the SoftmaxWithLoss layer, and the calculated weight W and bias term b are fed back; that is, regression correction of the classification is performed using the classification offset, which improves the recall rate and accuracy of classification.
In step S401, calculating the classification offset comprises the following steps:
S501, the two-dimensional feature vector is defined;
It is defined as $Z = \{z_1, z_2\}$, where
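S502, classification is performed by the softmax function. For two classes, it specializes to:
$$y_1 = h_\theta(z_1) = \frac{e^{z_1}}{\sum_{j=1}^{2} e^{z_j}}, \qquad y_2 = h_\theta(z_2) = \frac{e^{z_2}}{\sum_{j=1}^{2} e^{z_j}}$$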
S503, the difference between the predicted two-dimensional feature vector and the training sample is calculated by a loss function;
The loss function is:
Where … is calculated as …
Correction: …, where α is a coefficient.
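Steps S501 to S503 can be illustrated with a small numeric sketch. The softmax follows the two-class form given in claim 7. The loss formula itself is not reproduced in this text, so the cross-entropy used here is an assumption, based on the layer name SoftmaxWithLoss, which in Caffe denotes the multinomial logistic loss of the softmax output. The function names and the convention that index 0 means "face" are hypothetical, and the correction step with coefficient α is not modelled.

```python
import math


def softmax2(z):
    """Two-class softmax: y_i = exp(z_i) / (exp(z_1) + exp(z_2))."""
    shift = max(z)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability of the true class (assumed loss form)."""
    return -math.log(probs[label])


z = [2.0, 0.0]                   # two-dimensional feature vector Z = {z1, z2}
y = softmax2(z)                  # class probabilities y1, y2
loss_if_face = cross_entropy(y, 0)      # sample labelled as class 0
loss_if_not_face = cross_entropy(y, 1)  # sample labelled as class 1

print(y, loss_if_face, loss_if_not_face)
```

The loss is small when the predicted probability of the labelled class is high and grows as the prediction moves away from the label, which is the "difference between the prediction and the training sample" that step S503 describes.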
S402, the two-dimensional feature vector is input into the Euclidean Loss layer, and the face frame offset and the facial feature point offset are calculated.
In each level of the network, regression correction of the face frame offset and the facial feature point offset is performed by combining the Euclidean distance with the loss function, thereby correcting the finally obtained face rectangular frame and further improving the face recognition rate while preserving face recognition speed.
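A Euclidean loss over frame coordinates can be sketched as below. The form used, the sum of squared differences divided by 2N, is the definition of Caffe's EuclideanLoss layer; that the patent's "Euclidean Loss layer" refers to it is an assumption. The function name, the (x1, y1, x2, y2) coordinate layout and the sample values are hypothetical.

```python
def euclidean_loss(predictions, targets):
    """Sum of squared differences over all samples, divided by 2N."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    n = len(predictions)
    total = 0.0
    for pred, target in zip(predictions, targets):
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / (2.0 * n)


# One predicted face frame and its annotated frame, as normalized
# (x1, y1, x2, y2) coordinates (hypothetical values).
predicted_frames = [[0.1, 0.2, 0.9, 0.8]]
annotated_frames = [[0.0, 0.2, 1.0, 0.8]]

frame_loss = euclidean_loss(predicted_frames, annotated_frames)
print(frame_loss)
```

The same function applies to facial feature points by passing their coordinates instead of frame corners, since both offsets are plain coordinate regressions.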
S103, the detection picture is input into the three-level convolutional neural network for level-by-level face detection to obtain the face rectangular frame.
A detection result is the collective term for the classification of the input detection picture by each level of the three-level convolutional neural network together with the detected face positions and facial feature points; it is the face candidate frame obtained by each network. Corresponding to the three networks, there are three detection results; each is screened, regression-corrected and merged before being input into the next level for detection, and the face rectangular frame is finally obtained. The face rectangular frame is obtained by first screening with a specific procedure, then correcting with the combination of the facial feature point offset and the face frame offset, and then merging identical or similar face frames; the resulting rectangular frame determines information such as the face position.
In step S103, obtaining the face rectangular frame comprises the following steps:
S601, the detection picture is input into the first-level network, then screened, regression-corrected and merged to obtain the first face candidate frame;
S602, the first face candidate frame is input into the second-level network, then screened, regression-corrected and merged to obtain the second face candidate frame;
S603, the second face candidate frame is input into the third-level network, then screened, regression-corrected and merged to obtain the face rectangular frame.
The first-level network detects the first face candidate frame, the second-level network detects the second face candidate frame, and the third-level network detects the face rectangular frame (these three correspond to the three detection results in step S103). The first two detection results are screened, regression-corrected and merged to yield the second face candidate frame and the final face rectangular frame respectively. Further, after the first face candidate frame is obtained, it is cropped from the original image, resized to 24*24 px and input into the second network for detection; after the second face candidate frame is obtained, it is cropped from the original image, resized to 48*48 px and input into the third network for detection; after detection, screening, regression correction and merging are performed again to obtain the face rectangular frame. Level-by-level detection yields an accurate face rectangular frame (face position), which further improves the recall rate and accuracy of detection.
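The cropping and resizing between levels can be sketched as follows. This is a minimal illustration only: the nearest-neighbour resize, the grayscale list-of-rows image, the (x1, y1, x2, y2) frame layout with exclusive upper bounds, and all function names are assumptions; the text specifies only the 24*24 px and 48*48 px target sizes.

```python
def crop(image, frame):
    """Cut the region (x1, y1, x2, y2) out of a list-of-rows image."""
    x1, y1, x2, y2 = frame
    return [row[x1:x2] for row in image[y1:y2]]


def resize_nearest(patch, size):
    """Resize a patch to size x size pixels by nearest-neighbour sampling."""
    h, w = len(patch), len(patch[0])
    return [
        [patch[(r * h) // size][(c * w) // size] for c in range(size)]
        for r in range(size)
    ]


def prepare_inputs(image, frames, size):
    """Crop every candidate frame from the original image and resize it."""
    return [resize_nearest(crop(image, f), size) for f in frames]


# Synthetic 80 x 100 grayscale image and two hypothetical candidate frames.
image = [[(r * 100 + c) % 256 for c in range(100)] for r in range(80)]
candidate_frames = [(10, 5, 40, 35), (50, 20, 98, 68)]

inputs_24 = prepare_inputs(image, candidate_frames, 24)  # for the second network
inputs_48 = prepare_inputs(image, candidate_frames, 48)  # for the third network

print(len(inputs_24), len(inputs_24[0]), len(inputs_48[0]))
```

Both sizes are cropped from the original image rather than from the previous level's resized patch, matching the description that candidates are "cropped from the original image" at each level.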
In step S103, the screening, regression correction and merging comprise the following steps:
S701, according to the detection picture / first face candidate frame / second face candidate frame and the corresponding face probability, the face candidate frames whose probability exceeds a set probability threshold are selected;
S702, the second offset is calculated from the face candidate frames obtained after screening, and regression correction is performed on them using the second offset;
S703, the corrected face candidate frames are merged by the non-maximum suppression algorithm to obtain the first face candidate frame / second face candidate frame / face rectangular frame.
The face probability is the probability that a picture in the face picture set, i.e. a part of the detection picture that has been classified as a face picture, actually contains a face. The face probability is compared with the set probability threshold, and face candidate frames whose probability is below the threshold are deleted, giving the screened face candidate frames. The second offset is calculated by the SoftmaxWithLoss layer and the Euclidean Loss layer; it comprises the picture classification offset arising during detection, the offset of the detected face frame and the offset of the detected facial feature points, and these offsets are used to regression-correct the screened face candidate frames, giving the corrected face candidate frames. The corrected face candidate frames are then merged by the non-maximum suppression algorithm: the face frames are sorted by face probability, the overlap between the frame with the highest probability and each of the other frames is calculated, and any frame whose overlap exceeds a certain threshold is deleted. This achieves the merging of frames and yields the first face candidate frame / second face candidate frame / face rectangular frame. Screening, regression correction and frame merging further improve the recall rate and accuracy of face detection while preserving detection speed.
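The screening and merging steps (S701 and S703) can be sketched as below; the regression correction of S702 is left out. The text speaks only of an "overlap" between frames, so measuring it as intersection over union (IoU) is an assumption, as are the function names, the (x1, y1, x2, y2) frame layout and the threshold values 0.6 and 0.5.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_by_probability(frames, probs, prob_threshold):
    """S701: keep frames whose face probability exceeds the threshold."""
    return [(f, p) for f, p in zip(frames, probs) if p > prob_threshold]


def non_max_suppression(scored_frames, overlap_threshold):
    """S703: repeatedly keep the most probable frame and drop its overlaps."""
    remaining = sorted(scored_frames, key=lambda fp: fp[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [
            fp for fp in remaining if iou(best[0], fp[0]) <= overlap_threshold
        ]
    return kept


frames = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (200, 200, 220, 220)]
probs = [0.95, 0.90, 0.80, 0.30]

screened = filter_by_probability(frames, probs, prob_threshold=0.6)
merged = non_max_suppression(screened, overlap_threshold=0.5)

print([f for f, _ in merged])
```

In this example the fourth frame is removed by the probability threshold, and the second frame is removed by suppression because it overlaps heavily with the more probable first frame.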
The face detection method based on a three-level convolutional neural network provided by the present invention has the following advantages:
1) During training, the training results of the preceding n levels are added to the input of the following level, which compensates for missing training data, thereby improving the accuracy and recall rate of face detection and the performance of the overall network;
2) Facial feature points are added to the training sample; they improve face classification and the positioning precision of the face rectangular frame, so that the result approaches the upper limit of the network, and the recall rate and accuracy of face detection are further improved;
3) Only the classification offset within the calculated first (second) offset is used for regression correction of picture classification, which ensures that correctly classified parts undergo no further regression correction, thereby improving the speed of face detection and further exploiting the performance of the network.
As shown in Figure 8, an embodiment of the present invention also provides a face detection system based on a three-level convolutional neural network, including a three-level convolutional neural network, the three-level convolutional neural network comprising:
An acquiring unit 1, used to obtain a training sample and a detection picture; the training sample at least includes face pictures annotated with facial feature points;
A network training unit 2, used to input the training sample into the three-level convolutional neural network for level-by-level training;
It includes a feature vector module and a regression correction module;
The feature vector module 21, used to perform prediction and then dimension reduction according to the training sample and the training results of the preceding n levels to obtain the corresponding two-dimensional feature vector, and to calculate the first offset from it;
The regression correction module 22, used to perform regression correction on the two-dimensional feature vector using the first offset to obtain the corresponding training result;
A face detection unit 3, used to input the detection picture into the trained three-level convolutional neural network for level-by-level face detection to obtain the face rectangular frame.
The face detection system based on a three-level convolutional neural network provided by the present invention has the following advantages:
1) The second-level and third-level networks in the network training unit 2 (or the face detection unit 3) compensate for the poor performance of the first-level network and improve the accuracy of picture classification, thereby improving the recall rate and accuracy of face detection and the performance of the overall network;
2) Facial feature points are added to the face pictures in the training sample of the acquiring unit 1; they improve face classification and the positioning precision of the face rectangular frame, so that the result approaches the upper limit of the network, and the recall rate and accuracy of face detection are further improved;
3) Only the classification offset obtained through the cooperation of the feature vector module 21 and the regression correction module 22 is used for regression correction of picture classification, which ensures that correctly classified parts need no correction, thereby improving the speed of face detection and further exploiting the performance of the network.
Only certain exemplary embodiments of the present invention have been described above by way of illustration. A person of ordinary skill in the art can, without departing from the spirit and scope of the present invention, modify the described embodiments in various ways. The above drawings and description are therefore illustrative in nature and should not be construed as limiting the scope of the claims of the present invention.

Claims (10)

1. A face detection method based on a three-level convolutional neural network, characterised in that it comprises the following steps:
Obtaining a training sample and a detection picture; the training sample at least includes face pictures annotated with face frames and facial feature points;
Inputting the training sample into the three-level convolutional neural network for level-by-level training, the training process being:
Performing prediction and then dimension reduction according to the training sample and the training results of the preceding n levels to obtain the corresponding two-dimensional feature vector, and calculating the first offset from it;
Performing regression correction on the two-dimensional feature vector using the first offset to obtain the corresponding training result;
Inputting the detection picture into the trained three-level convolutional neural network for level-by-level face detection to obtain the face rectangular frame.
2. The face detection method according to claim 1, characterised in that the face pictures in the training sample also carry a picture classification label.
3. The face detection method according to claim 1, characterised in that obtaining the two-dimensional feature vector comprises the following steps:
Obtaining an m-dimensional feature vector according to the training sample and the training results of the preceding n levels;
Performing dimension reduction on the m-dimensional feature vector by a fully convolutional layer/fully connected layer to obtain the two-dimensional feature vector.
4. The face detection method according to claim 1, characterised in that the three-level convolutional neural network comprises a first-level network, a second-level network and a third-level network; the third-level network comprises a first branch, a second branch and a third branch; the second-level network comprises the first branch and the second branch; and the first branch is identical to the first-level network.
5. The face detection method according to claim 3 or 4, characterised in that, in the third-level network, obtaining the m-dimensional feature vector comprises the following steps:
Inputting the training sample and the training result of the previous level into the first branch to obtain the first feature vector, into the second branch to obtain the second feature vector, and into the third branch to obtain the third feature vector;
Concatenating the first feature vector, the second feature vector and the third feature vector to obtain the m-dimensional feature vector.
6. The face detection method according to claim 1, characterised in that obtaining the first offset comprises the following steps:
Inputting the two-dimensional feature vector into the SoftmaxWithLoss layer and calculating the classification offset;
Inputting the two-dimensional feature vector into the Euclidean Loss layer and calculating the face frame offset and the facial feature point offset.
7. The face detection method according to claim 6, characterised in that calculating the classification offset comprises the following steps:
Defining the two-dimensional feature vector; it is defined as $Z = \{z_1, z_2\}$, where
Performing classification by the softmax function; for two classes, it specializes to:
$$y_1 = h_\theta(z_1) = \frac{e^{z_1}}{\sum_{j=1}^{2} e^{z_j}}, \qquad y_2 = h_\theta(z_2) = \frac{e^{z_2}}{\sum_{j=1}^{2} e^{z_j}};$$
Calculating, by a loss function, the difference between the predicted two-dimensional feature vector and the training sample;
The loss function is:
Where … is calculated as …
Correction: …, where α is a coefficient.
8. The face detection method according to claim 1, characterised in that obtaining the face rectangular frame comprises the following steps:
Inputting the detection picture into the first-level network, then screening, regression-correcting and merging to obtain the first face candidate frame;
Inputting the first face candidate frame into the second-level network, then screening, regression-correcting and merging to obtain the second face candidate frame;
Inputting the second face candidate frame into the third-level network, then screening, regression-correcting and merging to obtain the face rectangular frame.
9. The face detection method according to claim 8, characterised in that the screening, regression correction and merging comprise the following steps:
According to the detection picture / first face candidate frame / second face candidate frame and the corresponding face probability, selecting the face candidate frames whose probability exceeds a set probability threshold;
Calculating the second offset from the face candidate frames obtained after screening, and performing regression correction on them using the second offset;
Merging the corrected face candidate frames by the non-maximum suppression algorithm to obtain the first face candidate frame / second face candidate frame / face rectangular frame.
10. A face detection system based on a three-level convolutional neural network, characterised in that it includes a three-level convolutional neural network, the three-level convolutional neural network comprising:
An acquiring unit, used to obtain a training sample and a detection picture; the training sample at least includes face pictures annotated with facial feature points;
A network training unit, used to input the training sample into the three-level convolutional neural network for level-by-level training;
It includes a feature vector module and a regression correction module;
The feature vector module, used to perform prediction and then dimension reduction according to the training sample and the training results of the preceding n levels to obtain the corresponding two-dimensional feature vector, and to calculate the first offset from it;
The regression correction module, used to perform regression correction on the two-dimensional feature vector using the first offset to obtain the corresponding training result;
A face detection unit, used to input the detection picture into the trained three-level convolutional neural network for level-by-level face detection to obtain the face rectangular frame.
CN201710078431.3A 2017-02-14 2017-02-14 Face detection method and system based on three-level convolutional neural network Active CN106874868B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710078431.3A CN106874868B (en) 2017-02-14 2017-02-14 Face detection method and system based on three-level convolutional neural network

Publications (2)

Publication Number Publication Date
CN106874868A true CN106874868A (en) 2017-06-20
CN106874868B CN106874868B (en) 2020-09-18

Family

ID=59167030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710078431.3A Active CN106874868B (en) 2017-02-14 2017-02-14 Face detection method and system based on three-level convolutional neural network

Country Status (1)

Country Link
CN (1) CN106874868B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107665355A (en) * 2017-09-27 2018-02-06 重庆邮电大学 A kind of agricultural pests detection method based on region convolutional neural networks
CN107679450A (en) * 2017-08-25 2018-02-09 珠海多智科技有限公司 Obstruction conditions servant's face recognition method based on deep learning
CN107688786A (en) * 2017-08-30 2018-02-13 南京理工大学 A kind of method for detecting human face based on concatenated convolutional neutral net
CN107784294A (en) * 2017-11-15 2018-03-09 武汉烽火众智数字技术有限责任公司 A kind of persona face detection method based on deep learning
CN107784288A (en) * 2017-10-30 2018-03-09 华南理工大学 A kind of iteration positioning formula method for detecting human face based on deep neural network
CN107808142A (en) * 2017-11-09 2018-03-16 北京小米移动软件有限公司 Eyeglass detection method and device
CN107886074A (en) * 2017-11-13 2018-04-06 苏州科达科技股份有限公司 A kind of method for detecting human face and face detection system
CN108363957A (en) * 2018-01-19 2018-08-03 成都考拉悠然科技有限公司 Road traffic sign detection based on cascade network and recognition methods
CN108509940A (en) * 2018-04-20 2018-09-07 北京达佳互联信息技术有限公司 Facial image tracking, device, computer equipment and storage medium
CN108921131A (en) * 2018-07-26 2018-11-30 中国银联股份有限公司 A kind of method and device generating Face datection model, three-dimensional face images
CN108960064A (en) * 2018-06-01 2018-12-07 重庆锐纳达自动化技术有限公司 A kind of Face datection and recognition methods based on convolutional neural networks
CN109344740A (en) * 2018-09-12 2019-02-15 上海了物网络科技有限公司 Face identification system, method and computer readable storage medium
CN109376693A (en) * 2018-11-22 2019-02-22 四川长虹电器股份有限公司 Method for detecting human face and system
CN109389105A (en) * 2018-12-20 2019-02-26 北京万里红科技股份有限公司 A kind of iris detection and viewpoint classification method based on multitask
CN109635693A (en) * 2018-12-03 2019-04-16 武汉烽火众智数字技术有限责任公司 A kind of face image detection method and device
CN109753931A (en) * 2019-01-04 2019-05-14 广州广电卓识智能科技有限公司 Convolutional neural network training method, system and face feature point detection method
CN110263852A (en) * 2019-06-20 2019-09-20 北京字节跳动网络技术有限公司 Data processing method, device and electronic equipment
CN110717481A (en) * 2019-12-12 2020-01-21 浙江鹏信信息科技股份有限公司 Method for realizing face detection by using cascaded convolutional neural network
CN111209819A (en) * 2019-12-30 2020-05-29 新大陆数字技术股份有限公司 Rotation-invariant face detection method, system equipment and readable storage medium
CN111382297A (en) * 2018-12-29 2020-07-07 杭州海康存储科技有限公司 Method and device for reporting user data of user side
CN112232215A (en) * 2020-10-16 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Railway wagon coupler yoke key joist falling fault detection method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740758A (en) * 2015-12-31 2016-07-06 上海极链网络科技有限公司 Internet video face recognition method based on deep learning
CN106096535A (en) * 2016-06-07 2016-11-09 广东顺德中山大学卡内基梅隆大学国际联合研究院 A kind of face verification method based on bilinearity associating CNN
CN106228137A (en) * 2016-07-26 2016-12-14 广州市维安科技股份有限公司 A kind of ATM abnormal human face detection based on key point location
CN106295476A (en) * 2015-05-29 2017-01-04 腾讯科技(深圳)有限公司 Face key point localization method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAOXIANG LI.ET AL: ""A Convolutional Neural Network Cascade for Face Detection"", 《2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679450A (en) * 2017-08-25 2018-02-09 珠海多智科技有限公司 Obstruction conditions servant's face recognition method based on deep learning
CN107688786A (en) * 2017-08-30 2018-02-13 南京理工大学 A kind of method for detecting human face based on concatenated convolutional neutral net
CN107665355A (en) * 2017-09-27 2018-02-06 重庆邮电大学 A kind of agricultural pests detection method based on region convolutional neural networks
CN107784288A (en) * 2017-10-30 2018-03-09 华南理工大学 A kind of iteration positioning formula method for detecting human face based on deep neural network
CN107808142A (en) * 2017-11-09 2018-03-16 北京小米移动软件有限公司 Eyeglass detection method and device
CN107886074A (en) * 2017-11-13 2018-04-06 苏州科达科技股份有限公司 A kind of method for detecting human face and face detection system
CN107886074B (en) * 2017-11-13 2020-05-19 苏州科达科技股份有限公司 Face detection method and face detection system
CN107784294A (en) * 2017-11-15 2018-03-09 武汉烽火众智数字技术有限责任公司 A kind of persona face detection method based on deep learning
CN107784294B (en) * 2017-11-15 2021-06-11 武汉烽火众智数字技术有限责任公司 Face detection and tracking method based on deep learning
CN108363957A (en) * 2018-01-19 2018-08-03 成都考拉悠然科技有限公司 Road traffic sign detection based on cascade network and recognition methods
CN108509940A (en) * 2018-04-20 2018-09-07 北京达佳互联信息技术有限公司 Facial image tracking, device, computer equipment and storage medium
CN108960064A (en) * 2018-06-01 2018-12-07 重庆锐纳达自动化技术有限公司 A kind of Face datection and recognition methods based on convolutional neural networks
CN108921131B (en) * 2018-07-26 2022-05-24 中国银联股份有限公司 A method and device for generating a face detection model and a three-dimensional face image
CN108921131A (en) * 2018-07-26 2018-11-30 中国银联股份有限公司 A kind of method and device generating Face datection model, three-dimensional face images
CN109344740A (en) * 2018-09-12 2019-02-15 上海了物网络科技有限公司 Face identification system, method and computer readable storage medium
CN109376693A (en) * 2018-11-22 2019-02-22 四川长虹电器股份有限公司 Method for detecting human face and system
CN109635693A (en) * 2018-12-03 2019-04-16 武汉烽火众智数字技术有限责任公司 A kind of face image detection method and device
CN109635693B (en) * 2018-12-03 2023-03-31 武汉烽火众智数字技术有限责任公司 Front face image detection method and device
CN109389105A (en) * 2018-12-20 2019-02-26 北京万里红科技股份有限公司 A kind of iris detection and viewpoint classification method based on multitask
CN109389105B (en) * 2018-12-20 2022-02-08 北京万里红科技有限公司 Multitask-based iris detection and visual angle classification method
CN111382297A (en) * 2018-12-29 2020-07-07 杭州海康存储科技有限公司 Method and device for reporting user data of user side
CN111382297B (en) * 2018-12-29 2024-05-17 杭州海康存储科技有限公司 User side user data reporting method and device
CN109753931A (en) * 2019-01-04 2019-05-14 广州广电卓识智能科技有限公司 Convolutional neural network training method, system and face feature point detection method
CN110263852A (en) * 2019-06-20 2019-09-20 北京字节跳动网络技术有限公司 Data processing method, device and electronic equipment
CN110263852B (en) * 2019-06-20 2021-10-08 北京字节跳动网络技术有限公司 Data processing method and device and electronic equipment
CN110717481B (en) * 2019-12-12 2020-04-07 浙江鹏信信息科技股份有限公司 Method for realizing face detection by using cascaded convolutional neural network
CN110717481A (en) * 2019-12-12 2020-01-21 浙江鹏信信息科技股份有限公司 Method for realizing face detection by using cascaded convolutional neural network
CN111209819A (en) * 2019-12-30 2020-05-29 新大陆数字技术股份有限公司 Rotation-invariant face detection method, system equipment and readable storage medium
CN112232215B (en) * 2020-10-16 2021-04-06 哈尔滨市科佳通用机电股份有限公司 Railway wagon coupler yoke key joist falling fault detection method
CN112232215A (en) * 2020-10-16 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Railway wagon coupler yoke key joist falling fault detection method

Also Published As

Publication number Publication date
CN106874868B (en) 2020-09-18

Similar Documents

Publication Publication Date Title
CN106874868A (en) A kind of method for detecting human face and system based on three-level convolutional neural networks
CN114648665B (en) Weak supervision target detection method and system
CN113378686A (en) Two-stage remote sensing target detection method based on target center point estimation
CN112200143A (en) Road disease detection method based on candidate area network and machine vision
WO2019140767A1 (en) Recognition system for security check and control method thereof
CN111368690A (en) Deep learning-based video image ship detection method and system under influence of sea waves
CN116778277B (en) Cross-domain model training method based on progressive information decoupling
Wang et al. Feature extraction and segmentation of pavement distress using an improved hybrid task cascade network
CN113704276B (en) Map updating method, device, electronic device and computer-readable storage medium
CN112651996B (en) Target detection tracking method, device, electronic equipment and storage medium
CN118968035B (en) A method for detecting small targets in target areas based on UAV images
CN110929746A (en) A deep neural network-based method for location, extraction and classification of electronic file titles
CN116977710A (en) Remote sensing image long tail distribution target semi-supervised detection method
CN102024149B (en) Method of object detection and training method of classifier in hierarchical object detector
CN113343989B (en) Target detection method and system based on self-adaption of foreground selection domain
CN115170611A (en) Complex intersection vehicle driving track analysis method, system and application
CN118823736A (en) A wildlife target detection method based on improved yolov8 and knowledge distillation
Han et al. MS-YOLOv8-based object detection method for pavement diseases
Mu et al. Small target detection in drone aerial images based on feature fusion
CN112287895A (en) Model construction method, recognition method and system for river drain outlet detection
Chen et al. All-in-one YOLO architecture for safety hazard detection of environment along high-speed railway
CN107247967A (en) A kind of vehicle window annual test mark detection method based on R CNN
Bi et al. DR-YOLO: An improved multi-scale small object detection model for drone aerial photography scenes based on YOLOv7
Dou et al. Analysis of vehicle and pedestrian detection effects of improved YOLOv8 model in drone-assisted urban traffic monitoring system
CN114863122A (en) Intelligent high-precision pavement disease identification method based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant