WO2018176281A1 - Sketch image generation method and device - Google Patents
- Publication number
- WO2018176281A1 (PCT/CN2017/078637)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- face
- sketch
- hair
- feature
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
Definitions
- the present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating a sketch image.
- the sketch portrait automatic generation refers to a process of automatically generating a face image with a sketch style by inputting a face image.
- the automatic generation technology of face sketch images has important applications in many fields.
- a sketch image generated from a photo of a suspect's ID card can be compared with a sketch image drawn according to a witness's description, thereby assisting the public security organ in determining the identity of the suspect; in the animation industry and the field of social networking, the technology is mainly used to render people's photos as stylized sketches.
- the current face sketch image automatic generation technology is mainly based on a synthetic method, that is, a complete sketch image is synthesized by using similar parts in the sample image with the input image.
- a database is created that includes a plurality of sample image blocks and a sketch image block corresponding to each sample image block, where each sample image block contains different face-related feature information, such as facial features and decorative information such as accessories, hair, and beards.
- the input image is divided into a plurality of image blocks; for each image block, a similar sample image block is searched for in the database and the corresponding sketch image block is obtained; all the acquired sketch image blocks are then combined into a sketch image.
- the multi-scale Markov Random Field (MRF) algorithm model is used to remove the edges between adjacent sub-blocks in the synthesized sketch image to obtain a relatively natural sketch image.
- the synthetic face-based sketch image automatic generation technology smoothes the synthesized sketch image by the MRF algorithm model, so that some features such as flaws and scars on the face of the synthesized sketch image are smoothed out.
- the synthesized sketch image does not retain the texture detail information in the face photo well.
- the synthesis-based face sketch image automatic generation technology usually needs to establish a sample database whose sample information is related to the face. However, the established sample database contains only a limited number of samples and cannot cover enough data, so when elements not included in the sample data appear in the face image, the technology cannot accurately generate the sketch image. The accuracy and generalization ability of the synthesis-based face sketch image automatic generation technology are therefore poor.
- the embodiment of the present application provides a method and a device for generating a sketch image, which are used to solve the problems that the face sketch image automatic generation technology in the prior art has low accuracy and poor generalization ability, and that sketch image generation is slow.
- an embodiment of the present application provides a method for generating a sketch image, which may be applied to an electronic device, including:
- facial sketch features in the face image are acquired by the P convolution layers of the first network branch in a pre-trained deep convolutional neural network model to obtain a facial structure sketch map, and hair sketch features in the face image are acquired by the P convolution layers of the second network branch in the deep convolutional neural network model to obtain a hair texture sketch map, where P is an integer greater than 0.
- the facial structure sketch map and the hair texture sketch map are then synthesized to obtain a sketch image of the face image.
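The overall flow described above, two convolutional branches followed by a synthesis step, can be sketched as follows. This is an illustrative NumPy sketch, not the trained model: `face_branch` and `hair_branch` are hypothetical placeholders standing in for the two trained P-layer branches, and the final combination here is a simple placeholder for the patent's per-pixel synthesis.

```python
import numpy as np

def face_branch(img):
    # Placeholder for the first (facial-structure) branch: a 4-neighbour
    # average stands in for the trained stack of P convolution layers.
    return (img + np.roll(img, 1, 0) + np.roll(img, -1, 0)
            + np.roll(img, 1, 1) + np.roll(img, -1, 1)) / 5.0

def hair_branch(img):
    # Placeholder for the second (hair-texture) branch: a horizontal
    # difference stands in for the trained texture extractor.
    return np.abs(img - np.roll(img, 1, 1))

def generate_sketch(img):
    s_struct = face_branch(img)   # facial structure sketch map
    s_texture = hair_branch(img)  # hair texture sketch map
    # Placeholder synthesis step; the patent combines the two maps
    # per pixel (weighted by a per-pixel hair probability).
    return np.maximum(s_struct, s_texture)

face_img = np.random.rand(8, 8)
sketch = generate_sketch(face_img)
```

The point of the structure is that the structure map and the texture map are produced independently and only merged at the end.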
- in a design based on a deep convolutional neural network, the network includes a first network branch for generating facial features and a second network branch for generating hair features. Effective feature expressions are learned from a large number of training samples, and a network model that can generate an accurate and natural face sketch image from the original image is trained, thereby realizing automatic generation of face sketch images.
- the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database. Instead, the first network branch of the deep convolutional neural network generates a structure sketch map including facial features, the second network branch generates a texture sketch map including hair features, and the structure sketch map and the texture sketch map are then synthesized to obtain the final face sketch image. This improves the accuracy and generalization ability of the face sketch image generation technology and reduces the workload in the face sketch image generation process, thereby improving the speed of face sketch image generation.
- each of the convolutional layers in the deep convolutional neural network model uses a rectified linear unit (ReLU) as its activation function.
- each of the convolutional layers in the deep convolutional neural network model uses a convolution kernel of size r x r.
- the first N convolutional layers of the first network branch are the same as or coincide with the first N convolutional layers of the second network branch, and the N is an integer greater than 0 and less than P.
- the first N convolution layers of the first network branch are the same as the first N convolution layers of the second network branch, or the first N convolution layers of the first network branch and the The first N convolutional layers of the second network branch share the first N convolutional layers in the deep convolutional neural network model.
- the first N convolution layers of the first network branch in the embodiment of the present application are the same as or coincide with the first N convolution layers of the second network branch, which improves the computational efficiency of the deep convolutional neural network model.
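The sharing described above can be expressed structurally. The minimal sketch below assumes illustrative values N = 4, P = 6, and 3 x 3 kernels (single-channel, for brevity); both branches run the same shared prefix of N layers, each with a ReLU activation, before diverging into branch-specific tails.

```python
import numpy as np

def relu(x):
    # Each convolutional layer uses ReLU as its activation function.
    return np.maximum(x, 0.0)

def conv_same(img, kernel):
    # Naive single-channel 'same'-size cross-correlation, zero-padded.
    r = kernel.shape[0]
    pad = r // 2
    padded = np.pad(img, pad)
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + r, j:j + r] * kernel)
    return out

rng = np.random.default_rng(0)
N, P = 4, 6  # illustrative layer counts (N shared layers, P per branch)
shared = [rng.standard_normal((3, 3)) * 0.1 for _ in range(N)]
face_tail = [rng.standard_normal((3, 3)) * 0.1 for _ in range(P - N)]
hair_tail = [rng.standard_normal((3, 3)) * 0.1 for _ in range(P - N)]

def run_branch(img, tail):
    x = img
    for k in shared:  # the first N layers are shared by both branches
        x = relu(conv_same(x, k))
    for k in tail:    # the last M = P - N layers are branch-specific
        x = relu(conv_same(x, k))
    return x

img = rng.random((8, 8))
s_face = run_branch(img, face_tail)
s_hair = run_branch(img, hair_tail)
```

Because the shared prefix is computed with one set of weights, its output can be evaluated once and fed to both tails, which is the computational saving the design points at.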
- the obtaining, by the P convolution layers of the first network branch in the deep convolutional neural network model, the facial sketch features in the face image includes:
- the first N convolutional layers of the first network branch filter the background features in the face image to be processed, and its last M convolutional layers obtain the facial structure sketch map; likewise, the first N convolutional layers of the second network branch filter the background features in the face image to be processed, and its last M convolutional layers obtain the hair texture sketch map. This improves the accuracy of the face sketch image generation technology as well as the speed at which face sketch images are generated.
- the convolution kernel size of the last M convolutional layers of the first network branch is equal to the convolution kernel size of the last M convolutional layers of the second network branch.
- making the convolution kernel size of the last M convolutional layers of the first network branch equal to that of the last M convolutional layers of the second network branch in the above design improves the accuracy of the face sketch image generation technology.
- the M is 2, the convolution kernels of the last two convolution layers of the first network branch are equal in size, and the convolution kernels of the last two convolution layers of the second network branch are equal in size.
- the N is 4, and filtering background features in the face image by the first N convolution layers of the first network branch in the deep convolutional neural network model includes:
- the background features in the horizontal direction and the vertical direction of the face image are filtered by the first convolution layer and the second convolution layer in the first N convolution layers of the first network branch in the deep convolutional neural network model; and
- the face image from which the background features have been filtered is smoothed in the horizontal and vertical directions by the third convolution layer and the fourth convolution layer.
- the first convolutional layer and the second convolutional layer in the first N convolutional layers of the first network branch filter the background features in the horizontal and vertical directions of the face image to be processed, and the third convolution layer and the fourth convolution layer smooth, in the horizontal and vertical directions, the face image from which the background features have been filtered, thereby improving the accuracy of the face sketch image generation technique and making the generated sketch images more natural.
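The division of labour described above, directional filtering followed by directional smoothing, can be illustrated with fixed 1-D kernels. This is an assumption for illustration only: in the patent the four layers learn their kernels during training, whereas here a difference kernel stands in for layers 1-2 and a smoothing kernel for layers 3-4.

```python
import numpy as np

def filt_rows(img, k):
    # Correlate each row with the 1-D kernel k, edge-padded to 'same' size.
    pad = len(k) // 2
    padded = np.pad(img, ((0, 0), (pad, pad)), mode='edge')
    return np.array([np.convolve(row, k[::-1], mode='valid')
                     for row in padded])

diff = np.array([1.0, 0.0, -1.0])         # background-filtering kernel
smooth = np.array([1.0, 2.0, 1.0]) / 4.0  # smoothing kernel

def filter_background(img):
    # Layers 1-2 (illustrative): filter flat background by taking
    # differences in the horizontal, then the vertical, direction.
    x = filt_rows(img, diff)
    x = filt_rows(x.T, diff).T
    # Layers 3-4 (illustrative): smooth the filtered image in the
    # horizontal, then the vertical, direction.
    x = filt_rows(x, smooth)
    x = filt_rows(x.T, smooth).T
    return x

flat = np.full((6, 6), 0.5)            # a uniform background patch
dot = np.zeros((6, 6)); dot[3, 3] = 1  # an isolated feature
```

On this toy input the uniform background is mapped to zero while the isolated feature survives, which is the qualitative behaviour the design attributes to the first four layers.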
- the convolution kernel size of the first convolutional layer is equal to the convolution kernel size of the second convolutional layer, and the convolution kernel size of the third convolutional layer is the same as the convolution kernel size of the fourth convolutional layer.
- the convolution kernel size of the first convolutional layer being equal to that of the second convolutional layer, and the convolution kernel size of the third convolutional layer being the same as that of the fourth convolutional layer, improves the accuracy of the face sketch image generation technology.
- the method further includes:
- S(i, j) is the pixel value of the pixel in the i-th row and j-th column of the sketch image of the face image;
- P_h(i, j) is the hair probability of the pixel in the i-th row and j-th column of the face image, that is, the probability that the pixel is a hair feature point;
- S_s(i, j) is the pixel value of the pixel in the i-th row and j-th column of the facial structure sketch map; and
- S_t(i, j) is the pixel value of the pixel in the i-th row and j-th column of the hair texture sketch map, where i and j are integers greater than 0.
- the synthesized sketch image not only retains the facial structure information well, but also retains the hair texture information well.
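Given those symbol definitions, a natural reading of the synthesis rule (the exact formula is not reproduced in this text, so the form below is an assumption consistent with the surrounding description) is a per-pixel blend weighted by the hair probability P_h:

```python
import numpy as np

def synthesize(s_s, s_t, p_h):
    """Assumed per-pixel synthesis: S(i,j) = P_h(i,j)*S_t(i,j)
    + (1 - P_h(i,j))*S_s(i,j). Where the hair probability is high the
    hair texture sketch dominates; elsewhere the facial structure
    sketch dominates."""
    return p_h * s_t + (1.0 - p_h) * s_s

s_struct = np.random.rand(4, 4)   # facial structure sketch map S_s
s_texture = np.random.rand(4, 4)  # hair texture sketch map S_t
p_hair = np.zeros((4, 4)); p_hair[0, :] = 1.0  # top row assumed hair
s = synthesize(s_struct, s_texture, p_hair)
```

With such a blend, hair pixels take their values from the texture map and non-hair pixels from the structure map, which matches the stated effect of retaining both kinds of information.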
- the deep convolutional neural network model is trained as follows:
- a training sample database is established, which includes a plurality of face sample images and a sketch sample image corresponding to each face sample image; and the deep convolutional neural network model is initialized, including its weights and offsets.
- in the K-th training process, the background features in the face sample image are filtered by the first N convolution layers of the deep convolutional neural network model that has been adjusted K-1 times, to obtain a face feature map of the face sample image, where K is an integer greater than 0;
- the weight and offset used in the K+1th training process are adjusted based on an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
- the deep convolutional neural network model is trained using a large number of face sample images, so that generating the sketch image of the face image to be processed no longer depends on a sample database; the sketch image of the face image can be generated directly by the trained deep convolutional neural network model. This improves the accuracy and generalization ability of the face sketch image generation technology and reduces the workload in the process of generating the face sketch image, thereby improving the speed of face sketch image generation.
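The training procedure above, iterative passes that adjust weights and offsets from the error against the sketch sample, can be sketched with a deliberately tiny stand-in model. This is a toy: one per-pixel weight map and one offset replace the full P-layer two-branch network, and plain gradient descent stands in for whatever optimizer the real training uses.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy stand-in for the model: one per-pixel weight map plus an offset,
# instead of the full P-layer two-branch network.
W = rng.standard_normal((4, 4)) * 0.1
b = 0.0

# Training sample database: (face sample image, sketch sample image) pairs.
samples = [(rng.random((4, 4)), rng.random((4, 4))) for _ in range(20)]

def loss():
    # Mean squared error between model output and the sketch samples.
    return float(np.mean([(W * f + b - s) ** 2 for f, s in samples]))

loss_before = loss()
lr = 0.05
for k in range(100):              # K-th training pass
    for face, sketch in samples:
        pred = W * face + b       # stand-in forward pass
        err = pred - sketch       # error vs. the sketch sample image
        # Adjust the weights and offset used in the (K+1)-th pass.
        W -= lr * err * face
        b -= lr * float(err.mean())
loss_after = loss()
```

The essential loop structure, forward pass, error against the paired sketch sample, weight/offset adjustment for the next pass, is what the claimed training process describes.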
- the background features in the face sample image are filtered by the first N convolution layers of the deep convolutional neural network model that has been adjusted K-1 times.
- the pixel value of any pixel point in the sketch average map is the average of the pixel values of the pixels at the same position as that pixel point across the sketch sample images in the training sample database;
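The sketch average map can be computed in one step. A minimal sketch, assuming all sketch sample images in the database are aligned to the same size so that positions correspond across images:

```python
import numpy as np

# Sketch sample images from the training sample database (assumed to
# share a common size so positions correspond across images).
sketch_samples = [np.random.rand(8, 8) for _ in range(5)]

# Each pixel of the sketch average map is the average of the pixel
# values at the same position across every sketch sample image.
sketch_avg = np.mean(np.stack(sketch_samples), axis=0)
```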
- the background features in the face enhancement image are filtered by the first N convolutional layers of the deep convolutional neural network model that has been adjusted K-1 times.
- the face enhancement image is obtained by enhancing the facial feature information and the hair feature information of the face sample image, which improves the accuracy of the face sketch image generation technique.
- the acquiring, by the last M convolution layers of the first network branch of the deep convolutional neural network model that has been adjusted K-1 times, the facial sketch features in the face feature map of the face sample image includes:
- the facial sketch feature in the facial enhancement feature map is acquired by the last M convolutional layers of the first network branch of the K-1 adjusted deep convolutional neural network model.
- the face sample image is enhanced by adding the pixel values of the image blocks that include facial feature information to the pixel values at the same positions in the corresponding target regions of the face feature map. Enhancing the facial feature information enables the synthesized sketch image to retain the facial structure information well.
- the acquiring, by the last M convolution layers of the second network branch of the deep convolutional neural network model that has been adjusted K-1 times, the hair sketch features in the face feature map of the face sample image includes:
- the hair sketch feature in the hair enhancement feature map is obtained by the last M convolution layers of the second network branch of the K-1 adjusted deep convolutional neural network model.
- the face sample image is enhanced by adding the pixel values of the image blocks that include hair feature information to the pixel values at the same positions in the corresponding target regions of the face feature map. Enhancing the hair feature information enables the synthesized sketch image to retain the hair texture information well.
- the obtaining an image block including facial feature information from the plurality of mutually overlapping image blocks includes:
- for each of the plurality of mutually overlapping image blocks, the face probability of each pixel point in the image block being a facial feature point is determined; when the number of pixel points whose face probability is not 0 is greater than a preset threshold, the image block is determined to be an image block including facial feature information.
- determining in this way whether each image block is an image block including facial feature information improves the accuracy of acquiring image blocks that include facial feature information.
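The selection rule above can be sketched directly. The block size, stride, and threshold below are illustrative assumptions, and `face_prob` stands for a hypothetical map of per-pixel face probabilities:

```python
import numpy as np

def overlapping_blocks(img, size, stride):
    # Divide the image into mutually overlapping blocks.
    h, w = img.shape
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            yield (i, j), img[i:i + size, j:j + size]

def facial_feature_blocks(face_prob, size=4, stride=2, threshold=3):
    # Keep a block when its count of pixels with non-zero face
    # probability exceeds the preset threshold.
    return [pos for pos, blk in overlapping_blocks(face_prob, size, stride)
            if np.count_nonzero(blk) > threshold]

face_prob = np.zeros((8, 8))
face_prob[3:6, 3:6] = 0.9  # assume a central facial region
picked = facial_feature_blocks(face_prob)
```

The same routine applies unchanged to hair feature blocks by substituting a hair-probability map for `face_prob`.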
- the obtaining an image block including hair feature information from the plurality of mutually overlapping image blocks includes:
- for each of the plurality of mutually overlapping image blocks, the hair probability of each pixel point in the image block being a hair feature point is determined; when the number of pixel points whose hair probability is not 0 is greater than a preset threshold, the image block is determined to be an image block including hair feature information.
- determining in this way whether each image block is an image block including hair feature information improves the accuracy of acquiring image blocks that include hair feature information.
- the embodiment of the present application provides a device for generating a sketch image, including:
- An obtaining module configured to obtain a face image to be processed
- a deep convolutional neural network model configured to acquire a facial structure sketch map and a hair texture sketch map from the face image acquired by the obtaining module;
- the deep convolutional neural network model is pre-trained and includes a first network branch module and a second network branch module;
- the first network branching module is configured to acquire a facial sketch feature in the facial image acquired by the acquiring module, to obtain a facial structure sketch map, where the first network branching module includes P convolutional layers.
- P is an integer greater than 0;
- the second network branching module is configured to obtain a hair sketch feature in the face image acquired by the acquiring module, to obtain a hair texture sketch map; and the second network branching module includes P convolution layers;
- a synthesizing module configured to synthesize the facial structure sketch map obtained by the first network branching module and the hair texture sketch map obtained by the second network branching module to obtain a sketch image of the facial image.
- the first N convolution layers of the P convolutional layers included in the first network branching module are the same as, or coincide with, the first N convolutional layers of the P convolutional layers included in the second network branching module, where N is an integer greater than 0 and less than P.
- the first network branching module is specifically configured to:
- the second network branch module is specifically configured to:
- the convolution kernel size of the last M convolutional layers of the first network branching module is equal to the convolution kernel size of the last M convolutional layers of the second network branching module.
- the N is 4, and when the first network branching module filters the background features in the face image through its first N convolution layers, it is specifically configured to:
- the convolution kernel size of the first convolutional layer is equal to the convolution kernel size of the second convolutional layer, and the convolution kernel size of the third convolutional layer is the same as the convolution kernel size of the fourth convolutional layer.
- the acquiring module is further configured to acquire a hair probability of each pixel point in the face image as a hair feature point;
- the synthesis module is specifically configured to:
- S(i, j) is the pixel value of the pixel in the i-th row and j-th column of the sketch image of the face image;
- P_h(i, j) is the hair probability of the pixel in the i-th row and j-th column of the face image, that is, the probability that the pixel is a hair feature point;
- S_s(i, j) is the pixel value of the pixel in the i-th row and j-th column of the facial structure sketch map; and
- S_t(i, j) is the pixel value of the pixel in the i-th row and j-th column of the hair texture sketch map, where i and j are integers greater than 0.
- the device further includes:
- a training module for training the deep convolutional neural network model by:
- a training sample database is established, which includes a plurality of face sample images and a sketch sample image corresponding to each face sample image; and the deep convolutional neural network model is initialized, including its weights and offsets.
- in the K-th training process, the background features in the face sample image are filtered by the first N convolution layers of the deep convolutional neural network model that has been adjusted K-1 times, to obtain a face feature map of the face sample image, where K is an integer greater than 0;
- the weight and offset used in the K+1th training process are adjusted based on an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
- when the training module filters the background features in the face sample image through the first N convolution layers of the deep convolutional neural network model that has been adjusted K-1 times during the K-th training process, it is specifically configured to:
- the pixel value of any pixel point in the sketch average map is the average of the pixel values of the pixels at the same position as that pixel point across the sketch sample images in the training sample database;
- the background features in the face enhancement image are filtered by the first N convolutional layers of the deep convolutional neural network model that has been adjusted K-1 times.
- the acquiring module is further configured to divide the face sample image into a plurality of mutually overlapping image blocks, and obtain image blocks including facial feature information from the plurality of mutually overlapping image blocks.
- when the training module acquires the facial sketch features in the face feature map of the face sample image through the last M convolution layers of the first network branch module of the deep convolutional neural network model that has been adjusted K-1 times, it is specifically configured to:
- the acquiring module is further configured to divide the face sample image into a plurality of mutually overlapping image blocks, and obtain image blocks including hair feature information from the plurality of mutually overlapping image blocks.
- when the training module acquires the hair sketch features in the face feature map of the face sample image through the last M convolution layers of the second network branch module of the deep convolutional neural network model that has been adjusted K-1 times, it is specifically configured to:
- when acquiring an image block including facial feature information from the plurality of mutually overlapping image blocks, the acquiring module is specifically configured to:
- for each of the plurality of mutually overlapping image blocks, determine the face probability of each pixel point in the image block being a facial feature point; when the number of pixel points whose face probability is not 0 is greater than a preset threshold, determine that the image block is an image block including facial feature information.
- when acquiring an image block including hair feature information from the plurality of mutually overlapping image blocks, the acquiring module is specifically configured to:
- for each of the plurality of mutually overlapping image blocks, determine the hair probability of each pixel point in the image block being a hair feature point; when the number of pixel points whose hair probability is not 0 is greater than a preset threshold, determine that the image block is an image block including hair feature information.
- in a design based on a deep convolutional neural network, the network includes a first network branch for generating facial features and a second network branch for generating hair features. Effective feature expressions are learned from a large number of training samples, and a network model that can generate an accurate and natural face sketch image from the original image is trained, thereby realizing automatic generation of face sketch images.
- the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database. Instead, the first network branch of the deep convolutional neural network generates a structure sketch map including facial features, the second network branch generates a texture sketch map including hair features, and the structure sketch map and the texture sketch map are then synthesized to obtain the final face sketch image. This improves the accuracy and generalization ability of the face sketch image generation technology and reduces the workload in the face sketch image generation process, thereby improving the speed of face sketch image generation.
- an embodiment of the present invention further provides a deep convolutional neural network model, where the model includes a first network branching module and a second network branching module;
- the first network branching module includes P convolution layers, and is configured to acquire facial sketch features in the acquired face image to obtain a facial structure sketch map, where P is an integer greater than 0.
- the second network branching module includes P convolution layers, and is configured to acquire hair sketch features in the acquired face image to obtain a hair texture sketch map.
- an embodiment of the present application further provides a terminal, where the terminal includes a processor and a memory, the memory is used to store a software program, and the processor is configured to read the software program stored in the memory and implement the method provided by the first aspect or any design of the first aspect.
- the electronic device can be a mobile terminal, a computer, or the like.
- the embodiment of the present application further provides a computer storage medium that stores a software program, where the software program, when read and executed by one or more processors, can implement the method provided by the first aspect or any design of the first aspect.
- FIG. 1 is a schematic flowchart diagram of a method for generating a sketch image according to an embodiment of the present application
- FIG. 2A is a schematic structural diagram of a first deep convolutional neural network model according to an embodiment of the present application
- FIG. 2B is a schematic structural diagram of another first deep convolutional neural network model according to an embodiment of the present application.
- FIG. 3 is a schematic flowchart of a method for filtering background features in a face image according to an embodiment of the present disclosure
- FIG. 4 is a schematic structural diagram of a second deep convolutional neural network model according to an embodiment of the present application.
- FIG. 5 is a schematic diagram of a process for generating a sketch image according to an embodiment of the present application.
- FIG. 6A is a view of four face images to be processed according to an embodiment of the present application;
- FIG. 6B is an effect diagram of sketch images generated from the four face images to be processed according to an embodiment of the present application;
- FIG. 7 is a schematic structural diagram of a first deep convolutional neural network model according to an embodiment of the present application.
- FIG. 8 is a schematic flowchart of a first deep convolutional neural network model training process according to an embodiment of the present disclosure
- FIG. 9 is a schematic diagram of a method for adding an image block according to an embodiment of the present application.
- FIG. 10 is a schematic structural diagram of a device for generating a sketch image according to an embodiment of the present application.
- FIG. 11 is a schematic structural diagram of a deep convolutional neural network model according to an embodiment of the present application.
- FIG. 12 is a schematic structural diagram of a terminal implementation manner according to an embodiment of the present disclosure.
- the embodiment of the present invention provides a method and a device for generating a sketch image, which are used to solve the problems that the face sketch image automatic generation technology in the prior art has low accuracy and poor generalization ability, and that sketch image generation is slow.
- the method and the device are based on the same inventive concept. Since the principles by which the method and the device solve the problem are similar, the implementations of the device and the method may refer to each other, and repeated descriptions are omitted.
- the embodiments of the present application can be applied to electronic devices, such as computers, tablets, notebooks, smart phones, servers, and the like.
- the fields of application of the embodiments of the present application include, but are not limited to, a face image field, a vehicle image field, a plant image field, or other types of image fields.
- when the embodiment of the present application is applied to the face image field, a plurality of face sample images are used in advance for training before face sketch images are generated; when applied to the vehicle image field, a plurality of vehicle sample images are used in advance for training before vehicle sketch images are generated; when applied to the plant image field, a plurality of plant sample images are used in advance for training before plant sketch images are generated; and when applied to other types of image fields, a plurality of other types of sample images are used in advance for training before other types of sketch images are generated.
- Embodiments of the present application can be used to generate grayscale images in addition to being used to generate sketch images.
- when the embodiment of the present application is applied to the field of face images, a plurality of face sketch sample images are used in advance for training before face grayscale images are generated; when applied to the vehicle image field, a plurality of vehicle sketch sample images are used in advance for training before vehicle grayscale images are generated; when applied to the plant image field, a plurality of plant sketch sample images are used in advance for training before plant grayscale images are generated; and when applied to other types of image fields, a plurality of other types of sketch sample images are used in advance for training before other types of grayscale images are generated.
- a convolutional neural network is a multi-layered neural network, each layer consisting of multiple two-dimensional planes, each of which consists of multiple independent neurons.
- a neuron can be considered as one pixel.
- FIG. 1 is a flowchart of a method for generating a sketch image according to an embodiment of the present application. The method is performed by an electronic device and includes the following steps:
- Step S101: Acquire a face image to be processed.
- in step S101, the manner of acquiring the face image to be processed includes, but is not limited to, collecting the face image to be processed through a sensing device, acquiring the face image to be processed from a database, and the like.
- the sensing device includes, but is not limited to, a light sensing device, an imaging device, an acquisition device, and the like.
- the database includes, but is not limited to, a local database, a cloud database, a USB flash drive, a hard disk, and the like.
- Step S102: Acquire the facial sketch features in the face image through the P convolution layers of the first network branch in the pre-trained first deep convolutional neural network model to obtain a facial structure sketch map, where P is an integer greater than 0.
- Step S103: Acquire the hair sketch features in the face image through the P convolution layers of the second network branch in the first deep convolutional neural network model to obtain a hair texture sketch map.
- Step S104: Synthesize the facial structure sketch map and the hair texture sketch map to obtain a sketch image of the face image.
- step S102 and step S103 are not strictly sequential: step S102 may be performed first, step S103 may be performed first, or step S102 and step S103 may be performed simultaneously. This is not specifically limited in this embodiment.
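- the flow of steps S101–S104 can be sketched as follows. The two branch functions below are placeholder stand-ins for the pre-trained network branches (an assumption purely for illustration; the real branches are trained convolutional networks), and the final blend anticipates the per-pixel synthesis described later:

```python
import numpy as np

# Illustrative stand-ins for the two pre-trained network branches
# (assumptions for this sketch; the real branches are trained CNNs).
def face_branch(face):
    # step S102 stand-in: produce a facial structure sketch map
    return np.clip(face * 0.8, 0.0, 1.0)

def hair_branch(face):
    # step S103 stand-in: produce a hair texture sketch map
    return np.clip(face * 0.5, 0.0, 1.0)

def generate_sketch(face, hair_prob):
    s_structure = face_branch(face)   # step S102
    s_texture = hair_branch(face)     # step S103
    # step S104: per-pixel synthesis weighted by the hair probability
    return hair_prob * s_texture + (1.0 - hair_prob) * s_structure
```

Because the two branches are independent of each other, steps S102 and S103 can indeed run in either order or in parallel without changing the result.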
- based on a deep convolutional neural network design that includes a first network branch for generating facial features and a second network branch for generating hair features, effective feature expressions are learned from a large number of training samples, and a network model that can generate an accurate and natural face sketch image of the original image is trained, realizing automatic generation of face sketch images.
- the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database. Instead, the first network branch in the deep convolutional neural network generates a structure sketch map including facial features, the second network branch generates a texture sketch map including hair features, and the structure sketch map and the texture sketch map are then synthesized to obtain the final face sketch image. This improves the accuracy and generalization ability of the face sketch image generation technology and reduces the workload in the face sketch image generation process, thereby increasing the speed of face sketch image generation.
- the first deep convolutional neural network model may further include an input layer before the P convolution layers of the first network branch and the P convolution layers of the second network branch, where the number of filter channels of the input layer is 3.
- the electronic device processes the face image to be processed through the input layer to obtain three images: an image of the red (R) element, an image of the green (G) element, and an image of the blue (B) element. The image of the R element, the image of the G element, and the image of the B element are input to the first convolution layer.
- the first deep convolutional neural network model may also extract element features and generate images separately for the luminance-chrominance (YUV) elements.
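- the channel separation performed by the input layer can be illustrated as follows; the H×W×3 shape and channel ordering are illustrative assumptions:

```python
import numpy as np

# Minimal illustration of the 3-channel input layer: an H×W×3 face image
# is separated into R-element, G-element and B-element images before the
# first convolution layer.
def split_rgb(image):
    return image[..., 0], image[..., 1], image[..., 2]

img = np.zeros((4, 4, 3))
img[..., 0] = 1.0                     # a pure-red test image
r_img, g_img, b_img = split_rgb(img)  # three single-element images
```

The same splitting pattern would apply to a YUV representation, with the three slices holding the Y, U, and V elements instead.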
- each convolution layer in the first deep convolutional neural network model may use a Rectified Linear Unit (ReLU) as its activation function.
- the convolution kernel (Conv) used in each convolution layer of the first deep convolutional neural network model may be of size A×B, where A and B are positive integers that may be equal or unequal; this is not specifically limited in this embodiment.
- each convolution layer in the first deep convolutional neural network model has one or more feature maps, and the number of output feature maps is related to the number of input feature maps and the number of filter channels. For example, after an input face image passes through the three filter channels of the input layer, three feature maps are obtained.
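- the channel-count relation can be checked with a toy example; the three 1×1 point filters below stand in for the input layer's three filter channels (an assumption purely for illustration):

```python
import numpy as np

# A layer with C filter channels applied to one input image yields C
# output feature maps; here C = 3, mimicking the input layer.
filters = [1.0, 0.5, 0.0]                  # 3 filter channels (toy weights)
img = np.ones((2, 2))                      # one input face image
feature_maps = [w * img for w in filters]  # one feature map per channel
```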
- the first network branch and the second network branch in the embodiment of the present application may be two independent branches, as shown in FIG. 2A; alternatively, the first network branch and the second network branch may share the first N convolution layers of the first deep convolutional neural network model, as shown in FIG. 2B. That is, the first N convolution layers of the first network branch and the first N convolution layers of the second network branch are the same, where N is an integer greater than 0 and less than P.
- whether the first network branch and the second network branch are two independent branches or share the first N convolution layers, the first N convolution layers of the first network branch and the first N convolution layers of the second network branch are used to filter the background features in the face image to obtain a face feature map.
- taking N as 4 as an example, the process of obtaining a face feature map by filtering the background features in the face image through the first N convolution layers is described in detail below:
- the convolution kernel size of the first convolutional layer is equal to the convolution kernel size of the second convolutional layer.
- the first convolution layer may be a convolution layer for filtering the background features in the horizontal direction of the face image, with the second convolution layer being a convolution layer for filtering the background features in the vertical direction of the face image; alternatively, the first convolution layer may be a convolution layer for filtering the background features in the vertical direction, with the second convolution layer filtering the background features in the horizontal direction. That is, the order of filtering the background features in the horizontal direction and in the vertical direction of the face image is not specifically limited in the embodiment of the present application.
- the convolution kernel size of the third convolutional layer is equal to the convolution kernel size of the fourth convolutional layer.
- the third convolution layer may be a convolution layer for smoothing the background-filtered face image in the horizontal direction, with the fourth convolution layer being a convolution layer for smoothing the background-filtered face image in the vertical direction; alternatively, the third convolution layer may smooth in the vertical direction, with the fourth convolution layer smoothing in the horizontal direction. That is, the order of the smoothing process in the horizontal direction and the smoothing process in the vertical direction is not specifically limited in the embodiment of the present application.
- in the embodiment of the present application, the first N convolution layers of the first network branch are the same as, or shared with, the first N convolution layers of the second network branch in the first deep convolutional neural network model, which improves the computational efficiency of the first deep convolutional neural network model. The first convolution layer and the second convolution layer of the first N convolution layers filter the background features of the face image to be processed in the horizontal direction and the vertical direction, and the third convolution layer and the fourth convolution layer smooth the background-filtered face image in the horizontal direction and the vertical direction. This improves the accuracy of the face sketch image generation technology and makes the generated sketch image more natural.
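- the separable horizontal-then-vertical filtering and smoothing described for layers 1–4 can be illustrated with hand-picked kernels. These kernels (a crude gradient and a box blur) are assumptions; the patent's layers use learned kernels:

```python
import numpy as np

# Separable filtering in the spirit of layers 1-4: horizontal and
# vertical passes with a gradient kernel, then horizontal and vertical
# smoothing passes.
def conv_rows(img, k):
    pad = len(k) // 2
    padded = np.pad(img, ((0, 0), (pad, pad)), mode="edge")
    return np.array([np.convolve(row, k, mode="valid") for row in padded])

def conv_cols(img, k):
    return conv_rows(img.T, k).T

edge_k = np.array([1.0, 0.0, -1.0])        # background-suppressing kernel
smooth_k = np.array([1.0, 2.0, 1.0]) / 4.0 # smoothing kernel

img = np.ones((6, 6))             # featureless (background-only) input
feat = conv_rows(img, edge_k)     # layer 1: horizontal filtering
feat = conv_cols(feat, edge_k)    # layer 2: vertical filtering
feat = conv_rows(feat, smooth_k)  # layer 3: horizontal smoothing
feat = conv_cols(feat, smooth_k)  # layer 4: vertical smoothing
```

On this constant background image the gradient passes produce all zeros, illustrating how flat background regions are suppressed while edges (facial structure) would survive.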
- the number of filter channels of the first convolution layer is a, the number of filter channels of the second convolution layer is b, the number of filter channels of the third convolution layer is c, and the number of filter channels of the fourth convolution layer is d. Here a and b may both be positive integers greater than or equal to 100 and less than or equal to 200, with a equal to b; c and d may both be positive integers greater than or equal to 1 and less than or equal to 100, with c equal to d. The number of filter channels of each convolution layer is not specifically limited in this embodiment.
- the convolution kernel size of the fifth convolutional layer of the first network branch is equal to the convolution kernel size of the sixth convolutional layer.
- the fifth convolution layer of the first network branch may be a convolution layer for acquiring the face sketch features in the horizontal direction in the face feature map, with the sixth convolution layer being a convolution layer for acquiring the face sketch features in the vertical direction in the face feature map; alternatively, the fifth convolution layer of the first network branch may be a convolution layer for acquiring the face sketch features in the vertical direction in the face feature map, with the sixth convolution layer acquiring the face sketch features in the horizontal direction. That is, the order of acquiring the face sketch features in the horizontal direction and in the vertical direction in the face feature map is not specifically limited in the embodiment of the present application.
- the convolution kernel size of the fifth convolutional layer of the second network branch is equal to the convolution kernel size of the sixth convolutional layer.
- the fifth convolution layer of the second network branch may be a convolution layer for acquiring the hair sketch features in the horizontal direction in the face feature map, with the sixth convolution layer being a convolution layer for acquiring the hair sketch features in the vertical direction in the face feature map; alternatively, the fifth convolution layer of the second network branch may be a convolution layer for acquiring the hair sketch features in the vertical direction, with the sixth convolution layer acquiring the hair sketch features in the horizontal direction. That is, the order of acquiring the hair sketch features in the horizontal direction and in the vertical direction in the face feature map is not specifically limited in the embodiment of the present application.
- the convolution kernel size of the last M convolution layers of the first network branch is equal to the convolution kernel size of the last M convolution layers of the second network branch.
- taking N as 4 and M as 2 as an example, the convolution kernel size of the fifth convolution layer of the first network branch is equal to the convolution kernel size of the fifth convolution layer of the second network branch, and the convolution kernel size of the sixth convolution layer of the first network branch is equal to the convolution kernel size of the sixth convolution layer of the second network branch.
- the number of filter channels of the fifth convolution layer of the first network branch, the sixth convolution layer of the first network branch, the fifth convolution layer of the second network branch, and the sixth convolution layer of the second network branch may all be 1.
- the synthesis may be performed based on the hair probability that each pixel point is a hair feature point. Specifically, before the facial structure sketch map and the hair texture sketch map are synthesized to obtain the sketch image of the face image, the hair probability that each pixel point in the face image is a hair feature point is acquired.
- the facial structure sketch map and the hair texture sketch map are synthesized to obtain the sketch image of the face image according to the following formula:
- S(i, j) = P_h(i, j) × S_t(i, j) + (1 − P_h(i, j)) × S_s(i, j)
- where S(i, j) is the pixel value of the pixel in the i-th row and j-th column of the sketch image of the face image; P_h(i, j) is the hair probability that the pixel in the i-th row and j-th column of the face image is a hair feature point; S_s(i, j) is the pixel value of the pixel in the i-th row and j-th column of the facial structure sketch map; S_t(i, j) is the pixel value of the pixel in the i-th row and j-th column of the hair texture sketch map; and i, j are integers greater than 0.
- by synthesizing the facial structure sketch map and the hair texture sketch map based on the hair probability to obtain the sketch image of the face image, the synthesized sketch image not only retains the facial structure information well but also preserves the hair texture information well.
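- one natural reconstruction of the per-pixel blend implied by the symbol definitions above (the exact formula is an assumption here) is S(i, j) = P_h(i, j)·S_t(i, j) + (1 − P_h(i, j))·S_s(i, j), which can be checked numerically:

```python
import numpy as np

# Per-pixel synthesis: hair pixels are drawn from the hair texture sketch
# map, the remaining pixels from the facial structure sketch map.
def synthesize(s_structure, s_texture, hair_prob):
    return hair_prob * s_texture + (1.0 - hair_prob) * s_structure

s_s = np.full((2, 2), 0.2)    # facial structure sketch map
s_t = np.full((2, 2), 0.9)    # hair texture sketch map
p_h = np.array([[1.0, 0.0],   # hair probability per pixel
                [0.5, 0.0]])
s = synthesize(s_s, s_t, p_h)
```

A pixel with hair probability 1 takes its value entirely from the hair texture sketch map, a pixel with probability 0 entirely from the facial structure sketch map, and intermediate probabilities blend the two.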
- the hair probability of each pixel in the face image is obtained by using a second deep convolutional neural network model.
- the second deep convolutional neural network model may include seven connection layers. The first connection layer, the second connection layer, and the third connection layer each include a convolution layer with ReLU as the activation function and a convolution (Conv) kernel size of 5×5, a pooling layer with a kernel size of 3×3, and a Local Response Normalization (LRN) layer. The fourth connection layer includes a convolution layer with ReLU as the activation function and a Conv kernel size of 3×3, and the fifth connection layer includes a convolution layer with ReLU as the activation function and a Conv kernel size of 3×3.
- the sixth connection layer includes a convolution layer with ReLU as the activation function and a Conv kernel size of 1×1, and the seventh connection layer includes a convolution layer with ReLU as the activation function and a Conv kernel size of 1×1.
- the second deep convolutional neural network model may be trained in advance on sample images in the Helen face dataset.
- the first, second, and third connection layers are configured to acquire the hair features, facial features, and background features of the face image. The fourth and fifth connection layers are configured to acquire, from the face image with the hair features, facial features, and background features, the facial contour features, hair contour features, and background contour features in the horizontal direction and the vertical direction. The sixth and seventh connection layers are used to smooth, in the horizontal direction and the vertical direction, the face image with the facial contour, hair contour, and background contour features.
- when a pixel point is located in the area covered by the hair contour, its hair probability is 1 and its face probability and background probability are both 0; when a pixel point is located in the area covered by the face contour, its face probability is 1 and its hair probability and background probability are both 0; and when a pixel point is located in the area covered by the background contour, its background probability is 1 and its hair probability and face probability are both 0.
- the pixels in the area covered by the facial contour are facial feature points
- the pixels in the area covered by the hair contour are hair feature points
- the pixels in the area covered by the background contour are background feature points.
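- the mutually exclusive region probabilities described above can be illustrated with a toy label map; the 0/1/2 label encoding is an assumed convention purely for this example:

```python
import numpy as np

# Each pixel gets exactly one of the hair/face/background probabilities
# set to 1, according to which contour region covers it.
HAIR, FACE, BACKGROUND = 0, 1, 2

def region_probs(label_map):
    hair = (label_map == HAIR).astype(float)
    face = (label_map == FACE).astype(float)
    background = (label_map == BACKGROUND).astype(float)
    return hair, face, background

labels = np.array([[0, 1],
                   [2, 1]])
p_hair, p_face, p_bg = region_probs(labels)
```

The three probabilities sum to 1 at every pixel, matching the exclusive-region rule in the text.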
- through the four convolution layers of the second network branch of the first deep convolutional neural network model, the hair texture sketch map of the face image is acquired. Then, the face portion of the facial structure sketch map and the hair portion of the hair texture sketch map are obtained according to the hair probability that each pixel point is a hair feature point, and finally the face portion and the hair portion are synthesized into the sketch image of the face image.
- for example, there are four face images to be processed, as shown in FIG. 6A, and FIG. 6B is an effect diagram of the sketch images generated for the four face images to be processed shown in FIG. 6A; that is, the four face images to be processed shown in FIG. 6A are processed by the first deep convolutional neural network model, using the sketch generation method provided in the embodiment of the present application, to obtain the sketch images.
- the first deep convolutional neural network model used in the embodiment of the present application may be obtained by training an initialized deep convolutional neural network model in advance on a training sample database, where the training sample database includes several face sample images and the sketch sample image corresponding to each face sample image. The initialized first deep convolutional neural network model may include weights and offsets, or may include only weights with the offsets set to zero.
- the first four convolution layers of the first deep convolutional neural network model, shared by the first network branch and the second network branch, are convolution layers with ReLU as the activation function and a Conv kernel size of 5×5; the last two convolution layers of the first network branch, namely the fifth and sixth convolution layers of the first network branch, have ReLU as the activation function and a Conv kernel size of 3×3; and the last two convolution layers of the second network branch, namely the fifth and sixth convolution layers of the second network branch, have ReLU as the activation function and a Conv kernel size of 3×3.
- the above convolution kernel sizes are only examples and do not specifically limit the configuration of the convolution kernel sizes in the present application; the convolution kernel size is not specifically limited in the embodiment of the present application.
- the training process of the first deep convolutional neural network model is shown in FIG. 8:
- the weight configuration of the initialized first deep convolutional neural network model conforms to a Gaussian distribution with a mean of 0 and a variance of 0.01, and the offsets are configured to 0.
- the pixel values of the pixels at the same positions in the face sample image and the sketch average map are added to obtain a face enhancement image.
- the pixel value of any pixel point in the sketch average map is the average of the pixel values of the pixels at the same position in all the sketch sample images in the training sample database.
- the face enhancement image with the background features filtered is smoothed in the horizontal direction and the vertical direction.
- the face sample image is divided into a plurality of mutually overlapping image blocks, and an image block including facial feature information and an image block including hair feature information are acquired from the plurality of mutually overlapping image blocks.
- the number of image blocks including facial feature information is H, and the number of image blocks including hair feature information is Q, and both H and Q are positive integers.
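- the division of a sample image into mutually overlapping image blocks can be sketched as follows; the block size and stride are illustrative assumptions (a stride smaller than the block size is what makes adjacent blocks overlap, as the training step requires):

```python
import numpy as np

# Cut an image into mutually overlapping square blocks.
def overlapping_blocks(img, size=4, stride=2):
    h, w = img.shape
    blocks = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            blocks.append(img[top:top + size, left:left + size])
    return blocks

img = np.arange(64, dtype=float).reshape(8, 8)
blocks = overlapping_blocks(img)   # 3 x 3 = 9 overlapping 4x4 blocks
```

Each block would then be classified as containing facial feature information, hair feature information, or neither, as described in the implementation manners below.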
- in step 805, acquiring image blocks that include facial feature information from the plurality of mutually overlapping image blocks may be implemented in the following manners:
- Implementation manner 1: For each image block of the plurality of mutually overlapping image blocks, determine the face probability that each pixel in the image block is a facial feature point; when the number of pixels whose face probability is not 0 is greater than a preset threshold, determine that the image block is an image block including facial feature information.
- Implementation manner 2: Obtain image blocks including facial feature information from the plurality of mutually overlapping image blocks by a feature recognition method.
- the method for feature recognition may include a feature recognition method based on a local histogram, a feature recognition method based on a binarized histogram, and the like, which are not specifically limited in this embodiment of the present application.
- in step 805, acquiring image blocks that include hair feature information from the plurality of mutually overlapping image blocks may be implemented in the following manners:
- Implementation manner 1: For each image block of the plurality of mutually overlapping image blocks, determine the hair probability that each pixel in the image block is a hair feature point; when the number of pixels whose hair probability is not 0 is greater than a preset threshold, determine that the image block is an image block including hair feature information.
- Implementation manner 2: Obtain image blocks including hair feature information from the plurality of mutually overlapping image blocks by a feature recognition method.
- the method for feature recognition may include a feature recognition method based on a local histogram, a feature recognition method based on a binarized histogram, and the like, which are not specifically limited in this embodiment of the present application.
- f is a positive integer that is not more than H.
- Step S808: For the g-th image block including hair feature information, determine the target region corresponding to the g-th image block including hair feature information in the face feature map of the face sample image, and add the pixel values of the pixels at the same positions in the image block in the target region and in the g-th image block including hair feature information to obtain the g-th hair enhancement feature map.
- g is a positive integer that is not more than Q.
- there is no strict sequence between step S806 and step S808: step S806 may be performed first, step S808 may be performed first, or the two steps may be performed simultaneously. This is not specifically limited in this embodiment.
- the adjustment amounts of the weights and the offsets are determined according to the network learning rate and the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image, and the weights and offsets used in the (K+1)-th training process are then adjusted according to the adjustment amounts. The network learning rate determines the magnitude by which the weights and the offsets are adjusted each time.
- the network learning rate of the first deep convolutional neural network model may be k×10⁻¹⁰, where k is a positive integer not greater than 100.
- this is not specifically limited in this embodiment.
- if the loss function value of the first deep convolutional neural network model is greater than a preset threshold, the (K+1)-th training is performed; if the loss function value of the first deep convolutional neural network model is less than or equal to the preset threshold, training of the first deep convolutional neural network model is completed.
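- the Gaussian initialization and the threshold-based stopping rule can be sketched schematically. The quadratic toy loss and the plain gradient step below are assumptions standing in for the real network update; only the initialization and the while-loss-above-threshold control flow mirror the text:

```python
import numpy as np

# Weights drawn from a Gaussian with mean 0 and variance 0.01
# (standard deviation 0.1), offsets set to 0; training repeats while
# the loss value exceeds the preset threshold.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1)            # std 0.1  ->  variance 0.01
b = 0.0                             # offset initialised to 0
threshold, lr = 1e-6, 0.1

def loss(w, b):
    return (w - 1.0) ** 2 + b ** 2  # toy objective with optimum w=1, b=0

steps = 0
while loss(w, b) > threshold:       # perform the (K+1)-th training pass
    w -= lr * 2.0 * (w - 1.0)       # adjust weight by the learning rate
    b -= lr * 2.0 * b               # adjust offset by the learning rate
    steps += 1
```

When the loop exits, the loss value is at or below the preset threshold, corresponding to the "training is completed" condition.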
- the loss function of the first deep convolutional neural network model conforms to the following formula:
- L_g = L_s + λ × L_t
- where L_g is the loss function value of the first deep convolutional neural network model, L_s is the loss function value of the first network branch, L_t is the loss function value of the second network branch, and λ is a scalar parameter for maintaining a balance between the loss function value of the first network branch and the loss function value of the second network branch.
- the loss function value of the first network branch may be the Mean Squared Error (MSE) value of the first network branch, the Sum of Absolute Difference (SAD) value, the Mean Absolute Difference (MAD) value, or another error value; this is not specifically limited in the embodiment of the present application. Taking the loss function value of the first network branch as the MSE value of the first network branch as an example, the loss function value of the first network branch can be determined by the following formula:
- L_s(w_g, w_s) = (1 / |P_s|) × Σ_{p_s ∈ P_s} ‖ŝ_s(p_s; w_g, w_s) − s_s‖²
- where L_s is the loss function value of the first network branch; p_s is the f-th image block including facial feature information; s_s is the image block included in the target region, corresponding to the f-th image block including facial feature information, in the sketch sample image corresponding to the face sample image; P_s is the set of all image blocks including facial feature information; |P_s| is the number of all image blocks including facial feature information, i.e. |P_s| = H; ŝ_s(p_s; w_g, w_s) is the image block included in the target region, corresponding to the f-th image block including facial feature information, in the facial structure sketch map of the face sample image; w_g denotes the weights and offsets of the first N convolution layers of the first deep convolutional neural network model; and w_s denotes the weights and offsets of the last M convolution layers of the first network branch of the first deep convolutional neural network model.
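- one reading of the first-branch MSE loss (the exact normalization is an assumption) is the squared error between each generated facial block and the matching target block of the sketch sample, averaged over all H blocks containing facial feature information:

```python
import numpy as np

# Block-wise MSE averaged over all facial-feature image blocks.
def branch_mse(generated_blocks, target_blocks):
    errs = [np.mean((g - t) ** 2)
            for g, t in zip(generated_blocks, target_blocks)]
    return float(np.mean(errs))

gen = [np.zeros((2, 2)), np.ones((2, 2))]   # H = 2 generated blocks
tgt = [np.zeros((2, 2)), np.zeros((2, 2))]  # matching target blocks
l_s = branch_mse(gen, tgt)                  # (0 + 1) / 2
```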
- the loss function value of the second network branch may be a weighted combination of the MSE value and the Sorted Matching Mean Squared Error (SM) value of the second network branch, or may be another error value.
- SM(·) = MSE(sort(·)), where sort(·) is a sort function; that is, the SM value is the MSE computed after the pixel values are sorted.
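- under one reading of the SM definition (an assumption as to the exact formula), the Sorted Matching MSE sorts the pixel values of both blocks before computing the MSE, so it compares value distributions rather than pixel-aligned differences — useful for textures such as hair where exact pixel alignment matters less:

```python
import numpy as np

# Sorted Matching MSE: sort pixel values of both blocks, then MSE.
def mse(a, b):
    return float(np.mean((a - b) ** 2))

def sm(a, b):
    return mse(np.sort(a.ravel()), np.sort(b.ravel()))

a = np.array([[0.0, 1.0], [0.5, 0.2]])
b = np.array([[1.0, 0.0], [0.2, 0.5]])   # same values, permuted
```

Two blocks with the same value distribution but permuted pixels give SM = 0 while their plain MSE is nonzero.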
- the loss function value of the second network branch can be determined by the following formula:
- L_t(w_g, w_t) = (1 / |P_t|) × Σ_{p_t ∈ P_t} ( ‖ŝ_t(p_t; w_g, w_t) − s_t‖² + β × SM(ŝ_t(p_t; w_g, w_t), s_t) )
- where β is a scalar parameter; L_t is the loss function value of the second network branch; p_t is the g-th image block including hair feature information; s_t is the image block included in the target region, corresponding to the g-th image block including hair feature information, in the sketch sample image corresponding to the face sample image; P_t is the set of all image blocks including hair feature information; |P_t| is the number of all image blocks including hair feature information, i.e. |P_t| = Q; ŝ_t(p_t; w_g, w_t) is the image block included in the target region, corresponding to the g-th image block including hair feature information, in the hair texture sketch map of the face sample image; w_g denotes the weights and offsets of the first N convolution layers of the first deep convolutional neural network model; and w_t denotes the weights and offsets of the last M convolution layers of the second network branch of the first deep convolutional neural network model.
- when the loss function value of the first network branch is the MSE value of the first network branch and the loss function value of the second network branch is the weighted combination of the MSE value and the SM value of the second network branch, the loss function value of the first deep convolutional neural network model is determined by the following formula:
- L_g = L_s(w_g, w_s) + λ × L_t(w_g, w_t)
- an embodiment of the present application provides a sketch image generating apparatus 10, specifically for implementing the method described in the embodiments of FIG. 1 to FIG. 5, FIG. 7, and FIG.
- the structure is as shown in FIG. 10 and includes an obtaining module 11, a deep convolutional neural network model 12, and a synthesizing module 13, where:
- the obtaining module 11 is configured to acquire a face image to be processed.
- a deep convolutional neural network model 12, configured to acquire a facial structure sketch map and a hair texture sketch map from the face image acquired by the obtaining module 11; the deep convolutional neural network model 12 is pre-trained and includes a first network branching module 121 and a second network branching module 122.
- the structure of the deep convolutional neural network model 12 is as shown in FIG. 11:
- the first network branching module 121 is configured to acquire the facial sketch features in the face image acquired by the obtaining module 11 to obtain a facial structure sketch map; the first network branching module includes P convolution layers, where P is an integer greater than 0.
- the second network branching module 122 is configured to obtain a hair sketch feature in the face image acquired by the obtaining module 11 to obtain a hair texture sketch map; and the second network branching module includes P convolution layers.
- a synthesizing module 13, configured to synthesize the facial structure sketch map obtained by the first network branching module 121 and the hair texture sketch map obtained by the second network branching module 122 to obtain a sketch image of the face image.
- the first N convolution layers of the P convolution layers included in the first network branching module 121 and the first N convolution layers of the P convolution layers included in the second network branching module 122 are the same or shared, where N is an integer greater than 0 and less than P.
- the first network branching module 121 is specifically configured to filter the background features in the face image through the first N convolution layers of the first network branching module 121 to obtain a face feature map, and then acquire the facial sketch features in the face feature map through the last M convolution layers of the first network branching module 121.
- the second network branching module 122 is specifically configured to filter the background features in the face image through the first N convolution layers of the second network branch in the deep convolutional neural network model 12 to obtain a face feature map, and then acquire the hair sketch features in the face feature map through the last M convolution layers of the second network branch.
- P = M + N.
- the convolution kernel size of the last M convolutional layers of the first network branching module 121 is equal to the convolution kernel size of the last M convolutional layers of the second network branching module 122.
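The two-branch layout described in the modules above (shared first N convolution layers, then M branch-specific layers per branch, with P = M + N) can be sketched as follows. This is a minimal illustration, not the patented network: the kernel values, the 3×3 kernel size, N = 4, M = 2, and the single-channel NumPy convolution are all assumptions made for the example.

```python
import numpy as np

def conv2d(img, kernel):
    """'Same'-padded single-channel 2-D convolution, stride 1."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation, as used by each convolution layer in the model."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
N, M = 4, 2                                           # P = M + N layers per branch
shared = [rng.standard_normal((3, 3)) * 0.1 for _ in range(N)]       # first N layers
face_branch = [rng.standard_normal((3, 3)) * 0.1 for _ in range(M)]  # last M, branch 1
hair_branch = [rng.standard_normal((3, 3)) * 0.1 for _ in range(M)]  # last M, branch 2

def forward(img):
    x = img
    for k in shared:               # first N layers: filter background features
        x = relu(conv2d(x, k))
    face = x
    for k in face_branch:          # branch 1: facial sketch features
        face = relu(conv2d(face, k))
    hair = x
    for k in hair_branch:          # branch 2: hair sketch features
        hair = relu(conv2d(hair, k))
    return face, hair

img = rng.random((16, 16))
face_map, hair_map = forward(img)
```

Sharing the first N layers means the background-filtering work is done once per image rather than once per branch, which is the computational-efficiency point made for this design.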
- the N is 4, and when the first network branching module 121 filters the background features in the face image through its first N convolution layers, it is specifically configured to: filter the background features of the face image in the horizontal direction and the vertical direction through the first convolution layer and the second convolution layer of the first N convolution layers of the first network branching module 121, and then smooth, in the horizontal direction and the vertical direction, the face image from which the background features have been filtered, through the third convolution layer and the fourth convolution layer of the first N convolution layers of the first network branching module 121.
- the convolution kernel size of the first convolution layer is equal to the convolution kernel size of the second convolution layer, and the convolution kernel size of the third convolution layer is the same as the convolution kernel size of the fourth convolution layer.
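The N = 4 front end described above (layers 1 and 2 filter background features in the horizontal and vertical directions, layers 3 and 4 smooth in both directions, with layers 1/2 and 3/4 sharing kernel sizes) might be illustrated as below. The concrete kernels are assumptions for the example (Sobel-style directional filters and box smoothing); the patent fixes only the size relations.

```python
import numpy as np

def conv_same(img, kernel):
    """'Same'-padded single-channel 2-D convolution, stride 1."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical kernels: sizes of layers 1 and 2 match, as do layers 3 and 4.
k1 = np.array([[-1.0, 0.0, 1.0],
               [-2.0, 0.0, 2.0],
               [-1.0, 0.0, 1.0]])          # directional filter, horizontal direction
k2 = k1.T                                  # same kernel size, vertical direction
k3 = np.full((3, 3), 1.0 / 9.0)            # smoothing kernel
k4 = k3.copy()                             # same kernel size as layer 3

img = np.random.default_rng(2).random((8, 8))
x = conv_same(img, k1)   # layer 1: filter background features, horizontal direction
x = conv_same(x, k2)     # layer 2: filter background features, vertical direction
x = conv_same(x, k3)     # layer 3: smooth the filtered image
x = conv_same(x, k4)     # layer 4: smooth the filtered image again
```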
- the obtaining module 11 is further configured to acquire a hair probability that each pixel point in the face image is a hair feature point.
- the synthesizing module 13 is specifically configured to synthesize the facial structure sketch map obtained by the first network branching module 121 and the hair texture sketch map obtained by the second network branching module 122 into the sketch image of the face image, where the sketch image meets the following formula:
- S(i,j) = (1 - P_h(i,j)) × S_S(i,j) + P_h(i,j) × S_t(i,j)
- where S(i,j) is the pixel value of the pixel in the i-th row and j-th column of the sketch image of the face image, P_h(i,j) is the hair probability of the pixel in the i-th row and j-th column, S_S(i,j) is the pixel value of the pixel in the i-th row and j-th column of the facial structure sketch map, and S_t(i,j) is the pixel value of the pixel in the i-th row and j-th column of the hair texture sketch map, i and j both being integers greater than 0.
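The synthesis described above is a per-pixel blend of the two sketch maps weighted by the hair probability. A minimal sketch with made-up pixel values (assumes probabilities in [0, 1] and grayscale pixel values):

```python
import numpy as np

def synthesize(structure, texture, hair_prob):
    """Per-pixel blend: S = (1 - P_h) * S_S + P_h * S_t."""
    return (1.0 - hair_prob) * structure + hair_prob * texture

structure = np.array([[200.0, 180.0], [190.0, 170.0]])  # facial structure sketch map
texture   = np.array([[ 40.0,  60.0], [ 50.0,  80.0]])  # hair texture sketch map
hair_prob = np.array([[  0.0,   0.5], [  1.0,  0.25]])  # per-pixel hair probability
sketch = synthesize(structure, texture, hair_prob)
# pixels with hair_prob 0 keep the structure value; pixels with hair_prob 1
# take the texture value; intermediate probabilities mix the two
```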
- the device further includes:
- the training module 14 is configured to train the deep convolutional neural network model 12 by:
- the deep convolutional neural network model 12 includes weights and offsets.
- in the K-th training process, the background features in the face sample image are filtered through the first N convolution layers of the deep convolutional neural network model 12 that has undergone K-1 adjustments, to obtain the face feature map of the face sample image, K being an integer greater than 0.
- a face structure sketch map of the face sample image and a hair texture sketch map of the face sample image are combined to obtain a sketch image of the face sample image.
- an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image is acquired.
- the weight and offset used in the K+1th training process are adjusted based on an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
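The adjustment step above (compute an error value between the generated sketch image and the sketch sample image, then adjust the weights and offsets used in the next pass) is ordinary error-driven training. The toy below replaces the full two-branch network with a hypothetical one-weight, one-offset linear model purely to show the loop structure; the learning rate, model, and data are all assumptions, not the patent's actual training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
face_sample = rng.random((8, 8))     # face sample image (toy stand-in)
sketch_sample = rng.random((8, 8))   # corresponding sketch sample image

w, b = 0.5, 0.1                      # the model's "weight and offset"
lr = 0.1                             # assumed learning rate
losses = []
for k in range(200):                 # K-th training pass
    generated = w * face_sample + b  # stand-in for the two-branch forward pass
    err = generated - sketch_sample  # error vs. the sketch sample image
    losses.append(float(np.mean(err ** 2)))
    # adjust the weight and offset used in the (K+1)-th pass
    w -= lr * 2.0 * float(np.mean(err * face_sample))
    b -= lr * 2.0 * float(np.mean(err))
```

With a small enough learning rate the error value decreases from pass to pass, which is what the per-iteration adjustment is meant to achieve.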
- when the training module 14 filters the background features in the face sample image through the first N convolution layers of the deep convolutional neural network model 12 that has undergone K-1 adjustments during the K-th training process, it is specifically configured to:
- add the pixel values of the pixels at the same position in the face sample image and a sketch average map to obtain a face enhancement image;
- where the pixel value of any pixel in the sketch average map is the average of the pixel values of the pixels at the same position in all the sketch sample images in the training sample database;
- the background features in the face enhancement image are then filtered through the first N convolution layers of the deep convolutional neural network model 12 that has undergone K-1 adjustments.
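The enhancement step above can be sketched directly: build the sketch average map as the pixel-wise mean over the sketch sample images in the training database, then add it to the face sample image. The image sizes and pixel values below are toy assumptions.

```python
import numpy as np

# toy training-sample database: three sketch sample images (assumed 4x4 grayscale)
sketch_samples = [np.full((4, 4), v) for v in (10.0, 30.0, 50.0)]

# sketch average map: pixel-wise mean over all sketch sample images
sketch_avg = np.mean(np.stack(sketch_samples), axis=0)

face_sample = np.full((4, 4), 100.0)      # face sample image
face_enhanced = face_sample + sketch_avg  # face enhancement image (pixel-wise sum)
```

Adding the average sketch strengthens the facial and hair regions of the sample before the background-filtering layers see it, which is the stated purpose of the enhancement.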
- the acquiring module 11 is further configured to divide the face feature map of the face sample image into a plurality of mutually overlapping image blocks, and to obtain a face enhancement feature map comprising those of the mutually overlapping image blocks that include facial feature information.
- when the training module 14 acquires the facial sketch features in the face feature map of the face sample image through the last M convolution layers of the first network branching module 121 of the deep convolutional neural network model 12 that has undergone K-1 adjustments, it is specifically configured to: acquire the facial sketch features in the face enhancement feature map through the last M convolution layers of the first network branching module 121 of the deep convolutional neural network model 12 that has undergone K-1 adjustments.
- the acquiring module 11 is further configured to divide the face feature map of the face sample image into a plurality of mutually overlapping image blocks, and to obtain a hair enhancement feature map comprising those of the mutually overlapping image blocks that include hair feature information.
- when the training module 14 acquires the hair sketch features in the face feature map of the face sample image through the last M convolution layers of the second network branching module 122 of the deep convolutional neural network model 12 that has undergone K-1 adjustments, it is specifically configured to: acquire the hair sketch features in the hair enhancement feature map through the last M convolution layers of the second network branching module 122 of the deep convolutional neural network model 12 that has undergone K-1 adjustments.
- when the acquiring module 11 acquires the image blocks that include facial feature information from the plurality of mutually overlapping image blocks, it is specifically configured to: determine, for each of the plurality of mutually overlapping image blocks, the face probability of each pixel in the image block being a facial feature point; and when the number of pixels whose face probability is not 0 is greater than a preset threshold, determine that the image block is an image block including facial feature information.
- when the obtaining module 11 acquires the image blocks that include hair feature information from the plurality of mutually overlapping image blocks, it is specifically configured to: determine, for each of the plurality of mutually overlapping image blocks, the hair probability of each pixel in the image block being a hair feature point; and when the number of pixels whose hair probability is not 0 is greater than a preset threshold, determine that the image block is an image block including hair feature information.
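The block-selection rule described in the two modules above (split the map into mutually overlapping blocks and keep a block when its count of pixels with non-zero face or hair probability exceeds a preset threshold) can be sketched as follows; the block size, stride, and threshold are assumptions made for the example.

```python
import numpy as np

def overlapping_blocks(img, size, stride):
    """Divide a 2-D map into mutually overlapping square blocks."""
    H, W = img.shape
    blocks = []
    for i in range(0, H - size + 1, stride):
        for j in range(0, W - size + 1, stride):
            blocks.append(img[i:i + size, j:j + size])
    return blocks

def select_feature_blocks(prob_map, size=4, stride=2, threshold=3):
    """Keep blocks whose count of non-zero-probability pixels exceeds the threshold."""
    return [b for b in overlapping_blocks(prob_map, size, stride)
            if int(np.count_nonzero(b)) > threshold]

prob = np.zeros((8, 8))
prob[0:4, 0:4] = 0.9   # region where the feature (face or hair) is present
kept = select_feature_blocks(prob, size=4, stride=2, threshold=3)
# only blocks overlapping the feature region by more than `threshold` pixels survive
```

The same selector serves both cases: pass the face-probability map to collect facial-feature blocks or the hair-probability map to collect hair-feature blocks.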
- each functional module in each embodiment of the present application may be integrated into one processing unit, or each module may physically exist alone, or two or more modules may be integrated into one module.
- the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
- the device may include a collector 1201, a processor 1202, and a memory 1203.
- the physical hardware corresponding to the deep convolutional neural network model 12, the synthesis module 13 and the training module 14 may be the processor 1202.
- the processor 1202 can be a central processing unit (English: central processing unit, CPU for short), or a digital processing unit or the like.
- the processor 1202 acquires a face image to be processed through the collector 1201.
- the memory 1203 is configured to store a program executed by the processor 1202.
- the specific connection medium between the above-mentioned collector 1201, processor 1202 and memory 1203 is not limited in the embodiment of the present application.
- for example, the memory 1203, the processor 1202, and the collector 1201 are connected by a bus 1204 in FIG. 12, where the bus is indicated by a thick line; the connection manner between other components is only schematically illustrated and is not limited thereto.
- the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 12, but it does not mean that there is only one bus or one type of bus.
- the memory 1203 may be a volatile memory (English: volatile memory), such as a random-access memory (English: random-access memory, abbreviation: RAM); the memory 1203 may also be a non-volatile memory (English: non-volatile memory), such as a read-only memory (English: read-only memory, abbreviation: ROM), a flash memory (English: flash memory), a hard disk (English: hard disk drive, abbreviation: HDD), or a solid-state drive (English: solid-state drive, abbreviation: SSD); or the memory 1203 is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
- the memory 1203 may also be a combination of the above memories.
- the processor 1202 is configured to execute the program code stored in the memory 1203, and is specifically configured to perform the methods described in the foregoing embodiments corresponding to FIG. 1 to FIG. 9; for details, refer to the corresponding embodiments in FIG. 1 to FIG. 9, which are not described herein again.
- based on a deep convolutional neural network, the embodiments of the present application design a structure that includes a first network branch for generating facial features and a second network branch for generating hair features, learn effective feature expressions from a large number of training samples, and train a network model that can generate an accurate and natural face sketch image from the original image, thereby realizing automatic generation of face sketch images.
- the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database; instead, a structure sketch map including facial features is generated through the first network branch in the deep convolutional neural network, a texture sketch map including hair features is generated through the second network branch in the deep convolutional neural network, and the structure sketch map and the texture sketch map are then synthesized to obtain the final face sketch image. This improves the accuracy and generalization ability of the face sketch image generation technology, and reduces the workload in the face sketch image generation process, thereby improving the speed of face sketch image generation.
- the embodiments of the present application can be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
- these computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
- these computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
Disclosed are a sketch image generation method and device, applicable for solving the problems that prior-art automatic face sketch image generation technology is less accurate, has poor generalization capability, and generates sketch images slowly. The method comprises: acquiring a face image to be processed; acquiring, by means of P convolutional layers of a first network branch in a pre-trained deep convolutional neural network model, facial sketch features of the face image so as to obtain a facial structure sketch image; acquiring, by means of P convolutional layers of a second network branch in the deep convolutional neural network model, hair sketch features of the face image so as to obtain a hair texture sketch image; and synthesizing the facial structure sketch image and the hair texture sketch image to obtain a sketch image of the face image.
Description
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating a sketch image.
Sketch portrait automatic generation refers to the process of automatically generating a sketch-style face image from an input face image.
The automatic generation technology of face sketch images has important applications in many fields. For example, in the field of public safety, a sketch image generated from a photo of a suspect's identity card can be compared with a sketch image drawn according to an eyewitness's description, thereby assisting the public security organ in determining the identity of the suspect. In the animation industry and the social networking field, the technology is mainly used to render photos of people in a sketch style.
The current automatic face sketch image generation technology is mainly based on a synthesis method, that is, a complete sketch image is synthesized from the parts of sample images that are similar to the input image.
Specifically, a database is first established that includes a large number of sample image blocks and a sketch image block corresponding to each sample image block, where each sample image block contains different face-related feature information, such as facial features, face decorations, hair, and beards. Secondly, the input image is divided into many image blocks; for each image block, sample image blocks similar to it are searched for in the database, the sketch image blocks corresponding to the similar sample image blocks are acquired, and all the acquired sketch image blocks are combined into one sketch image. Then, a multi-scale Markov Random Field (English: Markov Random Field, abbreviation: MRF) algorithm model is used to remove the edges between adjacent sub-blocks in the synthesized sketch image, so as to obtain a relatively natural sketch image.
However, the synthesis-based automatic face sketch generation technology smooths the synthesized sketch image through the MRF algorithm model, so that detail features such as moles and scars on the face are smoothed out, and the synthesized sketch image therefore does not retain the texture detail information of the face photo well. In addition, the synthesis-based technology usually needs to establish a sample database in which the feature information of the sample data is all face-related; since the amount of sample data in the established database is limited and cannot cover enough samples, the technology cannot accurately generate a sketch image when elements not included in the sample data appear in the face image. The synthesis-based automatic face sketch generation technology is therefore less accurate and has poor generalization ability. Moreover, when generating a sketch image from an original image, the technology must search and compare the original data against all sample image blocks and synthesize all the acquired sketch image blocks, and this large workload makes sketch image generation slow.
Summary of the Invention
The embodiments of the present application provide a method and an apparatus for generating a sketch image, which are used to solve the problems in the prior art that the automatic face sketch image generation technology is less accurate, has poor generalization ability, and generates sketch images slowly.
In a first aspect, an embodiment of the present application provides a method for generating a sketch image, which may be applied to an electronic device and includes:
After the electronic device acquires the face image to be processed, facial sketch features in the face image are acquired through P convolution layers of a first network branch in a pre-trained deep convolutional neural network model to obtain a facial structure sketch map, and hair sketch features in the face image are acquired through P convolution layers of a second network branch in the deep convolutional neural network model to obtain a hair texture sketch map, where P is an integer greater than 0.
The facial structure sketch map and the hair texture sketch map are then synthesized to obtain a sketch image of the face image.
In the embodiments of the present application, based on a deep convolutional neural network, a structure is designed that includes a first network branch for generating facial features and a second network branch for generating hair features; effective feature expressions are learned from a large number of training samples, and a network model that can generate an accurate and natural face sketch image from the original image is trained, thereby realizing automatic generation of face sketch images. Compared with the synthesis-based automatic face sketch generation technology in the prior art, the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database; instead, a structure sketch map including facial features is generated through the first network branch of the deep convolutional neural network, a texture sketch map including hair features is generated through the second network branch, and the structure sketch map and the texture sketch map are then synthesized to obtain the final face sketch image. This improves the accuracy and generalization ability of the face sketch image generation technology, reduces the workload in the face sketch image generation process, and thereby improves the speed of face sketch image generation.
Optionally, each convolution layer in the deep convolutional neural network model uses a Rectified Linear Unit (English: Rectified Linear Units, abbreviation: ReLU) as its activation function, and the convolution kernel size model used by each convolution layer in the deep convolutional neural network model is an r×r model.
In a possible design, the first N convolution layers of the first network branch are the same as, or coincide with, the first N convolution layers of the second network branch, where N is an integer greater than 0 and less than P.
Specifically, the first N convolution layers of the first network branch are the same as the first N convolution layers of the second network branch; or, the first N convolution layers of the first network branch and the first N convolution layers of the second network branch share the first N convolution layers of the deep convolutional neural network model.
In the embodiments of the present application, the first N convolution layers of the first network branch being the same as or coinciding with the first N convolution layers of the second network branch improves the computational efficiency of the deep convolutional neural network model.
In a possible design, the acquiring the facial sketch features in the face image through the P convolution layers of the first network branch in the deep convolutional neural network model includes:
filtering the background features in the face image through the first N convolution layers of the first network branch in the deep convolutional neural network model to obtain a face feature map; and
acquiring the facial sketch features in the face feature map through the last M convolution layers of the first network branch.
The acquiring the hair sketch features in the face image through the P convolution layers of the second network branch in the deep convolutional neural network model includes:
filtering the background features in the face image through the first N convolution layers of the second network branch in the deep convolutional neural network model to obtain a face feature map; and
acquiring the hair sketch features in the face feature map through the last M convolution layers of the second network branch,
where P = M + N.
In the above design, the first N convolution layers of the first network branch are used to filter the background features in the face image to be processed and its last M convolution layers are used to obtain the facial structure sketch map, while the first N convolution layers of the second network branch are used to filter the background features in the face image to be processed and its last M convolution layers are used to obtain the hair texture sketch map; this improves both the accuracy of the face sketch image generation technology and the speed of face sketch image generation.
In a possible design, the convolution kernel sizes of the last M convolution layers of the first network branch are correspondingly equal to the convolution kernel sizes of the last M convolution layers of the second network branch.
In the above design, the convolution kernel sizes of the last M convolution layers of the first network branch being correspondingly equal to those of the last M convolution layers of the second network branch improves the accuracy of the face sketch image generation technology.
In a possible design, M is 2, the convolution kernel sizes of the last two convolution layers of the first network branch are equal, and the convolution kernel sizes of the last two convolution layers of the second network branch are equal.
In a possible design, N is 4, and the filtering the background features in the face image through the first N convolution layers of the first network branch in the deep convolutional neural network model includes:
filtering the background features of the face image in the horizontal direction and the vertical direction through the first convolution layer and the second convolution layer of the first N convolution layers of the first network branch in the deep convolutional neural network model; and
smoothing, in the horizontal direction and the vertical direction, the face image from which the background features have been filtered, through the third convolution layer and the fourth convolution layer of the first N convolution layers of the first network branch in the deep convolutional neural network model.
In the above design, the first convolution layer and the second convolution layer of the first N convolution layers of the first network branch are used to filter the background features of the face image to be processed in the horizontal direction and the vertical direction, and the third convolution layer and the fourth convolution layer are used to smooth, in the horizontal direction and the vertical direction, the face image from which the background features have been filtered; this improves the accuracy of the face sketch image generation technology and makes the generated sketch image more natural.
In a possible design, the convolution kernel size of the first convolution layer is equal to that of the second convolution layer, and the convolution kernel size of the third convolution layer is the same as that of the fourth convolution layer.
In the above design, the convolution kernel size of the first convolution layer being equal to that of the second convolution layer, and the convolution kernel size of the third convolution layer being the same as that of the fourth convolution layer, improves the accuracy of the face sketch image generation technology.
In a possible design, the method further includes:
acquiring the hair probability that each pixel in the face image is a hair feature point.
The synthesizing the facial structure sketch map and the hair texture sketch map to obtain the sketch image of the face image meets the following formula:
S(i,j) = (1 - P_h(i,j)) × S_S(i,j) + P_h(i,j) × S_t(i,j)
where S(i,j) is the pixel value of the pixel in the i-th row and j-th column of the sketch image of the face image, P_h(i,j) is the hair probability of the pixel in the i-th row and j-th column of the sketch image of the face image, S_S(i,j) is the pixel value of the pixel in the i-th row and j-th column of the facial structure sketch map, and S_t(i,j) is the pixel value of the pixel in the i-th row and j-th column of the hair texture sketch map, i and j both being integers greater than 0.
In the above design, the facial structure sketch map and the hair texture sketch map are synthesized into the sketch image of the face image based on the hair probability, so that the synthesized sketch image not only retains the facial structure information well but also preserves the hair texture information well.
In a possible design, the deep convolutional neural network model is trained in the following manner:
a number of face sample images in a training sample database are input into an initialized deep convolutional neural network model for training, where the training sample database includes a number of face sample images and a sketch sample image corresponding to each face sample image, and the initialized deep convolutional neural network model includes weights and offsets;
in the K-th training process, the background features in the face sample image are filtered through the first N convolution layers of the deep convolutional neural network model that has undergone K-1 adjustments, to obtain a face feature map of the face sample image, K being an integer greater than 0;
the facial sketch features in the face feature map of the face sample image are acquired through the last M convolution layers of the first network branch of the deep convolutional neural network model that has undergone K-1 adjustments, to obtain a facial structure sketch map of the face sample image;
the hair sketch features in the face feature map of the face sample image are acquired through the last M convolution layers of the second network branch of the deep convolutional neural network model that has undergone K-1 adjustments, to obtain a hair texture sketch map of the face sample image;
the facial structure sketch map of the face sample image and the hair texture sketch map of the face sample image are synthesized to obtain a sketch image of the face sample image;
after the K-th training, an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image is acquired; and
the weights and offsets used in the (K+1)-th training process are adjusted based on the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
In the above design, the deep convolutional neural network model is trained with a large number of face sample images, so that generating a sketch image from a face image to be processed no longer depends on a sample database; the sketch image of the face image can be generated directly by the trained deep convolutional neural network model, which improves the accuracy and generalization ability of the face sketch image generation technology, reduces the workload in the face sketch image generation process, and thereby improves the speed of face sketch image generation.
In a possible design, during the Kth training pass, filtering the background features from the face sample image through the first N convolutional layers of the deep convolutional neural network model adjusted K-1 times includes:
adding the pixel values of the pixels at the same positions in the face sample image and a sketch average image, to obtain a face-enhanced image;
wherein the pixel value of any pixel in the sketch average image is the average of the pixel values of the pixels at the same position across all sketch sample images in the training sample database;
and filtering the background features from the face-enhanced image through the first N convolutional layers of the deep convolutional neural network model adjusted K-1 times.
In the above design, obtaining a face-enhanced image by adding the pixel values of the pixels at the same positions in the face sample image and the sketch average image reinforces the facial feature information and the hair feature information of the face sample image, which improves the accuracy of face sketch generation.
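The enhancement step above can be sketched as follows (the flat pixel-list representation and the function names are illustrative assumptions):

```python
def sketch_average(sketch_samples):
    """Average, at each pixel position, over all sketch sample images
    in the training database (each image a flat list of equal length)."""
    n = len(sketch_samples)
    return [sum(img[i] for img in sketch_samples) / n
            for i in range(len(sketch_samples[0]))]

def enhance(face_image, avg_sketch):
    """Face-enhanced image: pixel values at the same position are added."""
    return [f + s for f, s in zip(face_image, avg_sketch)]
```

The enhanced image, rather than the raw sample, is then passed through the first N convolutional layers.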
In a possible design, obtaining the facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch of the deep convolutional neural network model adjusted K-1 times includes:
dividing the face feature map of the face sample image into a number of mutually overlapping image blocks, and selecting from them the image blocks that contain facial feature information;
for each image block containing facial feature information, determining the target region that the block corresponds to in the face feature map of the face sample image, and adding the pixel values of the pixels at the same positions in the target region and in the block, to obtain a face-enhanced feature map;
and, for each face-enhanced feature map, obtaining the facial sketch features in that map through the last M convolutional layers of the first network branch of the deep convolutional neural network model adjusted K-1 times.
In the above design, adding the pixel values of an image block containing facial feature information to those at the same positions in the corresponding target region of the face feature map reinforces the facial feature information of the face sample image, so that the synthesized sketch image preserves the facial structure well.
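The block splitting and target-region addition described above can be sketched as follows (the row-major flat feature map, the block size and stride, and the function names are illustrative assumptions):

```python
def overlapping_blocks(img, h, w, bh, bw, stride):
    """Split an h-by-w feature map (row-major flat list) into mutually
    overlapping bh-by-bw blocks; returns (top, left, block) triples."""
    blocks = []
    for top in range(0, h - bh + 1, stride):
        for left in range(0, w - bw + 1, stride):
            block = [img[(top + r) * w + (left + c)]
                     for r in range(bh) for c in range(bw)]
            blocks.append((top, left, block))
    return blocks

def add_block(feature_map, w, top, left, bh, bw, block):
    """Add a block's pixel values onto its target region in the feature map
    (same positions), yielding an enhanced feature map."""
    out = list(feature_map)
    for r in range(bh):
        for c in range(bw):
            out[(top + r) * w + (left + c)] += block[r * bw + c]
    return out
```

A stride smaller than the block size makes the blocks mutually overlapping, as the design requires.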
In a possible design, obtaining the hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch of the deep convolutional neural network model adjusted K-1 times includes:
dividing the face feature map of the face sample image into a number of mutually overlapping image blocks, and selecting from them the image blocks that contain hair feature information;
for each image block containing hair feature information, adding the pixel values of the pixels at the same positions in the face sample image and in the block, to obtain a hair-enhanced feature map;
and, for each hair-enhanced feature map, obtaining the hair sketch features in that map through the last M convolutional layers of the second network branch of the deep convolutional neural network model adjusted K-1 times.
In the above design, adding the pixel values of an image block containing hair feature information to those at the same positions in the corresponding target region of the face feature map reinforces the hair feature information of the face sample image, so that the synthesized sketch image preserves the hair texture well.
In a possible design, selecting the image blocks containing facial feature information from the mutually overlapping image blocks includes:
for each of the mutually overlapping image blocks, determining, for every pixel in the block, the face probability that the pixel is a facial feature point; and, when the number of pixels whose face probability is non-zero exceeds a preset threshold, determining that the block is an image block containing facial feature information.
In the above design, determining for every pixel in each image block the face probability that it is a facial feature point, and then treating a block as containing facial feature information when the number of pixels with non-zero face probability exceeds a preset threshold, improves the accuracy of selecting the image blocks that contain facial feature information.
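The threshold test above reduces to a one-line predicate (the function name and the per-pixel probability-list representation are illustrative assumptions; the same predicate applies to hair probabilities):

```python
def contains_feature(block_probs, threshold):
    """A block counts as containing feature information (facial or hair)
    when the number of pixels with non-zero feature probability exceeds
    the preset threshold; block_probs holds one probability per pixel."""
    return sum(1 for p in block_probs if p > 0) > threshold
```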
In a possible design, selecting the image blocks containing hair feature information from the mutually overlapping image blocks includes:
for each of the mutually overlapping image blocks, determining, for every pixel in the block, the hair probability that the pixel is a hair feature point; and, when the number of pixels whose hair probability is non-zero exceeds a preset threshold, determining that the block is an image block containing hair feature information.
In the above design, determining for every pixel in each image block the hair probability that it is a hair feature point, and then treating a block as containing hair feature information when the number of pixels with non-zero hair probability exceeds a preset threshold, improves the accuracy of selecting the image blocks that contain hair feature information.
In a second aspect, an embodiment of the present application provides an apparatus for generating a sketch image, including:
an obtaining module, configured to obtain a face image to be processed;
a deep convolutional neural network model, configured to obtain a facial structure sketch and a hair texture sketch from the face image obtained by the obtaining module, the deep convolutional neural network model being pre-trained and including a first network branch module and a second network branch module;
wherein the first network branch module is configured to obtain the facial sketch features in the face image obtained by the obtaining module, yielding a facial structure sketch, and includes P convolutional layers, where P is an integer greater than 0;
the second network branch module is configured to obtain the hair sketch features in the face image obtained by the obtaining module, yielding a hair texture sketch, and also includes P convolutional layers;
and a synthesis module, configured to synthesize the facial structure sketch obtained by the first network branch module and the hair texture sketch obtained by the second network branch module into a sketch image of the face image.
In a possible design, the first N of the P convolutional layers included in the first network branch module are identical to, or shared with, the first N of the P convolutional layers included in the second network branch module, where N is an integer greater than 0 and less than P.
In a possible design, the first network branch module is specifically configured to:
filter the background features from the face image through its first N convolutional layers, to obtain a face feature map;
and obtain the facial sketch features in the face feature map through its last M convolutional layers;
and the second network branch module is specifically configured to:
filter the background features from the face image through the first N convolutional layers of the second network branch of the deep convolutional neural network model, to obtain a face feature map;
and obtain the hair sketch features in the face feature map through the last M convolutional layers of the second network branch;
where P = M + N.
In a possible design, the convolution kernel sizes of the last M convolutional layers of the first network branch module are correspondingly equal to those of the last M convolutional layers of the second network branch module.
In a possible design, N is 4, and the first network branch module, when filtering the background features from the face image through its first N convolutional layers, is specifically configured to:
filter the horizontal and vertical background features of the face image through the first and second of its first N convolutional layers;
and smooth the background-filtered face image in the horizontal and vertical directions through the third and fourth of its first N convolutional layers.
In a possible design, the convolution kernel size of the first convolutional layer equals that of the second convolutional layer, and the convolution kernel size of the third convolutional layer equals that of the fourth convolutional layer.
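The text does not specify the kernels of these four layers, but the paired horizontal-then-vertical filtering pattern can be sketched with separable 1-D filters (the valid-mode filtering helpers, the kernel choices, and the function names are illustrative assumptions):

```python
def filter1d(row, kernel):
    """Valid-mode 1-D filtering of a row with a kernel (no padding)."""
    k = len(kernel)
    return [sum(row[i + j] * kernel[j] for j in range(k))
            for i in range(len(row) - k + 1)]

def filter_horizontal(img, kernel):
    """Apply a 1-by-k kernel along each row (horizontal direction)."""
    return [filter1d(row, kernel) for row in img]

def filter_vertical(img, kernel):
    """Apply a k-by-1 kernel along each column (vertical direction)."""
    cols = [list(col) for col in zip(*img)]
    out_cols = [filter1d(col, kernel) for col in cols]
    return [list(row) for row in zip(*out_cols)]
```

For instance, a difference kernel such as [-1, 1] suppresses flat background regions along one direction, while an averaging kernel such as [1/3, 1/3, 1/3] performs the smoothing described for the third and fourth layers.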
In a possible design, the obtaining module is further configured to obtain, for each pixel of the face image, the hair probability that the pixel is a hair feature point;
and the synthesis module is specifically configured to:
synthesize the facial structure sketch obtained by the first network branch module and the hair texture sketch obtained by the second network branch module into a sketch image of the face image, in accordance with the following formula:
S(i,j) = (1 - Ph(i,j)) × SS(i,j) + Ph(i,j) × St(i,j)
where S(i,j) is the pixel value of the pixel in row i, column j of the sketch image of the face image; Ph(i,j) is the hair probability of the pixel in row i, column j; SS(i,j) is the pixel value of the pixel in row i, column j of the facial structure sketch; St(i,j) is the pixel value of the pixel in row i, column j of the hair texture sketch; and i and j are both integers greater than 0.
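The per-pixel blending formula can be sketched directly (the 2-D list representation and the function name are illustrative assumptions):

```python
def blend_sketch(structure, texture, hair_prob):
    """Combine the facial structure sketch SS and the hair texture sketch St
    into the final sketch S, per pixel:
        S(i,j) = (1 - Ph(i,j)) * SS(i,j) + Ph(i,j) * St(i,j)
    All three inputs are equally sized 2-D lists; hair_prob holds the
    per-pixel hair probability Ph in [0, 1]."""
    return [[(1 - p) * s + p * t
             for s, t, p in zip(srow, trow, prow)]
            for srow, trow, prow in zip(structure, texture, hair_prob)]
```

Pixels with hair probability 0 come entirely from the structure sketch, pixels with probability 1 entirely from the texture sketch, and intermediate probabilities interpolate between the two.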
In a possible design, the apparatus further includes:
a training module, configured to train the deep convolutional neural network model as follows:
inputting a number of face sample images from a training sample database into an initialized deep convolutional neural network model for training, where the training sample database includes a number of face sample images and a sketch sample image corresponding to each face sample image, and the initialized deep convolutional neural network model includes weights and biases;
during the Kth training pass, filtering the background features from the face sample image through the first N convolutional layers of the deep convolutional neural network model adjusted K-1 times, to obtain a face feature map of the face sample image, where K is an integer greater than 0;
obtaining, through the last M convolutional layers of the first network branch module of the deep convolutional neural network model adjusted K-1 times, the facial sketch features in the face feature map of the face sample image, to obtain a facial structure sketch of the face sample image;
obtaining, through the last M convolutional layers of the second network branch module of the deep convolutional neural network model adjusted K-1 times, the hair sketch features in the face feature map of the face sample image, to obtain a hair texture sketch of the face sample image;
synthesizing the facial structure sketch of the face sample image and the hair texture sketch of the face sample image to obtain a sketch image of the face sample image;
after the Kth training pass, obtaining an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image;
and adjusting, based on that error value, the weights and biases used in the (K+1)th training pass.
In a possible design, the training module, when filtering the background features from the face sample image during the Kth training pass through the first N convolutional layers of the deep convolutional neural network model adjusted K-1 times, is specifically configured to:
add the pixel values of the pixels at the same positions in the face sample image and a sketch average image, to obtain a face-enhanced image;
wherein the pixel value of any pixel in the sketch average image is the average of the pixel values of the pixels at the same position across all sketch sample images in the training sample database;
and filter the background features from the face-enhanced image through the first N convolutional layers of the deep convolutional neural network model adjusted K-1 times.
In a possible design, the obtaining module is further configured to divide the face sample image into a number of mutually overlapping image blocks and to select from them the image blocks containing facial feature information;
and the training module, when obtaining the facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch module of the deep convolutional neural network model adjusted K-1 times, is specifically configured to:
for each image block containing facial feature information obtained by the obtaining module, determine the target region that the block corresponds to in the face feature map of the face sample image, and add the pixel values of the pixels at the same positions in the target region and in the block, to obtain a face-enhanced feature map;
and, for each face-enhanced feature map, obtain the facial sketch features in that map through the last M convolutional layers of the first network branch module of the deep convolutional neural network model adjusted K-1 times.
In a possible design, the obtaining module is further configured to divide the face sample image into a number of mutually overlapping image blocks and to select from them the image blocks containing hair feature information;
and the training module, when obtaining the hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch module of the deep convolutional neural network model adjusted K-1 times, is specifically configured to:
for each image block containing hair feature information obtained by the obtaining module, add the pixel values of the pixels at the same positions in the face sample image and in the block, to obtain a hair-enhanced feature map;
and, for each hair-enhanced feature map, obtain the hair sketch features in that map through the last M convolutional layers of the second network branch module of the deep convolutional neural network model adjusted K-1 times.
In a possible design, the obtaining module, when selecting the image blocks containing facial feature information from the mutually overlapping image blocks, is specifically configured to:
for each of the mutually overlapping image blocks, determine, for every pixel in the block, the face probability that the pixel is a facial feature point; and, when the number of pixels whose face probability is non-zero exceeds a preset threshold, determine that the block is an image block containing facial feature information.
In a possible design, the obtaining module, when selecting the image blocks containing hair feature information from the mutually overlapping image blocks, is specifically configured to:
for each of the mutually overlapping image blocks, determine, for every pixel in the block, the hair probability that the pixel is a hair feature point; and, when the number of pixels whose hair probability is non-zero exceeds a preset threshold, determine that the block is an image block containing hair feature information.
In the embodiments of the present application, a deep convolutional neural network is designed with a first network branch for generating the facial features and a second network branch for generating the hair features; effective feature representations are learned from a large number of training samples, and a network model is trained that can turn an original image into an accurate and natural face sketch image, achieving automatic face sketch generation. Compared with the synthesis-based automatic face sketch generation techniques of the prior art, generating face sketch images with a deep convolutional neural network no longer depends on a sample database: the first network branch generates a structure sketch containing the facial features, the second network branch generates a texture sketch containing the hair features, and the structure sketch and the texture sketch are then combined into the final face sketch image. This improves the accuracy and generalization ability of face sketch generation and reduces the workload of the generation process, thereby increasing the speed at which face sketch images are produced.
In a third aspect, an embodiment of the present invention further provides a deep convolutional neural network model, including a first network branch module and a second network branch module;
wherein the first network branch module includes P convolutional layers and is configured to obtain the facial sketch features in the face image obtained by the obtaining module, yielding a facial structure sketch, where P is an integer greater than 0;
and the second network branch module includes P convolutional layers and is configured to obtain the hair sketch features in the face image obtained by the obtaining module, yielding a hair texture sketch.
In a fourth aspect, an embodiment of the present application further provides a terminal including a processor and a memory, where the memory is configured to store a software program and the processor is configured to read the software program stored in the memory and implement the method provided by the first aspect or any design of the first aspect. The electronic device may be a mobile terminal, a computer, or the like.
In a fifth aspect, an embodiment of the present application further provides a computer storage medium storing a software program that, when read and executed by one or more processors, implements the method provided by the first aspect or any design of the first aspect.
FIG. 1 is a schematic flowchart of a method for generating a sketch image according to an embodiment of the present application;
FIG. 2A is a schematic structural diagram of a first deep convolutional neural network model according to an embodiment of the present application;
FIG. 2B is a schematic structural diagram of another first deep convolutional neural network model according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method for filtering background features from a face image according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a second deep convolutional neural network model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a sketch image generation process according to an embodiment of the present application;
FIG. 6A shows four face images to be processed according to an embodiment of the present application;
FIG. 6B shows the sketch images generated from the four face images to be processed according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of the first deep convolutional neural network model according to an embodiment of the present application;
FIG. 8 is a schematic flowchart of a training process of the first deep convolutional neural network model according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an image block addition method according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an apparatus for generating a sketch image according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a deep convolutional neural network model according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a terminal implementation according to an embodiment of the present application.
The embodiments of the present application are described in further detail below with reference to the accompanying drawings.
The embodiments of the present application provide a method and an apparatus for generating a sketch image, to address the problems of prior-art automatic face sketch generation: low accuracy, poor generalization ability, and slow sketch generation. The method and the apparatus are based on the same inventive concept; since they solve the problem on similar principles, the implementations of the apparatus and the method may refer to each other, and repeated descriptions are omitted.
The embodiments of the present application may be applied to electronic devices such as computers, tablets, notebooks, smartphones, and servers.
The application fields of the embodiments include, but are not limited to, face images, vehicle images, plant images, and other types of images.
Correspondingly, when applied to the face image field to generate face sketch images, a number of face sample images are used for training in advance; when applied to the vehicle image field to generate vehicle sketch images, a number of vehicle sample images are used for training in advance; when applied to the plant image field to generate plant sketch images, a number of plant sample images are used for training in advance; and when applied to other image fields to generate other types of sketch images, a number of sample images of the corresponding type are used for training in advance.
In addition to generating sketch images, the embodiments of the present application may also be used to generate grayscale images.
Correspondingly, when applied to the face image field to generate face grayscale images, a number of face sketch sample images are used for training in advance; when applied to the vehicle image field to generate vehicle grayscale images, a number of vehicle sketch sample images are used for training in advance; when applied to the plant image field to generate plant grayscale images, a number of plant sketch sample images are used for training in advance; and when applied to other image fields to generate other types of grayscale images, a number of sketch sample images of the corresponding type are used for training in advance.
To make the embodiments of the present application easier to understand, some of the terms involved in the embodiments are first explained below for those skilled in the art; these explanations should not be regarded as limiting the scope of protection claimed by the present application.
A convolutional neural network is a multi-layer neural network in which each layer consists of multiple two-dimensional planes, and each plane consists of multiple independent neurons. In the embodiments of the present application, a neuron may be regarded as a single pixel.
"A number of" means two or more.
In addition, it should be understood that in the description of the present application, terms such as "first" and "second" are used only to distinguish between the items being described, and are not to be understood as indicating or implying relative importance or order.
Referring to FIG. 1, which is a flowchart of a method for generating a sketch image according to an embodiment of the present application, the method is performed by an electronic device and may specifically include the following steps:
Step S101: Acquire a face image to be processed.
It should be noted that in step S101, the manner of acquiring the face image to be processed includes, but is not limited to, capturing the face image to be processed through a sensing device, retrieving the face image to be processed from a database, and the like.
The sensing device includes, but is not limited to, a light sensing device, a camera device, an acquisition device, and the like.
The database includes, but is not limited to, a local database, a cloud database, a USB flash drive, a hard disk, and the like.
Step S102: Obtain facial sketch features from the face image through P convolutional layers of a first network branch in a pre-trained first deep convolutional neural network model, to obtain a facial structure sketch map, where P is an integer greater than 0.
Step S103: Obtain hair sketch features from the face image through P convolutional layers of a second network branch in the first deep convolutional neural network model, to obtain a hair texture sketch map.
Step S104: Synthesize the facial structure sketch map and the hair texture sketch map to obtain a sketch image of the face image.
It should be noted that steps S102 and S103 are not strictly ordered: step S102 may be performed before step S103, step S103 may be performed before step S102, or steps S102 and S103 may be performed simultaneously. This is not specifically limited in the embodiments of the present application.
In the embodiments of the present application, based on a deep convolutional neural network, a structure is designed that includes a first network branch for generating face features and a second network branch for generating hair features. Effective feature representations are learned from a large number of training samples, and a network model capable of generating an accurate and natural face sketch image from an original image is trained, thereby realizing automatic generation of face sketch images. Compared with prior-art synthesis-based techniques for automatically generating face sketch images, the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database. Instead, the first network branch of the deep convolutional neural network generates a structure sketch map including face features, the second network branch generates a texture sketch map including hair features, and the structure sketch map and the texture sketch map are then synthesized to obtain the final face sketch image. This improves the accuracy and generalization ability of the face sketch image generation technique, and reduces the workload in the face sketch image generation process, thereby increasing the speed of face sketch image generation.
In the embodiments of the present application, the first deep convolutional neural network model may further include an input layer before the P convolutional layers of the first network branch and the P convolutional layers of the second network branch, and the input layer has 3 filter channels. After acquiring the face image to be processed, the electronic device processes the face image through the input layer to obtain three images: an image of the red (R) component, an image of the green (G) component, and an image of the blue (B) component. The R-component image, the G-component image, and the B-component image are then input to the first convolutional layer. The first deep convolutional neural network model may also extract component features separately for the luminance-chrominance (YUV) components to generate images.
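The input-layer behaviour described above (one face image in, three single-component images out) can be sketched as follows; the H×W×3 array layout and the RGB channel order are assumptions made for illustration, not requirements of the embodiment:

```python
import numpy as np

def split_rgb(face_image):
    """Split an H x W x 3 RGB face image into three single-component
    images, mirroring the 3 filter channels of the input layer."""
    r = face_image[:, :, 0]  # red component image
    g = face_image[:, :, 1]  # green component image
    b = face_image[:, :, 2]  # blue component image
    return r, g, b

# toy 2x2 face image
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
r, g, b = split_rgb(img)
print(r.shape)  # (2, 2)
```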
Each convolutional layer in the first deep convolutional neural network model may use Rectified Linear Units (ReLU) as the activation function.
In the embodiments of the present application, the convolution (Conv) kernel size used by each convolutional layer in the first deep convolutional neural network model may be A×B, where A and B are both positive integers; A and B may or may not be equal, which is not specifically limited in the embodiments of the present application.
It should be noted that the input and output of each convolutional layer in the first deep convolutional neural network model comprise one or more feature maps, and the number of output feature maps is related to the number of input feature maps and the number of filter channels. For example, when a face image is input and passes through the 3 filter channels of the input layer, 3 feature maps are obtained.
The first network branch and the second network branch in the embodiments of the present application may be two independent branches, as shown in FIG. 2A; alternatively, the first network branch and the second network branch may share the first N convolutional layers of the first deep convolutional neural network model, as shown in FIG. 2B. In the case where the first network branch and the second network branch are two independent branches, the first N convolutional layers of the first network branch are the same as the first N convolutional layers of the second network branch, where N is an integer greater than 0 and less than P.
Regardless of whether the first network branch and the second network branch are two independent branches or share the first N convolutional layers, the first N convolutional layers of the first network branch and the first N convolutional layers of the second network branch are both used to filter out background features from the face image, to obtain a face feature map. Referring to FIG. 3, and taking N = 4 as an example, the process of filtering out background features from the face image through the first N convolutional layers to obtain the face feature map is described in detail below:
S301: Filter out background features of the face image in the horizontal direction and the vertical direction through a first convolutional layer and a second convolutional layer among the first N convolutional layers of the first network branch in the first deep convolutional neural network model.
The convolution kernel size of the first convolutional layer is equal to the convolution kernel size of the second convolutional layer.
In step S301, the first convolutional layer may be a convolutional layer for filtering out background features of the face image in the horizontal direction, with the second convolutional layer being a convolutional layer for filtering out background features of the face image in the vertical direction; alternatively, the first convolutional layer may be a convolutional layer for filtering out background features of the face image in the vertical direction, with the second convolutional layer being a convolutional layer for filtering out background features of the face image in the horizontal direction. This is not specifically limited in the embodiments of the present application. That is, in the embodiments of the present application, the order in which the horizontal-direction and vertical-direction background features of the face image are filtered out is not specifically limited.
S302: Perform smoothing in the horizontal direction and the vertical direction on the face image from which the background features have been filtered, through a third convolutional layer and a fourth convolutional layer among the first N convolutional layers of the first network branch in the first deep convolutional neural network model.
In step S302, the convolution kernel size of the third convolutional layer is equal to the convolution kernel size of the fourth convolutional layer. The third convolutional layer may be a convolutional layer for smoothing, in the horizontal direction, the face image from which the background features have been filtered, with the fourth convolutional layer being a convolutional layer for smoothing that image in the vertical direction; alternatively, the third convolutional layer may smooth in the vertical direction and the fourth convolutional layer in the horizontal direction. This is not specifically limited in the embodiments of the present application. That is, in the embodiments of the present application, whether smoothing is performed first in the horizontal direction or first in the vertical direction is not specifically limited.
In the embodiments of the present application, the first N convolutional layers of the first network branch are the same as, or shared with, the first N convolutional layers of the second network branch, which improves the computational efficiency of the first deep convolutional neural network model. Moreover, using the first and second convolutional layers among the first N convolutional layers of the first network branch to filter out background features of the face image to be processed in the horizontal and vertical directions, and the third and fourth convolutional layers to smooth the background-filtered face image in the horizontal and vertical directions, improves the accuracy of the face sketch image generation technique and makes the generated sketch image more natural.
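The direction-specific filtering and smoothing performed by these layers can be illustrated with one-dimensional kernels: a 1×k kernel acts only along the horizontal direction and a k×1 kernel only along the vertical direction. The averaging kernels and the 3-tap size below are illustrative assumptions (in the embodiment the kernel weights are learned during training):

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive 2-D cross-correlation with 'same' zero padding, as used in
    CNN convolutional layers; written out explicitly for illustration."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# a 1x3 averaging kernel smooths only along the horizontal direction,
# a 3x1 averaging kernel smooths only along the vertical direction
horizontal_kernel = np.full((1, 3), 1.0 / 3.0)
vertical_kernel = np.full((3, 1), 1.0 / 3.0)

image = np.zeros((5, 5))
image[2, 2] = 9.0  # a single bright pixel

h_smoothed = conv2d_same(image, horizontal_kernel)
v_smoothed = conv2d_same(h_smoothed, vertical_kernel)
print(v_smoothed[2, 2])  # energy spread evenly over a 3x3 neighbourhood: 1.0
```

Applying the two 1-D kernels in sequence is equivalent to one separable 3×3 smoothing, which mirrors the horizontal-then-vertical (or vertical-then-horizontal) ordering left open by the embodiment.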
Optionally, the number of filter channels of the first convolutional layer is a, the number of filter channels of the second convolutional layer is b, the number of filter channels of the third convolutional layer is c, and the number of filter channels of the fourth convolutional layer is d, where a and b may each be a positive integer greater than or equal to 100 and less than or equal to 200, with a equal to b, and c and d may each be a positive integer greater than or equal to 1 and less than or equal to 100, with c equal to d. The specific number of filter channels of each convolutional layer is not specifically limited in the embodiments of the present application.
After the background features in the face image are filtered out through the first N convolutional layers of the first network branch to obtain the face feature map, facial sketch features are obtained from the face feature map through the last M convolutional layers of the first network branch, to obtain the facial structure sketch map, where P = M + N. The specific values of M and N are not specifically limited in the embodiments of the present application. Taking N = 4 and M = 2 as an example, the horizontal-direction and vertical-direction facial sketch features in the face feature map may be obtained through the fifth convolutional layer and the sixth convolutional layer of the first network branch, to obtain the facial structure sketch map.
The convolution kernel size of the fifth convolutional layer of the first network branch is equal to the convolution kernel size of the sixth convolutional layer. The fifth convolutional layer of the first network branch may be a convolutional layer for obtaining the horizontal-direction facial sketch features in the face feature map, with the sixth convolutional layer being a convolutional layer for obtaining the vertical-direction facial sketch features; alternatively, the fifth convolutional layer may obtain the vertical-direction facial sketch features and the sixth convolutional layer the horizontal-direction facial sketch features. This is not specifically limited in the embodiments of the present application. That is, the order in which the horizontal-direction and vertical-direction facial sketch features in the face feature map are obtained is not specifically limited.
After the background features in the face image are filtered out through the first N convolutional layers of the second network branch to obtain the face feature map, hair sketch features are obtained from the face feature map through the last M convolutional layers of the second network branch, to obtain the hair texture sketch map. Taking N = 4 and M = 2 as an example, the horizontal-direction and vertical-direction hair sketch features in the face feature map are obtained through the fifth convolutional layer and the sixth convolutional layer of the second network branch, to obtain the hair texture sketch map.
The convolution kernel size of the fifth convolutional layer of the second network branch is equal to the convolution kernel size of the sixth convolutional layer. The fifth convolutional layer of the second network branch may be a convolutional layer for obtaining the horizontal-direction hair sketch features in the face feature map, with the sixth convolutional layer being a convolutional layer for obtaining the vertical-direction hair sketch features; alternatively, the fifth convolutional layer may obtain the vertical-direction hair sketch features and the sixth convolutional layer the horizontal-direction hair sketch features. This is not specifically limited in the embodiments of the present application. That is, the order in which the horizontal-direction and vertical-direction hair sketch features in the face feature map are obtained is not specifically limited.
Optionally, the convolution kernel sizes of the last M convolutional layers of the first network branch are correspondingly equal to the convolution kernel sizes of the last M convolutional layers of the second network branch. Taking N = 4 and M = 2 as an example, the convolution kernel size of the fifth convolutional layer of the first network branch is equal to that of the fifth convolutional layer of the second network branch, and the convolution kernel size of the sixth convolutional layer of the first network branch is equal to that of the sixth convolutional layer of the second network branch.
Optionally, the number of filter channels of each of the following four convolutional layers may be 1: the fifth and sixth convolutional layers of the first network branch, and the fifth and sixth convolutional layers of the second network branch.
In a possible implementation, when the facial structure sketch map and the hair texture sketch map are synthesized to obtain the sketch image of the face image, the synthesis may be based on the hair probability of each pixel being a hair feature point. Specifically, before synthesizing the facial structure sketch map and the hair texture sketch map into the sketch image of the face image, the hair probability of each pixel in the face image being a hair feature point may first be obtained. After the facial structure sketch map and the hair texture sketch map are obtained by the above method, they are synthesized into the sketch image of the face image according to the following formula:
S(i,j) = (1 - P_h(i,j)) × S_s(i,j) + P_h(i,j) × S_t(i,j)
where S(i,j) is the pixel value of the pixel in row i, column j of the sketch image of the face image; P_h(i,j) is the hair probability of the pixel in row i, column j of the face image; S_s(i,j) is the pixel value of the pixel in row i, column j of the facial structure sketch map; S_t(i,j) is the pixel value of the pixel in row i, column j of the hair texture sketch map; and i and j are both integers greater than 0.
In the above implementation, by synthesizing the facial structure sketch map and the hair texture sketch map into the sketch image of the face image based on the hair probability, the synthesized sketch image not only preserves the facial structure information well, but also preserves the hair texture information well.
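The per-pixel blend defined by the formula above can be sketched directly in NumPy; the 2×2 arrays are toy grayscale data chosen only to make the three cases (pure face, pure hair, mixed) visible:

```python
import numpy as np

def synthesize_sketch(structure_sketch, texture_sketch, hair_prob):
    """Per-pixel blend: S = (1 - P_h) * S_s + P_h * S_t."""
    return (1.0 - hair_prob) * structure_sketch + hair_prob * texture_sketch

S_s = np.array([[200.0, 180.0],
                [190.0, 170.0]])   # facial structure sketch map
S_t = np.array([[ 60.0,  50.0],
                [ 40.0,  30.0]])   # hair texture sketch map
P_h = np.array([[0.0, 1.0],
                [0.5, 0.0]])       # hair probability of each pixel

S = synthesize_sketch(S_s, S_t, P_h)
print(S)
# pixel (0,0) keeps the structure value (200), pixel (0,1) takes the
# texture value (50), pixel (1,0) is an even mix (115)
```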
Optionally, the hair probability of each pixel in the face image is obtained through a second deep convolutional neural network model.
For example, as shown in FIG. 4, the second deep convolutional neural network model may include 7 connection layers. The first, second, and third connection layers each include a convolutional layer with ReLU as the activation function and a convolution (Conv) kernel size of 5×5, a pooling layer with a kernel size of 3×3, and a Local Response Normalization (LRN) layer. The fourth connection layer includes a convolutional layer with ReLU as the activation function and a Conv kernel size of 3×3; the fifth connection layer includes a convolutional layer with ReLU as the activation function and a Conv kernel size of 3×3; the sixth connection layer includes a convolutional layer with ReLU as the activation function and a Conv kernel size of 1×1; and the seventh connection layer includes a convolutional layer with ReLU as the activation function and a Conv kernel size of 1×1. The second deep convolutional neural network model may be trained in advance on sample images from the Helen dataset.
The first, second, and third connection layers are used to obtain the hair features, facial features, and background features of the face image. The fourth and fifth connection layers are used to obtain, from the face image with the hair, facial, and background features extracted, the facial contour features, hair contour features, and background contour features in the horizontal and vertical directions. The sixth and seventh connection layers are used to smooth, in the horizontal and vertical directions, the face image from which the facial contour features, hair contour features, and background contour features have been obtained.
When the hair probability of each pixel in the face image is obtained through the second deep convolutional neural network model, for each pixel in the face image: when the pixel is located in the region covered by the hair contour, its hair probability is 1 and its face probability and background probability are both 0; when the pixel is located in the region covered by the facial contour, its face probability is 1 and its hair probability and background probability are both 0; and when the pixel is located in the region covered by the background contour, its background probability is 1 and its hair probability and face probability are both 0. Pixels within the region covered by the facial contour are facial feature points, pixels within the region covered by the hair contour are hair feature points, and pixels within the region covered by the background contour are background feature points.
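The region-based probability assignment described above amounts to converting a three-class segmentation of the face image into one-hot probability maps. A minimal sketch, assuming the segmentation is available as an integer label map with 0 = background, 1 = face, 2 = hair (the label encoding is an assumption for illustration):

```python
import numpy as np

def label_map_to_probs(labels):
    """Return (background, face, hair) probability maps; each pixel gets
    probability 1 for its own region and 0 for the other two."""
    background_prob = (labels == 0).astype(float)
    face_prob = (labels == 1).astype(float)
    hair_prob = (labels == 2).astype(float)
    return background_prob, face_prob, hair_prob

labels = np.array([[2, 2, 0],
                   [1, 1, 0]])
bg, face, hair = label_map_to_probs(labels)
print(hair)
# [[1. 1. 0.]
#  [0. 0. 0.]]
```

The resulting `hair` map is exactly the P_h used by the synthesis formula.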
For a better understanding of the embodiments of the present application, the process of generating a sketch image is illustrated below with P = 4 as an example, referring to FIG. 5:
The face image is input to the first deep convolutional neural network model; the facial structure sketch map of the face image is obtained through the 4 convolutional layers of the first network branch of the first deep convolutional neural network model, and the hair texture sketch map of the face image is obtained through the 4 convolutional layers of the second network branch. Then, the facial part of the facial structure sketch map is obtained according to the hair probability of each pixel being a hair feature point, and the hair part of the hair texture sketch map is obtained according to the hair probability of each pixel being a hair feature point. Finally, the facial part and the hair part are synthesized into the sketch image of the face image.
FIG. 6A shows four face images to be processed, and FIG. 6B shows the effect of generating sketch images for the four face images shown in FIG. 6A using the sketch generation method provided in the embodiments of the present application, that is, the sketch images obtained by processing each of the four face images shown in FIG. 6A with the first deep convolutional neural network model.
The first deep convolutional neural network model used in the embodiments of the present application may be obtained in advance by training an initialized first deep convolutional neural network model on face sample images from a training sample database, where the training sample database includes several face sample images and a sketch sample image corresponding to each face sample image. The initialized first deep convolutional neural network model may include weights and biases, or may include only weights, with the biases set to 0.
Next, taking the first deep convolutional neural network model shown in FIG. 7 as an example, the training process of the first deep convolutional neural network model is described in detail. The first deep convolutional neural network model shown in FIG. 7 includes: the first four convolutional layers of the first deep convolutional neural network model, shared by the first network branch and the second network branch, namely a first convolutional layer with ReLU as the activation function and a Conv kernel size of 5×5, a second convolutional layer with ReLU as the activation function and a Conv kernel size of 5×5, a third convolutional layer with ReLU as the activation function and a Conv kernel size of 1×1, and a fourth convolutional layer with ReLU as the activation function and a Conv kernel size of 1×1; the last two convolutional layers of the first network branch, namely the fifth and sixth convolutional layers of the first network branch, each with ReLU as the activation function and a Conv kernel size of 3×3; and the last two convolutional layers of the second network branch, namely the fifth and sixth convolutional layers of the second network branch, each with ReLU as the activation function and a Conv kernel size of 3×3. The above convolution kernel sizes are merely examples and do not specifically limit the configuration of convolution kernel sizes in the present application; the sizes of the convolution kernels are not specifically limited in the embodiments of the present application. The training process of the first deep convolutional neural network model is shown in FIG. 8:
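The FIG. 7 configuration can be summarized programmatically to show how the shared trunk and the two branches fit together. Everything except the kernel sizes is an assumption here: the channel counts a = b = 128 and c = d = 64 are sample values chosen from the optional ranges stated earlier, and the parameter-count formula is the standard one for convolutional layers:

```python
# Each entry: (name, kernel_h, kernel_w, in_channels, out_channels).
# Kernel sizes follow FIG. 7; channel counts are illustrative assumptions
# (a = b = 128 for the first two shared layers, c = d = 64 for the next
# two, 1 output channel for each branch layer).
shared = [
    ("conv1", 5, 5,   3, 128),
    ("conv2", 5, 5, 128, 128),
    ("conv3", 1, 1, 128,  64),
    ("conv4", 1, 1,  64,  64),
]
branch = [  # same shape for the first and the second network branch
    ("conv5", 3, 3, 64, 1),
    ("conv6", 3, 3,  1, 1),
]

def n_params(layers):
    """Parameters of a conv layer: kh*kw*cin*cout weights plus cout biases."""
    return sum(kh * kw * cin * cout + cout
               for _, kh, kw, cin, cout in layers)

total = n_params(shared) + 2 * n_params(branch)  # trunk shared, two branches
print(n_params(shared), n_params(branch), total)
```

With these sample values the shared trunk dominates the parameter count, which illustrates why sharing the first N convolutional layers improves the model's computational efficiency.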
S801: Input several face sample images from the training sample database into the initialized first deep convolutional neural network model for training.
Optionally, the weights of the initialized first deep convolutional neural network model are configured to follow a Gaussian distribution with a mean of 0 and a variance of 0.01, and the biases are configured to be 0.
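This optional initialization can be sketched as follows. Note that a variance of 0.01 corresponds to a standard deviation of sqrt(0.01) = 0.1; the layer shape and the 128-channel count used below are illustrative assumptions:

```python
import numpy as np

def init_conv_layer(kernel_h, kernel_w, in_channels, out_channels, rng):
    """Weights ~ N(0, 0.01), i.e. std = sqrt(0.01) = 0.1; biases = 0."""
    weights = rng.normal(loc=0.0, scale=0.1,
                         size=(out_channels, in_channels, kernel_h, kernel_w))
    biases = np.zeros(out_channels)
    return weights, biases

rng = np.random.default_rng(0)
# e.g. a 5x5 first convolutional layer taking the 3 input channels;
# the 128 output channels are an assumed value within the optional range
w, b = init_conv_layer(5, 5, 3, 128, rng)
print(w.shape, b.shape)   # (128, 3, 5, 5) (128,)
print(float(w.var()))     # sample variance close to 0.01
```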
S802: In the K-th training iteration, add the pixel values of pixels at the same positions in the face sample image and a sketch average map, to obtain a face-enhanced image.
The pixel value of any pixel in the sketch average map is the average of the pixel values of the pixels at the same position in all sketch sample images in the training sample database.
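The sketch average map of step S802 and the resulting face-enhanced image can be sketched with toy 2×2 grayscale arrays (any clipping or normalization of the sum is left unspecified by the embodiment and is omitted here):

```python
import numpy as np

# all sketch sample images in the training sample database (toy data)
sketch_samples = np.array([
    [[100.0, 120.0], [ 80.0,  60.0]],
    [[140.0, 100.0], [120.0, 100.0]],
])

# pixel value at each position = average over all sketch sample images
sketch_average = sketch_samples.mean(axis=0)
print(sketch_average)    # [[120. 110.] [100.  80.]]

face_sample = np.array([[30.0, 40.0], [50.0, 60.0]])
# face-enhanced image: element-wise sum of face sample and sketch average
face_enhanced = face_sample + sketch_average
print(face_enhanced)     # [[150. 150.] [150. 140.]]
```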
S803: Filter out background features of the face-enhanced image in the horizontal and vertical directions through the first convolutional layer and the second convolutional layer of the first deep convolutional neural network model that has been adjusted K-1 times.
S804: Perform smoothing in the horizontal and vertical directions on the face-enhanced image from which the background features have been filtered, through the third convolutional layer and the fourth convolutional layer of the first deep convolutional neural network model that has been adjusted K-1 times.
S805: Divide the face sample image into several mutually overlapping image blocks, and obtain, from the several mutually overlapping image blocks, image blocks including facial feature information and image blocks including hair feature information.
The number of image blocks including facial feature information is H, and the number of image blocks including hair feature information is Q, where H and Q are both positive integers.
Optionally, in step S805, obtaining the image blocks that include facial feature information from the mutually overlapping image blocks may be, but is not limited to being, implemented in either of the following ways:
Implementation 1: For each of the mutually overlapping image blocks, determine the facial probability that each pixel in the image block is a facial feature point; when the number of pixels whose facial probability is not 0 is greater than a preset threshold, determine that the image block is an image block including facial feature information.
Implementation 2: Obtain the image blocks including facial feature information from the mutually overlapping image blocks by means of feature recognition. Feature recognition methods may include feature recognition based on local histograms, feature recognition based on binarized histograms, and so on, which are not specifically limited in the embodiments of this application.
Optionally, in step S805, obtaining the image blocks that include hair feature information from the mutually overlapping image blocks may be, but is not limited to being, implemented in either of the following ways:
Implementation 1: For each of the mutually overlapping image blocks, determine the hair probability that each pixel in the image block is a hair feature point; when the number of pixels whose hair probability is not 0 is greater than a preset threshold, determine that the image block is an image block including hair feature information.
Implementation 2: Obtain the image blocks including hair feature information from the mutually overlapping image blocks by means of feature recognition. Feature recognition methods may include feature recognition based on local histograms, feature recognition based on binarized histograms, and so on, which are not specifically limited in the embodiments of this application.
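Implementation 1 applies the same counting rule for both the facial and the hair blocks; a sketch, assuming per-block probability maps are available as inputs:

```python
import numpy as np

def select_feature_blocks(blocks, prob_maps, threshold):
    """Keep a block when the number of its pixels with a non-zero feature
    probability (facial or hair) exceeds the preset threshold."""
    selected = []
    for block, probs in zip(blocks, prob_maps):
        if np.count_nonzero(probs) > threshold:
            selected.append(block)
    return selected

blocks = [np.zeros((4, 4)), np.ones((4, 4))]
probs = [np.zeros((4, 4)),            # no feature pixels
         np.pad(np.ones((2, 2)), 1)]  # 4 pixels with non-zero probability
kept = select_feature_blocks(blocks, probs, threshold=3)
```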
S806. For the fth image block including facial feature information, determine the target region corresponding to the fth image block including facial feature information in the face feature map of the face sample image, and add the pixel values of pixels at the same position in the image block within the target region and the fth image block including facial feature information, to obtain, as shown in FIG. 9, the fth facial enhancement feature map.
Here f is any positive integer not greater than H.
S807. Obtain the facial sketch features in the fth facial enhancement feature map through the last M convolutional layers of the first network branch of the first deep convolutional neural network model adjusted K-1 times, to obtain the fth facial structure sketch of the face sample image.
S808. For the gth image block including hair feature information, determine the target region corresponding to the gth image block including hair feature information in the face feature map of the face sample image, and add the pixel values of pixels at the same position in the image block within the target region and the gth image block including hair feature information, to obtain the gth hair enhancement feature map.
Here g is any positive integer not greater than Q.
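Steps S806 and S808 perform the same region-wise addition; a sketch, assuming the target region is located by its top-left corner (the text does not fix how the correspondence is represented):

```python
import numpy as np

def enhancement_feature_map(face_feature_map, block, top, left):
    """Add, pixel position by pixel position, an image block and the image
    block in its corresponding target region of the face feature map."""
    h, w = block.shape
    region = face_feature_map[top:top + h, left:left + w]
    return block.astype(np.float64) + region

fmap = np.arange(16.0).reshape(4, 4)
out = enhancement_feature_map(fmap, np.ones((2, 2)), top=1, left=1)
```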
S809. Obtain the hair sketch features in the gth hair enhancement feature map through the last M convolutional layers of the second network branch of the first deep convolutional neural network model adjusted K-1 times, to obtain the gth hair texture sketch of the face sample image.
It should be noted that steps S806 and S808 have no strict order: step S806 may be performed before step S808, step S808 may be performed before step S806, or steps S806 and S808 may be performed simultaneously; this is not specifically limited here in the embodiments of this application.
S810. Synthesize the fth facial structure sketch of the face sample image and the gth hair texture sketch of the face sample image to obtain a sketch image of the face sample image.
S811. Obtain the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
S812. Adjust the weights and biases used in the (K+1)th training pass based on the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
Specifically, the adjustment amounts for the weights and biases are determined from the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image, together with the network learning rate, and the weights and biases used in the (K+1)th training pass are then adjusted by those amounts.
Here the network learning rate is the magnitude by which the weights and biases are adjusted each time. The network learning rate of the first deep convolutional neural network model may be k×10⁻¹⁰, where k is a positive integer not greater than 100; this is not specifically limited here in the embodiments of this application.
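The adjustment of step S812 can be sketched as a plain gradient-descent step (the exact update rule is an assumption; the text only fixes the learning-rate magnitude k×10⁻¹⁰):

```python
import numpy as np

def adjust_parameters(params, gradients, k=1):
    """Scale each adjustment by the network learning rate k * 1e-10,
    with k a positive integer not greater than 100."""
    assert 1 <= k <= 100
    lr = k * 1e-10
    return {name: params[name] - lr * gradients[name] for name in params}

p = {"w": np.array([1.0]), "b": np.array([0.0])}
g = {"w": np.array([2e10]), "b": np.array([1e10])}
p_next = adjust_parameters(p, g, k=1)
```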
S813. After the Kth training pass, obtain the loss function value of the first deep convolutional neural network model.
If the loss function value of the first deep convolutional neural network model is greater than a preset threshold, the (K+1)th training pass is performed; if the loss function value of the first deep convolutional neural network model is less than or equal to the preset threshold, training of the first deep convolutional neural network model is complete.
Specifically, the loss function of the first deep convolutional neural network model conforms to the following formula:
Lg = Ls + α·Lt;
where Lg is the loss function value of the first deep convolutional neural network model; Ls is the loss function value of the first network branch; Lt is the loss function value of the second network branch; and α is a scalar parameter used to maintain the balance between the loss function value of the first network branch and the loss function value of the second network branch.
The loss function value of the first network branch may be the mean squared error (MSE) of the first network branch, the sum of absolute differences (SAD), the mean absolute error (MAD), or some other error value; this is not specifically limited here in the embodiments of this application. Taking the loss function value of the first network branch as its MSE value as an example, the loss function value of the first network branch may be determined by the following formula:
Ls = (1/|Ps|) Σ(ps∈Ps) ‖ŝs − ss‖²;
where Ls is the loss function value of the first network branch; ps is the fth image block including facial feature information; ss is the image block included, in the sketch sample image corresponding to the face sample image, in the target region corresponding to the fth image block including facial feature information; Ps is the set of all image blocks including facial feature information; |Ps| is the number of all image blocks including facial feature information, i.e. |Ps| equals H; and ŝs is the image block included, in the fth facial structure sketch of the face sample image, in the target region corresponding to the fth image block including facial feature information, i.e. the corresponding block of the network output computed with wg and ws;
wg denotes the weights and biases of the first N convolutional layers of the first deep convolutional neural network model, and ws denotes the weights and biases of the last M convolutional layers of the first network branch of the first deep convolutional neural network model.
The loss function value of the second network branch may be a weighted combination of the MSE value and the sorted matching mean square error (SM) value of the second network branch, or some other error value; this is not specifically limited here in the embodiments of this application. Here SM(·) = sort{MSE(·)}, where sort(·) is a sorting function.
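A sketch of the two error measures; the reading of SM(·) = sort{MSE(·)} as "MSE over sorted pixel values" is an assumption, chosen so that the comparison depends on intensity statistics rather than pixel positions:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally-shaped patches."""
    return float(np.mean((a - b) ** 2))

def sm(a, b):
    """Sorted matching MSE (assumed reading): sort the pixel values of both
    patches before taking the MSE."""
    return mse(np.sort(a.ravel()), np.sort(b.ravel()))

x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[4.0, 3.0], [2.0, 1.0]])  # same values, different positions
```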
Taking as an example the loss function value of the second network branch being a weighted combination of the MSE value and the SM value of the second network branch, the loss function value of the second network branch may be determined by the following formula:
Lt = (1/|Pt|) Σ(pt∈Pt) ( ‖ŝt − st‖² + β·SM(ŝt, st) );
where β is a scalar parameter; Lt is the loss function value of the second network branch; pt is the gth image block including hair feature information; st is the image block included, in the sketch sample image corresponding to the face sample image, in the target region corresponding to the gth image block including hair feature information; Pt is the set of all image blocks including hair feature information; |Pt| is the number of all image blocks including hair feature information, i.e. |Pt| equals Q; and ŝt is the image block included, in the gth hair texture sketch of the face sample image, in the target region corresponding to the gth image block including hair feature information, i.e. the corresponding block of the network output computed with wg and wt;
wg denotes the weights and biases of the first N convolutional layers of the first deep convolutional neural network model, and wt denotes the weights and biases of the last M convolutional layers of the second network branch of the first deep convolutional neural network model.
Taking the loss function value of the first network branch to be its MSE value, and the loss function value of the second network branch to be the weighted combination of its MSE value and SM value, the loss function value of the first deep convolutional neural network model is determined by the following formula:
Lg = (1/|Ps|) Σ(ps∈Ps) ‖ŝs − ss‖² + α · (1/|Pt|) Σ(pt∈Pt) ( ‖ŝt − st‖² + β·SM(ŝt, st) ).
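Putting the pieces together, the combined objective can be sketched as follows; the per-block error terms follow the MSE and SM definitions in the text, the "sorted pixel values" reading of SM and the averaging over |Ps| and |Pt| blocks are assumptions:

```python
import numpy as np

def _mse(a, b):
    return float(np.mean((a - b) ** 2))

def _sm(a, b):
    # Sorted matching MSE, read here as MSE over sorted pixel values (assumption).
    return _mse(np.sort(a.ravel()), np.sort(b.ravel()))

def total_loss(face_pairs, hair_pairs, alpha, beta):
    """Lg = Ls + alpha * Lt, with Ls the mean per-block MSE over the facial
    (prediction, target) pairs and Lt the mean per-block MSE + beta * SM
    over the hair pairs."""
    ls = np.mean([_mse(pred, tgt) for pred, tgt in face_pairs])
    lt = np.mean([_mse(pred, tgt) + beta * _sm(pred, tgt) for pred, tgt in hair_pairs])
    return float(ls + alpha * lt)

face = [(np.ones((2, 2)), np.zeros((2, 2)))]      # MSE = 1
hair = [(np.ones((2, 2)), np.zeros((2, 2)))]      # MSE = 1, SM = 1
lg = total_loss(face, hair, alpha=0.5, beta=0.5)  # 1 + 0.5 * (1 + 0.5)
```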
In the embodiments of this application, an approach based on a deep convolutional neural network is used: by designing a structure that includes a first network branch for generating facial features and a second network branch for generating hair features, effective feature representations are learned from a large number of training samples, and a network model capable of generating accurate and natural face sketch images from original images is trained, realizing automatic generation of face sketch images. Compared with the synthesis-based automatic face sketch generation techniques of the prior art, the technique of generating face sketch images based on a deep convolutional neural network no longer depends on a sample database: the first network branch of the deep convolutional neural network generates a structure sketch including facial features, the second network branch generates a texture sketch including hair features, and the structure sketch and the texture sketch are then synthesized into the final face sketch image. This improves the accuracy and generalization ability of face sketch image generation and reduces the workload of the generation process, thereby increasing the speed of face sketch image generation.
Based on the same inventive concept as the method embodiments, an embodiment of the present invention provides a sketch image generating apparatus 10, specifically configured to implement the methods described in the embodiments of FIG. 1 to FIG. 5, FIG. 7, and FIG. 8. The structure of the apparatus, shown in FIG. 10, includes an obtaining module 11, a deep convolutional neural network model 12, and a synthesis module 13, where:
The obtaining module 11 is configured to obtain a face image to be processed.
The deep convolutional neural network model 12 is configured to obtain the facial structure sketch and the hair texture sketch in the face image obtained by the obtaining module 11. The deep convolutional neural network model 12 is pre-trained and includes a first network branch module 121 and a second network branch module 122; its structure is shown in FIG. 11:
The first network branch module 121 is configured to obtain the facial sketch features in the face image obtained by the obtaining module 11, to obtain the facial structure sketch; the first network branch module includes P convolutional layers, where P is an integer greater than 0.
The second network branch module 122 is configured to obtain the hair sketch features in the face image obtained by the obtaining module 11, to obtain the hair texture sketch; the second network branch module includes P convolutional layers.
The synthesis module 13 is configured to synthesize the facial structure sketch obtained by the first network branch module 121 and the hair texture sketch obtained by the second network branch module 122 into the sketch image of the face image.
In a possible implementation, the first N of the P convolutional layers included in the first network branch module 121 are the same as, or coincide with, the first N of the P convolutional layers included in the second network branch module 122, where N is an integer greater than 0 and less than P.
In a possible implementation, the first network branch module 121 is specifically configured to filter the background features in the face image through its first N convolutional layers, to obtain a face feature map, and then to obtain the facial sketch features in the face feature map through its last M convolutional layers. The second network branch module 122 is specifically configured to filter the background features in the face image through the first N convolutional layers of the second network branch in the deep convolutional neural network model 12, to obtain a face feature map, and then to obtain the hair sketch features in the face feature map through the last M convolutional layers of the second network branch. Here P = M + N.
Optionally, the convolution kernel sizes of the last M convolutional layers of the first network branch module 121 are correspondingly equal to the convolution kernel sizes of the last M convolutional layers of the second network branch module 122.
In a possible implementation, N is 4, and the first network branch module 121, when filtering the background features in the face image through its first N convolutional layers, is specifically configured to: filter the background features of the face image in the horizontal and vertical directions through the first and second of its first N convolutional layers, and then smooth, in the horizontal and vertical directions, the face image from which background features have been filtered, through the third and fourth of its first N convolutional layers.
Optionally, the convolution kernel size of the first convolutional layer is equal to the convolution kernel size of the second convolutional layer, and the convolution kernel size of the third convolutional layer is the same as the convolution kernel size of the fourth convolutional layer.
In a possible implementation, the obtaining module 11 is further configured to obtain, for each pixel in the face image, the hair probability that the pixel is a hair feature point. The synthesis module 13 is specifically configured to synthesize the facial structure sketch obtained by the first network branch module 121 and the hair texture sketch obtained by the second network branch module 122 into the sketch image of the face image, in accordance with the following formula:
S(i,j) = (1 − Ph(i,j)) × SS(i,j) + Ph(i,j) × St(i,j)
where S(i,j) is the pixel value of the pixel in row i and column j of the sketch image of the face image; Ph(i,j) is the hair probability of the pixel in row i and column j of the face image; SS(i,j) is the pixel value of the pixel in row i and column j of the facial structure sketch; St(i,j) is the pixel value of the pixel in row i and column j of the hair texture sketch; and i and j are both integers greater than 0.
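The per-pixel blend in the formula above can be sketched as follows (arrays of equal shape are assumed):

```python
import numpy as np

def synthesize_sketch(face_sketch, hair_sketch, hair_prob):
    """S(i,j) = (1 - Ph(i,j)) * SS(i,j) + Ph(i,j) * St(i,j), applied at every
    pixel: hair pixels are taken mostly from the hair texture sketch,
    non-hair pixels mostly from the facial structure sketch."""
    ph = hair_prob.astype(np.float64)
    return (1.0 - ph) * face_sketch + ph * hair_sketch

ss = np.full((2, 2), 100.0)                 # facial structure sketch
st = np.full((2, 2), 200.0)                 # hair texture sketch
ph = np.array([[0.0, 1.0], [0.5, 0.25]])    # hair probabilities
s = synthesize_sketch(ss, st, ph)
```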
Optionally, the apparatus further includes:
a training module 14, configured to train the deep convolutional neural network model 12 in the following manner:
inputting a number of face sample images from a training sample database into the initialized deep convolutional neural network model 12 for training, where the training sample database includes a number of face sample images and a sketch sample image corresponding to each face sample image, and the initialized deep convolutional neural network model 12 includes weights and biases;
in the Kth training pass, filtering the background features in the face sample image through the first N convolutional layers of the deep convolutional neural network model 12 adjusted K-1 times, to obtain the face feature map of the face sample image, where K is an integer greater than 0;
obtaining the facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch module 121 of the deep convolutional neural network model 12 adjusted K-1 times, to obtain the facial structure sketch of the face sample image;
obtaining the hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch module 122 of the deep convolutional neural network model 12 adjusted K-1 times, to obtain the hair texture sketch of the face sample image;
synthesizing the facial structure sketch of the face sample image and the hair texture sketch of the face sample image to obtain the sketch image of the face sample image;
after the Kth training pass, obtaining the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image; and
adjusting the weights and biases used in the (K+1)th training pass based on the error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image.
Optionally, the training module 14, when filtering the background features in the face sample image through the first N convolutional layers of the deep convolutional neural network model 12 adjusted K-1 times in the Kth training pass, is specifically configured to:
add the pixel values of pixels at the same position in the face sample image and the sketch average image to obtain a face enhancement image, where the pixel value of any pixel in the sketch average image is the average of the pixel values, at the same position as that pixel, over all sketch sample images in the training sample database; and filter the background features in the face enhancement image through the first N convolutional layers of the deep convolutional neural network model 12 adjusted K-1 times.
In a possible implementation, the obtaining module 11 is further configured to divide the face sample image into a number of mutually overlapping image blocks and to obtain, from the mutually overlapping image blocks, the image blocks including facial feature information. The training module 14, when obtaining the facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch module 121 of the deep convolutional neural network model 12 adjusted K-1 times, is specifically configured to:
for each image block including facial feature information obtained by the obtaining module 11, determine the target region corresponding to that image block in the face feature map of the face sample image, and add the pixel values of pixels at the same position in the image block within the target region and the image block including facial feature information, to obtain a facial enhancement feature map; and, for each facial enhancement feature map, obtain the facial sketch features in the facial enhancement feature map through the last M convolutional layers of the first network branch module 121 of the deep convolutional neural network model 12 adjusted K-1 times.
In a possible implementation, the obtaining module 11 is further configured to divide the face sample image into a number of mutually overlapping image blocks and to obtain, from the mutually overlapping image blocks, the image blocks including hair feature information. The training module 14, when obtaining the hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch module 122 of the deep convolutional neural network model 12 adjusted K-1 times, is specifically configured to:
for each image block including hair feature information obtained by the obtaining module 11, add the pixel values of pixels at the same position in the face sample image and the image block including hair feature information, to obtain a hair enhancement feature map; and, for each hair enhancement feature map, obtain the hair sketch features in the hair enhancement feature map through the last M convolutional layers of the second network branch module 122 of the deep convolutional neural network model 12 adjusted K-1 times.
Optionally, the obtaining module 11, when obtaining the image blocks including facial feature information from the mutually overlapping image blocks, is specifically configured to:
for each of the mutually overlapping image blocks, determine the facial probability that each pixel in the image block is a facial feature point, and, when the number of pixels whose facial probability is not 0 is greater than a preset threshold, determine that the image block is an image block including facial feature information.
In a possible design, the obtaining module 11, when obtaining the image blocks including hair feature information from the mutually overlapping image blocks, is specifically configured to:
for each of the mutually overlapping image blocks, determine the hair probability that each pixel in the image block is a hair feature point, and, when the number of pixels whose hair probability is not 0 is greater than a preset threshold, determine that the image block is an image block including hair feature information.
The division of modules in the embodiments of this application is schematic and is merely a logical functional division; other divisions are possible in actual implementations. In addition, the functional modules in the embodiments of this application may be integrated into one processor, may each exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in the form of hardware or in the form of software functional modules.
When the integrated modules are implemented in the form of hardware, as shown in FIG. 12, they may include a collector 1201, a processor 1202, and a memory 1203. The physical hardware corresponding to the deep convolutional neural network model 12, the synthesis module 13, and the training module 14 may be the processor 1202. The processor 1202 may be a central processing unit (CPU), a digital processing unit, or the like. The processor 1202 obtains the face image to be processed through the collector 1201. The memory 1203 is configured to store the program executed by the processor 1202.
The specific connection medium between the collector 1201, the processor 1202, and the memory 1203 is not limited in the embodiments of this application. In FIG. 12, the memory 1203, the processor 1202, and the collector 1201 are connected by a bus 1204, represented by a thick line; the connections between other components are merely schematic and are not limiting. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 12, but this does not mean that there is only one bus or only one type of bus.
The memory 1203 may be a volatile memory, such as a random-access memory (RAM); the memory 1203 may also be a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 1203 may be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 1203 may also be a combination of the above memories.
The processor 1202 is configured to execute the program code stored in the memory 1203, and specifically to perform the methods described in the embodiments corresponding to FIG. 1 to FIG. 9. For details, refer to those embodiments; they are not repeated here.
The embodiments described herein are used only to illustrate and explain the present application, not to limit it, and, where no conflict arises, the embodiments of the present application and the functional modules within them may be combined with one another.
Based on a deep convolutional neural network, the embodiments of the present application design a structure that includes a first network branch for generating facial features and a second network branch for generating hair features, learn effective feature representations from a large number of training samples, and train a network model that can turn an original image into an accurate and natural face sketch image, thereby realizing automatic generation of face sketch images. Compared with the prior-art synthesis-based techniques for automatically generating face sketch images, the technique of generating a face sketch image based on a deep convolutional neural network no longer depends on a sample database: the first network branch of the deep convolutional neural network generates a structure sketch that includes facial features, the second network branch generates a texture sketch that includes hair features, and the structure sketch and the texture sketch are then synthesized into the final face sketch image. This improves the accuracy and generalization ability of face sketch image generation and reduces the workload in the generation process, thereby increasing the speed at which face sketch images are generated.
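The shared-then-split branch structure summarized above can be sketched as follows. This is a minimal illustration only: the layer counts (N = 2 shared layers, M = 1 branch layer), the 3×3 kernels, the ReLU activations, and the NumPy valid-convolution helper are all assumptions for demonstration, not the configuration claimed by the application.

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution of a single-channel image with kernel k (toy helper)."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def two_branch_sketch(face, shared_kernels, face_kernels, hair_kernels):
    """First N layers are shared; the last M layers are branch-specific."""
    feat = face
    for k in shared_kernels:                   # first N layers: common to both branches
        feat = np.maximum(conv2d(feat, k), 0)  # ReLU activation (assumed)
    s = feat
    for k in face_kernels:                     # first branch -> facial structure sketch
        s = np.maximum(conv2d(s, k), 0)
    t = feat
    for k in hair_kernels:                     # second branch -> hair texture sketch
        t = np.maximum(conv2d(t, k), 0)
    return s, t

rng = np.random.default_rng(0)
face = rng.random((16, 16))                    # toy "face image"
shared = [rng.random((3, 3)) for _ in range(2)]  # N = 2 (illustrative)
fk = [rng.random((3, 3))]                        # M = 1 (illustrative)
hk = [rng.random((3, 3))]
structure, texture = two_branch_sketch(face, shared, fk, hk)
print(structure.shape, texture.shape)  # (10, 10) (10, 10): three 3x3 valid convs
```

Because the first N layers are shared, the background-filtered face feature map is computed once and reused by both branches, which is the efficiency argument behind coinciding layers in claim 2.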
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, which implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It is apparent that those skilled in the art can make various changes and variations to the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, provided that these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to cover them.
Claims (28)
- A method for generating a sketch image, comprising:
acquiring a face image to be processed;
obtaining facial sketch features in the face image through P convolutional layers of a first network branch in a pre-trained deep convolutional neural network model, to obtain a facial structure sketch, wherein P is an integer greater than 0;
obtaining hair sketch features in the face image through P convolutional layers of a second network branch in the deep convolutional neural network model, to obtain a hair texture sketch; and
synthesizing the facial structure sketch and the hair texture sketch to obtain a sketch image of the face image.
- The method according to claim 1, wherein the first N convolutional layers of the first network branch are identical to or coincide with the first N convolutional layers of the second network branch, and N is an integer greater than 0 and less than P.
- The method according to claim 2, wherein obtaining the facial sketch features in the face image through the P convolutional layers of the first network branch comprises:
filtering background features in the face image through the first N convolutional layers of the first network branch, to obtain a face feature map; and
obtaining the facial sketch features in the face feature map through the last M convolutional layers of the first network branch;
and wherein obtaining the hair sketch features in the face image through the P convolutional layers of the second network branch comprises:
filtering background features in the face image through the first N convolutional layers of the second network branch, to obtain a face feature map; and
obtaining the hair sketch features in the face feature map through the last M convolutional layers of the second network branch;
wherein P = M + N.
- The method according to claim 3, wherein the convolution kernel sizes of the last M convolutional layers of the first network branch are correspondingly equal to the convolution kernel sizes of the last M convolutional layers of the second network branch.
- The method according to claim 3 or 4, wherein N is 4, and filtering the background features in the face image through the first N convolutional layers of the first network branch comprises:
filtering background features of the face image in the horizontal direction and the vertical direction through the first and second of the first N convolutional layers of the first network branch; and
smoothing the face image from which the background features have been filtered, in the horizontal direction and the vertical direction, through the third and fourth of the first N convolutional layers of the first network branch.
- The method according to claim 5, wherein the convolution kernel size of the first convolutional layer is equal to that of the second convolutional layer, and the convolution kernel size of the third convolutional layer is equal to that of the fourth convolutional layer.
- The method according to any one of claims 1 to 6, further comprising:
obtaining, for each pixel in the face image, the hair probability that the pixel is a hair feature point;
wherein synthesizing the facial structure sketch and the hair texture sketch into the sketch image of the face image satisfies the formula:
S(i,j) = (1 - Ph(i,j)) × Ss(i,j) + Ph(i,j) × St(i,j)
wherein S(i,j) is the pixel value of the pixel in row i, column j of the sketch image of the face image; Ph(i,j) is the hair probability of the pixel in row i, column j; Ss(i,j) is the pixel value of the pixel in row i, column j of the facial structure sketch; St(i,j) is the pixel value of the pixel in row i, column j of the hair texture sketch; and i and j are integers greater than 0.
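The per-pixel blend in the formula above maps directly onto element-wise array arithmetic. The toy pixel values and hair-probability map below are illustrative, not data from the application:

```python
import numpy as np

def blend_sketch(structure, texture, hair_prob):
    """S = (1 - Ph) * Ss + Ph * St, applied element-wise per pixel."""
    return (1.0 - hair_prob) * structure + hair_prob * texture

Ss = np.array([[10.0, 20.0], [30.0, 40.0]])      # facial structure sketch (toy values)
St = np.array([[100.0, 200.0], [300.0, 400.0]])  # hair texture sketch (toy values)
Ph = np.array([[0.0, 0.5], [1.0, 0.25]])         # hair probability per pixel
print(blend_sketch(Ss, St, Ph))  # [[ 10. 110.] [300. 130.]]
```

Where Ph is 0 the output is purely the structure sketch, and where Ph is 1 it is purely the texture sketch, so the hair-probability map acts as a soft mask between the two branches.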
- The method according to any one of claims 2 to 7, wherein the deep convolutional neural network model is trained as follows:
inputting a number of face sample images from a training sample database into an initialized deep convolutional neural network model for training, the training sample database comprising the face sample images and a sketch sample image corresponding to each face sample image, and the initialized deep convolutional neural network model comprising weights and biases;
during the K-th training pass, filtering background features in a face sample image through the first N convolutional layers of the deep convolutional neural network model as adjusted K-1 times, to obtain a face feature map of the face sample image, wherein K is an integer greater than 0;
obtaining facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch of the model as adjusted K-1 times, to obtain a facial structure sketch of the face sample image;
obtaining hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch of the model as adjusted K-1 times, to obtain a hair texture sketch of the face sample image;
synthesizing the facial structure sketch and the hair texture sketch of the face sample image to obtain a sketch image of the face sample image;
after the K-th training pass, obtaining an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image; and
adjusting, based on the error value, the weights and biases used in the (K+1)-th training pass.
- The method according to claim 8, wherein, during the K-th training pass, filtering the background features in the face sample image through the first N convolutional layers of the model as adjusted K-1 times comprises:
adding the pixel values of the face sample image and of a sketch average image at the same positions, to obtain a face enhancement image, wherein the pixel value of any pixel in the sketch average image is the average of the pixel values, at the same position, of all sketch sample images in the training sample database; and
filtering background features in the face enhancement image through the first N convolutional layers of the model as adjusted K-1 times.
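The sketch average image and the face enhancement image of claim 9 are both simple pixel-wise operations. The image sizes and values below are toy placeholders, not data from the training sample database:

```python
import numpy as np

rng = np.random.default_rng(2)
# 5 toy "sketch sample images" standing in for the training sample database
sketch_samples = rng.random((5, 8, 8))

# Each pixel of the sketch average image is the mean, over all sketch samples,
# of the pixel values at that same position.
sketch_avg = sketch_samples.mean(axis=0)

face_sample = rng.random((8, 8))
# Face enhancement image: pixel-wise sum of the face sample image and the
# sketch average image, formed before the first N convolutional layers.
face_enhanced = face_sample + sketch_avg
print(face_enhanced.shape)  # (8, 8)
```

The sketch average image depends only on the database, so it can be precomputed once and reused for every face sample in every training pass.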
- The method according to claim 8 or 9, wherein obtaining the facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch of the model as adjusted K-1 times comprises:
dividing the face sample image into a number of mutually overlapping image blocks, and selecting from them the image blocks that include facial feature information;
for each image block that includes facial feature information, determining the target region corresponding to that image block in the face feature map of the face sample image, and adding the pixel values at the same positions of the target region and of that image block, to obtain a face enhancement feature map; and
for each face enhancement feature map, obtaining the facial sketch features in the face enhancement feature map through the last M convolutional layers of the first network branch of the model as adjusted K-1 times.
- The method according to any one of claims 8 to 10, wherein obtaining the hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch of the model as adjusted K-1 times comprises:
dividing the face sample image into a number of mutually overlapping image blocks, and selecting from them the image blocks that include hair feature information;
for each image block that includes hair feature information, adding the pixel values at the same positions of the face sample image and of that image block, to obtain a hair enhancement feature map; and
for each hair enhancement feature map, obtaining the hair sketch features in the hair enhancement feature map through the last M convolutional layers of the second network branch of the model as adjusted K-1 times.
- The method according to claim 10, wherein selecting the image blocks that include facial feature information from the mutually overlapping image blocks comprises:
for each of the mutually overlapping image blocks, determining, for each pixel in the block, the face probability that the pixel is a facial feature point; and
when the number of pixels whose face probability is not 0 exceeds a preset threshold, determining that the block is an image block that includes facial feature information.
- The method according to claim 11, wherein selecting the image blocks that include hair feature information from the mutually overlapping image blocks comprises:
for each of the mutually overlapping image blocks, determining, for each pixel in the block, the hair probability that the pixel is a hair feature point; and
when the number of pixels whose hair probability is not 0 exceeds a preset threshold, determining that the block is an image block that includes hair feature information.
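The block-selection test shared by claims 12 and 13 reduces to counting pixels with a non-zero probability and comparing the count against the preset threshold. The probability values and threshold below are illustrative:

```python
import numpy as np

def is_feature_block(prob_block, threshold):
    """Keep a block when the count of pixels whose feature probability is
    non-zero exceeds the preset threshold (mirrors claims 12 and 13)."""
    return int(np.count_nonzero(prob_block)) > threshold

block = np.array([[0.0, 0.9, 0.0],
                  [0.2, 0.0, 0.0],
                  [0.0, 0.0, 0.7]])   # toy per-pixel face (or hair) probabilities
print(is_feature_block(block, threshold=2))  # True: 3 non-zero pixels > 2
print(is_feature_block(block, threshold=5))  # False
```

The same helper serves both the face and hair cases; only the probability map fed into it differs between the two claims.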
- A device for generating a sketch image, comprising:
an acquisition module, configured to acquire a face image to be processed;
a deep convolutional neural network model, configured to obtain a facial structure sketch and a hair texture sketch from the face image acquired by the acquisition module, the deep convolutional neural network model being pre-trained and comprising a first network branch module and a second network branch module;
wherein the first network branch module is configured to obtain facial sketch features in the face image acquired by the acquisition module, to obtain the facial structure sketch, the first network branch module comprising P convolutional layers, wherein P is an integer greater than 0;
the second network branch module is configured to obtain hair sketch features in the face image acquired by the acquisition module, to obtain the hair texture sketch, the second network branch module comprising P convolutional layers; and
a synthesis module, configured to synthesize the facial structure sketch obtained by the first network branch module and the hair texture sketch obtained by the second network branch module into a sketch image of the face image.
- The device according to claim 14, wherein the first N of the P convolutional layers included in the first network branch module are identical to or coincide with the first N of the P convolutional layers included in the second network branch module, and N is an integer greater than 0 and less than P.
- The device according to claim 15, wherein the first network branch module is specifically configured to:
filter background features in the face image through its first N convolutional layers, to obtain a face feature map; and
obtain the facial sketch features in the face feature map through its last M convolutional layers;
and the second network branch module is specifically configured to:
filter background features in the face image through the first N convolutional layers of the second network branch, to obtain a face feature map; and
obtain the hair sketch features in the face feature map through the last M convolutional layers of the second network branch;
wherein P = M + N.
- The device according to claim 16, wherein the convolution kernel sizes of the last M convolutional layers of the first network branch module are correspondingly equal to the convolution kernel sizes of the last M convolutional layers of the second network branch module.
- The device according to claim 16 or 17, wherein N is 4, and the first network branch module, when filtering the background features in the face image through its first N convolutional layers, is specifically configured to:
filter background features of the face image in the horizontal direction and the vertical direction through the first and second of its first N convolutional layers; and
smooth the face image from which the background features have been filtered, in the horizontal direction and the vertical direction, through the third and fourth of its first N convolutional layers.
- The device according to claim 18, wherein the convolution kernel size of the first convolutional layer is equal to that of the second convolutional layer, and the convolution kernel size of the third convolutional layer is equal to that of the fourth convolutional layer.
- The device according to any one of claims 14 to 19, wherein the acquisition module is further configured to obtain, for each pixel in the face image, the hair probability that the pixel is a hair feature point; and
the synthesis module is specifically configured to synthesize the facial structure sketch obtained by the first network branch module and the hair texture sketch obtained by the second network branch module into the sketch image of the face image according to the formula:
S(i,j) = (1 - Ph(i,j)) × Ss(i,j) + Ph(i,j) × St(i,j)
wherein S(i,j) is the pixel value of the pixel in row i, column j of the sketch image of the face image; Ph(i,j) is the hair probability of the pixel in row i, column j; Ss(i,j) is the pixel value of the pixel in row i, column j of the facial structure sketch; St(i,j) is the pixel value of the pixel in row i, column j of the hair texture sketch; and i and j are integers greater than 0.
- The device according to any one of claims 14 to 20, further comprising:
a training module, configured to train the deep convolutional neural network model as follows:
inputting a number of face sample images from a training sample database into an initialized deep convolutional neural network model for training, the training sample database comprising the face sample images and a sketch sample image corresponding to each face sample image, and the initialized deep convolutional neural network model comprising weights and biases;
during the K-th training pass, filtering background features in a face sample image through the first N convolutional layers of the deep convolutional neural network model as adjusted K-1 times, to obtain a face feature map of the face sample image, wherein K is an integer greater than 0;
obtaining facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch module of the model as adjusted K-1 times, to obtain a facial structure sketch of the face sample image;
obtaining hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch module of the model as adjusted K-1 times, to obtain a hair texture sketch of the face sample image;
synthesizing the facial structure sketch and the hair texture sketch of the face sample image to obtain a sketch image of the face sample image;
after the K-th training pass, obtaining an error value between the sketch image of the face sample image and the sketch sample image corresponding to the face sample image; and
adjusting, based on the error value, the weights and biases used in the (K+1)-th training pass.
- The device according to claim 21, wherein the training module, when filtering the background features in the face sample image through the first N convolutional layers of the model as adjusted K-1 times during the K-th training pass, is specifically configured to:
add the pixel values of the face sample image and of a sketch average image at the same positions, to obtain a face enhancement image, wherein the pixel value of any pixel in the sketch average image is the average of the pixel values, at the same position, of all sketch sample images in the training sample database; and
filter background features in the face enhancement image through the first N convolutional layers of the model as adjusted K-1 times.
- The device according to claim 21 or 22, wherein the acquisition module is further configured to divide the face sample image into a number of mutually overlapping image blocks and to select from them the image blocks that include facial feature information; and
the training module, when obtaining the facial sketch features in the face feature map of the face sample image through the last M convolutional layers of the first network branch module of the model as adjusted K-1 times, is specifically configured to:
for each image block that includes facial feature information and is selected by the acquisition module, determine the target region corresponding to that image block in the face feature map of the face sample image, and add the pixel values at the same positions of the target region and of that image block, to obtain a face enhancement feature map; and
for each face enhancement feature map, obtain the facial sketch features in the face enhancement feature map through the last M convolutional layers of the first network branch module of the model as adjusted K-1 times.
- The device according to any one of claims 21 to 23, wherein the acquisition module is further configured to divide the face sample image into a number of mutually overlapping image blocks and to select from them the image blocks that include hair feature information; and
the training module, when obtaining the hair sketch features in the face feature map of the face sample image through the last M convolutional layers of the second network branch module of the model as adjusted K-1 times, is specifically configured to:
for each image block that includes hair feature information and is selected by the acquisition module, add the pixel values at the same positions of the face sample image and of that image block, to obtain a hair enhancement feature map; and
for each hair enhancement feature map, obtain the hair sketch features in the hair enhancement feature map through the last M convolutional layers of the second network branch module of the model as adjusted K-1 times.
- The device according to claim 23, wherein the acquisition module, when selecting the image blocks that include facial feature information from the mutually overlapping image blocks, is specifically configured to:
for each of the mutually overlapping image blocks, determine, for each pixel in the block, the face probability that the pixel is a facial feature point; and
when the number of pixels whose face probability is not 0 exceeds a preset threshold, determine that the block is an image block that includes facial feature information.
- The device according to claim 24, wherein the acquisition module, when obtaining image blocks that include hair feature information from the plurality of mutually overlapping image blocks, is specifically configured to: for each of the plurality of mutually overlapping image blocks, determine the hair probability that each pixel in the image block is a hair feature point; and, when the number of pixels whose hair probability is not 0 is greater than a preset threshold, determine that the image block is an image block including hair feature information.
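The block-selection rule in the two claims above (for facial and for hair feature points alike) amounts to counting non-zero probability pixels inside each overlapping window. A minimal sketch follows; the per-pixel probability map, window size, stride, and threshold value are assumptions for illustration, since the claims leave them unspecified:

```python
import numpy as np

def select_feature_blocks(prob_map, block_shape, stride, threshold):
    """Slide a window over a per-pixel feature-probability map (facial or
    hair) and keep the top-left coordinates of every overlapping block in
    which the count of pixels with non-zero probability exceeds the preset
    threshold."""
    h, w = prob_map.shape
    bh, bw = block_shape
    selected = []
    for r in range(0, h - bh + 1, stride):
        for c in range(0, w - bw + 1, stride):
            window = prob_map[r:r + bh, c:c + bw]
            # Claim condition: number of pixels with probability != 0
            # must be greater than the preset threshold.
            if np.count_nonzero(window) > threshold:
                selected.append((r, c))
    return selected

# Toy 6x6 probability map: non-zero probability only in the top-left corner.
probs = np.zeros((6, 6))
probs[:3, :3] = 0.8
blocks = select_feature_blocks(probs, block_shape=(4, 4), stride=2, threshold=3)
```

With a stride smaller than the block size, the generated windows overlap, matching the "mutually overlapping image blocks" described in the claims.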
- A device for generating a sketch image, comprising a collector, a memory, and a processor; wherein the collector is configured to acquire a face image to be processed; the memory is configured to store a program executed by the processor; and the processor is configured to execute the program stored in the memory, based on the face image acquired by the collector, so as to perform the method of any one of claims 1 to 13.
- A computer storage medium, wherein the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are configured to cause a computer to perform the method of any one of claims 1 to 13.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/078637 WO2018176281A1 (en) | 2017-03-29 | 2017-03-29 | Sketch image generation method and device |
CN201780073000.6A CN110023989B (en) | 2017-03-29 | 2017-03-29 | Sketch image generation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/078637 WO2018176281A1 (en) | 2017-03-29 | 2017-03-29 | Sketch image generation method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018176281A1 true WO2018176281A1 (en) | 2018-10-04 |
Family
ID=63675092
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/078637 WO2018176281A1 (en) | 2017-03-29 | 2017-03-29 | Sketch image generation method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110023989B (en) |
WO (1) | WO2018176281A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110069992A (en) * | 2019-03-18 | 2019-07-30 | 西安电子科技大学 | Face image synthesis method and apparatus, electronic device, and storage medium |
CN110163824A (en) * | 2019-05-22 | 2019-08-23 | 西安电子科技大学 | Bionics-based face portrait synthesis method |
CN110163824B (en) * | 2019-05-22 | 2022-06-10 | 西安电子科技大学 | Bionics-based face portrait synthesis method |
CN110188651A (en) * | 2019-05-24 | 2019-08-30 | 西安电子科技大学 | Face portrait synthesis method based on deep probabilistic graph model |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110580726B (en) * | 2019-08-21 | 2022-10-04 | 中山大学 | Dynamic convolution network-based face sketch generation model and method in natural scene |
CN113129410B (en) * | 2019-12-31 | 2024-06-07 | 深圳云天励飞技术有限公司 | Sketch image conversion method and related product |
CN111223164B (en) * | 2020-01-08 | 2023-10-24 | 杭州未名信科科技有限公司 | Face simple drawing generation method and device |
CN113139566B (en) * | 2020-01-20 | 2024-03-12 | 北京达佳互联信息技术有限公司 | Training method and device for image generation model, and image processing method and device |
CN112581358B (en) * | 2020-12-17 | 2023-09-26 | 北京达佳互联信息技术有限公司 | Training method of image processing model, image processing method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060034495A1 (en) * | 2004-04-21 | 2006-02-16 | Miller Matthew L | Synergistic face detection and pose estimation with energy-based models |
CN103456010A (en) * | 2013-09-02 | 2013-12-18 | 电子科技大学 | Human face cartoon generation method based on feature point localization |
CN104537630A (en) * | 2015-01-22 | 2015-04-22 | 厦门美图之家科技有限公司 | Method and device for image beautifying based on age estimation |
CN105678232A (en) * | 2015-12-30 | 2016-06-15 | 中通服公众信息产业股份有限公司 | Face image feature extraction and comparison method based on deep learning |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040228504A1 (en) * | 2003-05-13 | 2004-11-18 | Viswis, Inc. | Method and apparatus for processing image |
JP4397372B2 (en) * | 2005-12-28 | 2010-01-13 | トヨタ自動車株式会社 | 3D shape data creation method, 3D shape data creation device, and 3D shape data creation program |
CN101694720B (en) * | 2009-10-13 | 2012-02-08 | 西安电子科技大学 | Multi-temporal SAR Image Change Detection Method Based on Spatial Correlation Conditional Probability Fusion |
CN101777180B (en) * | 2009-12-23 | 2012-07-04 | 中国科学院自动化研究所 | Complex background real-time alternating method based on background modeling and energy minimization |
EP2613294A1 (en) * | 2010-09-03 | 2013-07-10 | Xiaogang Wang | System and method for synthesizing portrait sketch from photo |
CN102436637B (en) * | 2010-09-29 | 2013-08-21 | 中国科学院计算技术研究所 | Method and system for automatically segmenting hairs in head images |
CN101990081B (en) * | 2010-11-11 | 2012-02-22 | 宁波大学 | A Copyright Protection Method for Virtual Viewpoint Image |
WO2012159310A1 (en) * | 2011-06-29 | 2012-11-29 | 华为技术有限公司 | Method and apparatus for triggering user equipment |
CN103279936B (en) * | 2013-06-21 | 2016-04-27 | 重庆大学 | Human face fake photo based on portrait is synthesized and modification method automatically |
US10339685B2 (en) * | 2014-02-23 | 2019-07-02 | Northeastern University | System for beauty, cosmetic, and fashion analysis |
CN105869159A (en) * | 2016-03-28 | 2016-08-17 | 联想(北京)有限公司 | Image segmentation method and apparatus |
CN109359541A (en) * | 2018-09-17 | 2019-02-19 | 南京邮电大学 | A sketch face recognition method based on deep transfer learning |
- 2017-03-29: CN application CN201780073000.6A, publication CN110023989B (status: Active)
- 2017-03-29: WO application PCT/CN2017/078637, publication WO2018176281A1 (status: Application Filing)
Also Published As
Publication number | Publication date |
---|---|
CN110023989A (en) | 2019-07-16 |
CN110023989B (en) | 2021-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018176281A1 (en) | Sketch image generation method and device | |
JP6847910B2 (en) | Methods and systems for automatic chromosome classification | |
CN111127304B (en) | Cross-domain image conversion | |
JP7512262B2 (en) | Facial keypoint detection method, device, computer device and computer program | |
Iizuka et al. | Let there be color! joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification | |
WO2021036471A1 (en) | Sample generation method and apparatus, and computer device and storage medium | |
WO2020199478A1 (en) | Method for training image generation model, image generation method, device and apparatus, and storage medium | |
WO2018072102A1 (en) | Method and apparatus for removing spectacles in human face image | |
JP2023545565A (en) | Image detection method, model training method, image detection device, training device, equipment and program | |
US20230021661A1 (en) | Forgery detection of face image | |
Garrido et al. | Corrective 3D reconstruction of lips from monocular video. | |
CN115115676B (en) | Image registration method, device, equipment and storage medium | |
JP2008152530A (en) | Face recognition device, face recognition method, gabor filter applied device, and computer program | |
CN110334566B (en) | An OCT Internal and External Fingerprint Extraction Method Based on 3D Fully Convolutional Neural Network | |
CN109948467A (en) | Method, device, computer equipment and storage medium for face recognition | |
CN108021869A (en) | A kind of convolutional neural networks tracking of combination gaussian kernel function | |
CN113112518B (en) | Feature extractor generation method and device based on spliced image and computer equipment | |
CN115239861A (en) | Face data enhancement method and device, computer equipment and storage medium | |
US12112575B2 (en) | Method and apparatus for detecting liveness based on phase difference | |
CN116310008A (en) | An image processing method and related equipment based on few-shot learning | |
CN114648604A (en) | Image rendering method, electronic device, storage medium and program product | |
Bhattad et al. | Cut-and-paste object insertion by enabling deep image prior for reshading | |
Juneja | Multiple feature descriptors based model for individual identification in group photos | |
CN114359361A (en) | Depth estimation method, depth estimation device, electronic equipment and computer-readable storage medium | |
Azaza et al. | Context proposals for saliency detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17904340; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: PCT application non-entry in European phase | Ref document number: 17904340; Country of ref document: EP; Kind code of ref document: A1 |