CN115223004B - Image enhancement method for a generative adversarial network based on improved multi-scale fusion - Google Patents
Image enhancement method for a generative adversarial network based on improved multi-scale fusion
- Publication number
- CN115223004B CN202210692241.1A CN202210692241A
- Authority
- CN
- China
- Prior art keywords
- image
- convolution
- feature
- layer
- low
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image enhancement method based on an improved multi-scale fusion generative adversarial network, which comprises the steps of: 1, establishing image data sets with different brightness; 2, establishing an improved generative adversarial network image enhancement model and substituting the training set into the model for training to obtain a trained model; and 3, inputting a real low-illumination image to be processed into the trained model to obtain an enhanced image. The invention introduces a channel attention mechanism and residual dense blocks into the generator, which alleviates local abrupt changes in the enhanced picture, makes the network focus more on the information of interest, and enhances the flexibility of the network. Multi-dimensional extraction of image features by a multi-layer network allows the information retained by each layer to be transmitted to subsequent layers; the shallow features are fused with the deep features, so more detail information can be extracted during image enhancement and the problem of local distortion after enhancement is resolved.
Description
Technical Field
The invention belongs to the field of image enhancement, and particularly relates to an image enhancement method based on an improved multi-scale fusion generative adversarial network.
Background
With the wide application of image processing technology in machine vision, high-quality images have become particularly important, which places higher demands on image preprocessing. In real life, underexposure or poor illumination often causes detail blurring, low contrast and color distortion in captured images, and real outdoor scenes are complex and changeable, so image quality degrades and the subsequent use of the images is seriously affected. Because the light sources in night-time scenes are complex, enhancing low-visibility images therefore has important theoretical research significance. At present, existing night-time enhancement data sets have a single type of light source and cannot simulate real night scenes, and traditional image enhancement techniques cannot achieve an ideal effect.
Model-free image enhancement mainly comprises histogram equalization and spatial filtering. Histogram equalization is applied to images which, compared with a clear image, are shot under complex conditions with low overall brightness and contrast and a small dynamic range: in the spatial domain, the gray levels concentrated in a narrow range are stretched so that they span the entire gray scale. Because the mapping function is computed over the whole image from a pixel distribution that is originally concentrated in a certain high-brightness or low-brightness region, and noise is suppressed only by redistributing it, an image of a night environment may be over-enhanced and the noise in dark regions may be amplified. Spatial filtering slides a filter over every pixel of the image and applies the same mathematical operation, such as an inverse transform, a logarithmic transform or a power transform, to each pixel neighborhood until every region has been processed; the size of the sliding window is difficult to choose, and the filtering fails when the number of pixels in the filter window is far smaller than the amount of noise in the image. Model-based enhancement methods include the Retinex image enhancement method and the dark channel prior algorithm: the illumination of the low-illumination image is estimated first and the illumination component is removed to reduce its influence on the image, while the reflectance of the objects is separated and retained. Although naturalness is maintained, real scenes with unbalanced illumination are still not handled well, and the dark channel prior algorithm likewise does not truly solve the problem for low-illumination images.
Disclosure of Invention
The invention aims to provide an image enhancement method based on an improved multi-scale fusion generative adversarial network, which is used to solve the above problems in the prior art.
In order to achieve this task, the invention adopts the following technical scheme:
An image enhancement method based on an improved multi-scale fusion generative adversarial network, comprising the following steps:
step 1, establishing image data sets with different brightness;
step 2, establishing an improved generative adversarial network image enhancement model, substituting the training set obtained in step 1 into the improved generative adversarial network image enhancement model for training to obtain a trained model, and testing it on the test set, wherein the improved generative adversarial network image enhancement model comprises a generator and a discriminator;
The processing performed by the generator on each image of the low-illumination image data set input to it comprises the following sub-steps:
step 21, extracting shallow feature information from a low-illumination image of the training set using a 7×7 convolution layer;
step 22, passing the shallow feature information through two down-sampling layers in sequence to obtain image information at scales different from the original image;
Step 23, sending the image information obtained in step 22 into a depth feature extraction module, wherein the module comprises two parallel branches that obtain deep information of different receptive fields through convolution kernels of different sizes; the two parallel branches have the same composition and each comprises a convolution layer, an activation function, an alternating module layer and a Concat layer connected in sequence, the output of the activation function and the output of the alternating module being jointly connected to the Concat layer, wherein the alternating module comprises three residual dense blocks and three channel attention modules connected alternately;
Step 24, fusing the deep information of the different receptive fields obtained by the two branches to obtain a fused feature map;
Step 25, sequentially performing convolution, activation, two up-sampling operations and a final convolution on the fused feature map to obtain an output feature map, wherein the first convolution further extracts feature information from the fused features, the two up-sampling operations restore the image size, and the final convolution is used for image restoration;
Step 26, connecting the output feature map obtained in step 25 and the corresponding original image of the training set through a Concat layer to obtain the final output result;
And step 3, inputting the real low-illumination image to be processed into the trained model obtained in step 2 to obtain the enhanced image.
Further, the step 1 includes the following sub-steps:
Step 11, processing the original image I with gamma correction to obtain a first low-illumination image I₁, wherein the calculation formula is as follows:
wherein I represents the original image (ground truth), I₁ represents the first low-illumination image, and γ₁ represents the gamma value, taken as 0.2;
Step 12, processing the original image I with a camera response function to reduce the image brightness, obtaining a second low-illumination image I₂, wherein the calculation formula is as follows:
wherein I₂ represents the second low-illumination image, k represents the virtual exposure rate, k = -5.33, and a and b are the two parameters of the camera response function;
Step 13, manually adjusting the brightness of the original image I with image processing software to obtain a third low-illumination image I₃;
Step 14, dividing the data set into a test set and a training set.
Further, in step 23, the residual dense block comprises a dense connection block and a residual connection block; the dense connection block transmits the information extracted by each convolution layer to every subsequent convolution layer, and after the dense connections corresponding to the three convolution layers, the shallow and deep features are fused through a concatenation function and the number of feature-map channels is then restored with a 1×1 convolution layer; the residual connection block uses a skip connection to add, pixel by pixel, the features input to the residual dense block and the output of the dense connection block after convolution and activation.
Further, in step 23, the convolution kernel sizes of the residual dense blocks of the two parallel branches are 3×3 and 5×5, respectively.
Further, in step 23, the processing performed by the channel attention module on the feature map obtained from the residual dense block comprises the following sub-steps:
step 231, the feature graphs output by the residual dense blocks positioned in front of the channel attention module enter two branches of the channel attention module respectively;
Step 232, the first branch uses global average pooling to convert the feature map information obtained from the residual dense block into a channel descriptor, converting the C×H×W feature image into a C×1×1 feature map;
Step 233, the second branch uses global maximum pooling, taking the maximum value of all features in each channel of the feature map obtained from the residual dense block as the representative feature of that channel, to obtain the feature map;
Step 234, the channel number is amplified and reduced by the same factor through two convolution layers to obtain the feature weights corresponding to the different channels, wherein an activation function layer follows the first convolution layer to prevent the extracted feature data from diverging during transmission; the channel features of different sizes are then fused to obtain the channel feature information, which is then activated through a Sigmoid activation function;
Step 235, multiplying the feature map obtained by the previous residual dense block by the weights of all channels in the feature map obtained in step 234 by using an Element-wise product module to obtain a weighted feature map.
Further, in step 2, the discriminator comprises 6 convolution layers; an instance normalization layer and a ReLU activation function are provided after each of the first 5 convolution layers to prevent the gradient from vanishing, and a Sigmoid activation function is provided after the last convolution layer.
Further, in step 2, the loss function of the discriminator is:
L_D = E_{x̃∼Pg}[D(x̃)] − E_{x∼Pr}[D(x)] + λ·E_{x̂∼Px̂}[(‖∇x̂ D(x̂)‖₂ − 1)²]
wherein the first two terms measure the Wasserstein distance between the distribution of samples produced by the generator and the distribution of real pictures, x̃ = G(x) ∼ Pg indicates that the picture is taken from the set of generated pictures output by the generator network, x ∼ Pr indicates that the picture is taken from the set of normal-illumination images in the training set, x̂ ∼ Px̂ indicates that the picture is sampled from the region between the generated sample and the real sample, G(x) represents the image generated by the generator from the original input image x, D(x) represents the discriminator's evaluation of the input image x, E denotes the expected value, λ is the constant coefficient of the gradient penalty term, ∇x̂ D(x̂) is the gradient of the discriminator, and the last term is the WGAN-GP gradient penalty, whose aim is to keep the gradient of the discriminator from exceeding 1, which resolves the problem of gradient instability and at the same time accelerates convergence.
Further, in the step 2, the total loss of the generator is:
L_G = L_condition + λ·L_content
wherein λ is a correction coefficient taken as 100, L_condition is the conditional loss, and L_content is the content loss;
the loss function of the conditional loss is:
wherein B represents the input low-illumination picture, the term inside the expectation expresses the conditional probability that the generated output image corresponds to the originally input low-illumination image, and the expectation is the average of the discriminator's judgements of whether the pictures produced by the generator are real;
The content loss employs perceptual loss.
Compared with the prior art, the invention has the following technical characteristics:
(1) To address the problems that existing low-illumination images are introduced from HDR data sets of fixed scenes, are small in number, hardly cover different scenes and have a single type of light source, low-illumination images are synthesized with different brightness-adjustment functions and parameters, mainly using three methods: gamma correction, a camera response model and manual adjustment in Photoshop. Night scenes are simulated by covering more numerous and more complex brightness transformation curves.
(2) The improved generative adversarial network model of the invention introduces a channel attention mechanism into the generator, which alleviates local abrupt changes in the enhanced picture. Because each region of the same picture differs in brightness and in the information it contains, different channel features carry completely different weighting information; by treating the channel features unequally, the network focuses more on the information of interest and its flexibility is enhanced.
(3) The improved generative adversarial network model of the invention introduces residual dense blocks into the generator and uses the multi-layer network to extract image features in multiple dimensions; each layer is allowed to transmit the information it retains to the subsequent layers, and the shallow features are fused with the deep features, so more detail information can be extracted during image enhancement and the problem of local distortion after enhancement is resolved.
(4) The improved generative adversarial network model of the invention adopts a network based on a Markovian discriminator, which comprises 6 convolution layers; an instance normalization layer (IN) and a ReLU activation function are arranged after each of the first 5 convolution layers to prevent the gradient from vanishing, and a Sigmoid activation function is arranged after the last convolution layer to map the pixel values of the output image into the range (0, 1). This helps the network judge, within a local region, whether the generated image and the target image are real or fake, and scoring n small regions of the image resolves the problem of denoising failure in local regions.
(5) The improved generative adversarial network model of the invention optimizes the loss functions: the WGAN-GP gradient penalty term is added to the loss function of the discriminator to prevent GAN training from collapsing, and the loss function of the generator uses perceptual loss as the content loss, which avoids the problem that L1 and L2 losses, because they average over the pixel space, produce blurred images when used as the sole optimization objective.
Drawings
FIG. 1 is a diagram of the generative adversarial network architecture in the present invention;
FIG. 2 is a block diagram of a channel attention module according to the present invention;
FIG. 3 is a block diagram of a residual dense connection block in accordance with the present invention;
FIG. 4 is a diagram of a network of generators in the present invention;
FIG. 5 is a diagram of the network architecture of the discriminator in the present invention;
FIG. 6 is a graph of gamma function effects in an embodiment of the invention;
FIG. 7 is a diagram of a camera response model in an embodiment of the invention;
FIG. 8 is a Photoshop tuning map in an embodiment of the invention;
fig. 9 is low-illumination image enhancement result comparison experiment 1 in an embodiment of the present invention, in which:
(a) Is a low-light image;
(b) Is an enhancement effect diagram of the SRIE algorithm;
(c) Is an enhancement effect diagram of the DC-GAN algorithm;
(d) Is an enhancement effect diagram of a Cycle-GAN algorithm;
(e) Is an enhancement effect diagram of the Lime algorithm;
(f) Is an enhancement effect diagram of the method of the present invention;
Fig. 10 is low-illumination image enhancement result comparison experiment 2 in an embodiment of the present invention, wherein:
(a) Is a low-light image;
(b) Is an enhancement effect diagram of the SRIE algorithm;
(c) Is an enhancement effect diagram of the DC-GAN algorithm;
(d) Is an enhancement effect diagram of a Cycle-GAN algorithm;
(e) Is an enhancement effect diagram of the Lime algorithm;
(f) Is an enhancement effect diagram of the method of the present invention;
FIG. 11 shows enhancement results for illumination images of different levels in an example of the present invention, wherein:
(a) Is the enhancement result for the low-illumination image generated by the gamma function method;
(b) Is the enhancement result for the low-illumination image generated by the camera response function;
(c) Is the enhancement result for the low-illumination image generated by manual adjustment in Photoshop;
Fig. 12 shows enhancement results for illumination images of different levels in an example of the present invention, wherein:
(a) Is the enhancement result for the low-illumination image generated by the gamma function method;
(b) Is the enhancement result for the low-illumination image generated by the camera response function;
(c) Is the enhancement result for the low-illumination image generated by manual adjustment in Photoshop;
The invention is explained in further detail below with reference to the drawings and examples.
Detailed Description
First, technical words appearing in the present invention are explained:
A generative adversarial network is a notable deep learning model proposed by Ian Goodfellow et al. in 2014; the model consists of a generative network and a discriminative network, as shown in FIG. 1. The generative adversarial network uses the idea of a zero-sum game: through adversarial learning and mutual competition between the generator network and the discriminator network, a Nash equilibrium is reached and the network achieves its optimal effect.
Gamma correction, also known as gamma nonlinearization or gamma encoding, is a nonlinear operation, or its inverse, on the luminance or tristimulus values of light in film or imaging systems. The purpose of gamma correcting an image is to compensate for the characteristics of human vision. If an image is not gamma corrected, data bits or bandwidth may be allocated unevenly, giving the image an abnormal visual appearance. Gamma correction eliminates this influence and corrects the image by adjusting the RGB values in advance.
Camera response model, also known as the camera response function. When a camera captures an image, the radiated brightness L of an object in the scene passes through the camera lens and produces an irradiance E on the surface of the image sensor; the relation between the scene brightness L and E is:
E = L · (π/4) · (d/h)² · cos⁴(φ)
where h is the focal length of the camera lens, φ is the angle between the incident light and the vertical plane of the image sensor, and d is the aperture size of the lens. Because the camera must be kept still when shooting multiple frames of images, h, φ and d are all constant during shooting, so the mapping from L to E is a linear mapping. After the shutter is pressed, the total exposure reaching the image sensor within the exposure time Δt is converted by the sensor elements into an analog signal, which then undergoes analog-to-digital conversion, quantization and rounding to obtain a pixel value Z. The mapping from irradiance E to pixel value Z is a nonlinear process, and this nonlinear mapping is known as the camera response function.
The image enhancement method based on an improved multi-scale fusion generative adversarial network provided by this embodiment comprises the following steps:
Step 1, establishing low-illumination image data sets with different brightness.
In this step, to address problems such as the small data volume and single brightness level of existing data sets, data are generated manually in three different ways to construct a low-illumination image data set that covers more numerous and more complex brightness transformation curves. The step specifically comprises the following sub-steps:
Step 11, processing the original image I with gamma correction to obtain a first low-illumination image I₁, wherein the calculation formula is as follows:
Where I represents the original image (ground truth), I₁ represents the first low-illumination image, and γ₁ represents the gamma value, taken as 0.2.
Step 12, processing the original image I with a camera response function to reduce the image brightness, obtaining a second low-illumination image I₂, the calculation formula of which is as follows:
Where I₂ represents the second low-illumination image, k represents the virtual exposure rate, k = -5.33, and a and b, the two parameters of the camera response function, can be calculated from pairs of natural images with different exposures; a = -0.32 and b = 1.12 are used in the experiments.
In this step, the camera response function uses an exposure rate different from that of the real image to simulate the image as it would appear at a low exposure level. The camera response function is more complex than the gamma function and, with different parameters, can cover different categories of low-illumination images, thereby increasing the coverage of the data set.
Step 13, the brightness of the original image I is manually adjusted using image processing software (preferably Photoshop) to obtain a third low-illumination image I₃. Changing the image brightness with the gamma correction function and the camera response model may still produce results that differ from real images, so to increase the authenticity and reliability of the data set this step manually adjusts each original image, yielding 100 low-illumination images I₃.
In this way, three low-illumination image data sets with different brightness are created, as shown in FIG. 6, FIG. 7 and FIG. 8. In this embodiment, 100 original images are used, and 100 corresponding low-illumination images are generated for each of the three brightness levels, giving 300 image pairs in total.
Step 14, dividing the data set into a test set and a training set in a ratio of 2:8.
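What follows is a minimal Python sketch of how this synthesis and split step could look. It is not the patent's implementation: the exact darkening formulas are not reproduced in this text, so the gamma form I_low = I^(1/γ), the beta-gamma camera response model and its parameters, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def gamma_darken(img, gamma=0.2):
    # Assumed form of step 11: I_low = I ** (1 / gamma) on a [0, 1] image,
    # so gamma = 0.2 produces a strongly darkened picture.
    return np.power(np.clip(img, 0.0, 1.0), 1.0 / gamma)

def crf_darken(img, k=0.2, a=-0.3293, b=1.1258):
    # Assumed beta-gamma camera response model g(I, k) = exp(b * (1 - k**a)) * I**(k**a),
    # driven here with a positive virtual exposure ratio k < 1 so that brightness drops.
    # The patent's own parameters (k = -5.33, a = -0.32, b = 1.12) belong to a formula
    # not reproduced in this text, so the values used here are illustrative only.
    gamma_k = np.power(k, a)
    beta_k = np.exp(b * (1.0 - gamma_k))
    return np.clip(beta_k * np.power(np.clip(img, 0.0, 1.0), gamma_k), 0.0, 1.0)

def split_dataset(pairs, train_ratio=0.8, seed=0):
    # Step 14: split the (low-light, ground-truth) image pairs into training and test sets (8:2).
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    cut = int(train_ratio * len(pairs))
    return [pairs[i] for i in idx[:cut]], [pairs[i] for i in idx[cut:]]
```

Under these assumptions, gamma_darken and crf_darken each map a normalized RGB array to a darker copy, and split_dataset reproduces the 8:2 division of step 14.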
Step 2, establishing an improved generative adversarial network image enhancement model, substituting the training set obtained in step 1 into the model for training to obtain a trained model, and testing it on the test set. The improved generative adversarial network image enhancement model comprises a generator and a discriminator (see FIG. 1): the generator is a multi-scale residual dense connection network fused with an attention mechanism, the discriminator is based on PatchGAN, and the loss function is an optimized loss function.
Specifically, as shown in FIG. 4, the processing performed by the generator on each image of the low-illumination image data set input to it comprises the following sub-steps:
step 21, extracting shallow feature information from a low-illumination image of the training set using a 7×7 convolution layer;
step 22, passing the shallow feature information through two down-sampling layers in sequence to obtain image information at scales different from the original image;
Step 23, sending the image information obtained in step 22 to a depth feature extraction module, which comprises two parallel branches that obtain deep information of different receptive fields through convolution kernels of different sizes.
The two parallel branches have the same composition: each comprises a convolution layer, an activation function, an alternating module layer and a Concat layer connected in sequence, and the output of the activation function and the output of the alternating module are jointly connected to the Concat layer. The alternating module comprises three residual dense blocks and three channel attention modules connected alternately. In the two parallel branches, the convolution kernel sizes of the residual dense blocks are 3×3 and 5×5 respectively; the residual dense blocks and the channel attention modules learn complex detail features, and the Concat layer that joins the output of the activation function with the output of the alternating module copies shallow information to the deep layers to prevent the loss of low-dimensional information during convolution.
Specifically, as shown in FIG. 3, the residual dense block comprises a dense connection block and a residual connection block (the upper connection lines in the figure represent dense connections and the lower connection line represents the residual connection). The dense connection block transmits the information extracted by each convolution layer to every subsequent convolution layer; after the dense connections corresponding to the three convolution layers, the shallow and deep features are fused through a concatenation function and the number of feature-map channels is then restored with a 1×1 convolution layer. The residual connection block uses a skip connection to add, pixel by pixel, the features input to the residual dense block and the output of the dense connection block after convolution and activation, which increases accuracy through greater network depth while reducing training complexity. The feature map output by the residual dense block has size C×H×W, where C is the number of channels and H and W are the height and width of the feature map.
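A minimal PyTorch sketch of such a residual dense block is given below. The three densely connected convolution layers, the 1×1 fusion convolution and the residual skip follow the description above, while the growth rate, channel widths and choice of ReLU are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Sketch of the residual dense block of FIG. 3: three densely connected conv
    layers, a 1x1 conv that restores the channel count, and a residual skip added
    pixel by pixel. Growth rate and activation are assumptions."""
    def __init__(self, channels, growth=32, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv2d(channels, growth, kernel_size, padding=pad)
        self.conv2 = nn.Conv2d(channels + growth, growth, kernel_size, padding=pad)
        self.conv3 = nn.Conv2d(channels + 2 * growth, growth, kernel_size, padding=pad)
        self.fuse = nn.Conv2d(channels + 3 * growth, channels, kernel_size=1)  # 1x1 conv restores channels
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        f1 = self.act(self.conv1(x))
        f2 = self.act(self.conv2(torch.cat([x, f1], dim=1)))        # dense connection
        f3 = self.act(self.conv3(torch.cat([x, f1, f2], dim=1)))    # dense connection
        fused = self.fuse(torch.cat([x, f1, f2, f3], dim=1))        # fuse shallow and deep features
        return x + fused                                            # residual (skip) connection
```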
Specifically, as shown in fig. 2, the processing of the feature map obtained by the channel attention module on the residual dense block includes the following sub-steps:
step 231, the feature graphs output by the residual dense blocks positioned in front of the channel attention module enter two branches of the channel attention module respectively;
step 232, the first branch uses global average pooling to convert the feature map information obtained from the residual dense block into a channel descriptor (representing the background features of the picture), converting the C×H×W feature image into a C×1×1 feature map;
Step 233, the second branch uses global maximum pooling, taking the maximum value of all features in each channel of the feature map obtained from the residual dense block as the representative feature of that channel (representing the texture detail information of the picture), to obtain the feature map;
Step 234, the feature maps obtained in step 232 and step 233 each have their channel number amplified and reduced by the same factor through two convolution layers to obtain the feature weights corresponding to the different channels, wherein an activation function layer follows the first convolution layer to prevent the extracted feature data from diverging during transmission; the channel features of different sizes are then fused (that is, added pixel by pixel) into the channel feature information, giving new feature-map weights, which are then activated through a Sigmoid activation function to prevent the feature data from diverging during transmission.
The aim is that the details of the enhanced image remain complete while the whole image stays harmonious and consistent.
Step 235, multiplying the feature map obtained by the previous residual dense block by the weights of all channels in the feature map obtained in step 234 by using an Element-wise product module to obtain a weighted feature map.
This step can enhance the feature learning capabilities of the network.
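A PyTorch sketch of this channel attention module, in the spirit of FIG. 2, might look as follows. The reduction ratio, the shrink-then-expand ordering of the two convolutions and the sharing of weights between the two pooling branches are assumptions rather than details taken from the patent.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Sketch of the channel attention module of FIG. 2: a global-average-pooling
    branch and a global-max-pooling branch, each passed through two 1x1 conv layers,
    fused by addition, gated by a Sigmoid and multiplied element-wise onto the input."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)   # C x H x W -> C x 1 x 1 (background descriptor)
        self.max_pool = nn.AdaptiveMaxPool2d(1)   # per-channel maximum (texture/detail descriptor)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),                # activation after the first conv layer
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        w = self.mlp(self.avg_pool(x)) + self.mlp(self.max_pool(x))  # fuse the two branches
        return x * self.sigmoid(w)                                   # element-wise product with channel weights
```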
Step 24, fusing the deep information of the different receptive fields obtained by the two branches by pixel-by-pixel addition to obtain a fused feature map;
Step 25, sequentially performing convolution, activation, two up-sampling operations and a final convolution on the fused feature map to obtain an output feature map, wherein the first convolution further extracts feature information from the fused features, the two up-sampling operations restore the image size, and the final convolution is used for image restoration;
Step 26, connecting the output feature map obtained in step 25 with the corresponding original image of the training set through a Concat layer to obtain the final output result.
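Putting steps 21-26 together, a rough PyTorch skeleton of the generator of FIG. 4 could look like the sketch below. It reuses the ResidualDenseBlock and ChannelAttention sketches above; the channel widths, down-sampling strides, output activation, and the omission of the intra-branch Concat skip are all simplifying assumptions.

```python
import torch
import torch.nn as nn
# ResidualDenseBlock and ChannelAttention are the modules sketched earlier.

class Generator(nn.Module):
    """Rough skeleton of the generator of FIG. 4 (steps 21-26); widths and strides are assumptions."""
    def __init__(self, base=64):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, base, 7, padding=3), nn.ReLU(inplace=True))   # step 21
        self.down = nn.Sequential(                                                            # step 22
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        def branch(k):  # step 23: conv + activation, then 3 x (residual dense block -> channel attention)
            blocks = []
            for _ in range(3):
                blocks += [ResidualDenseBlock(base * 4, kernel_size=k), ChannelAttention(base * 4)]
            return nn.Sequential(nn.Conv2d(base * 4, base * 4, k, padding=k // 2),
                                 nn.ReLU(inplace=True), *blocks)
        self.branch3, self.branch5 = branch(3), branch(5)
        self.tail = nn.Sequential(                                                            # step 25
            nn.Conv2d(base * 4, base * 4, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2), nn.Conv2d(base * 4, base * 2, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2), nn.Conv2d(base * 2, 3, 3, padding=1),
        )
        self.out = nn.Conv2d(6, 3, 3, padding=1)  # step 26: fuse the Concat of output map and input image

    def forward(self, x):
        feat = self.down(self.head(x))
        fused = self.branch3(feat) + self.branch5(feat)     # step 24: pixel-by-pixel fusion of the branches
        restored = self.tail(fused)
        return torch.tanh(self.out(torch.cat([restored, x], dim=1)))  # output activation is an assumption
```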
Specifically, as shown in FIG. 5, the discriminator in step 2 adopts a network based on a PatchGAN (Markovian) discriminator comprising 6 convolution layers. An instance normalization layer and a ReLU activation function follow each of the first 5 convolution layers to prevent the gradient from vanishing, and a Sigmoid activation function follows the last convolution layer to map the pixel values of the output image into the range (0, 1). This helps the network judge, within a local region, whether the generated image and the target image are real or fake, and scoring n small regions of the image resolves the problem of denoising failure in local regions.
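A compact PyTorch sketch of such a six-layer PatchGAN-style discriminator is shown below. The instance normalization, ReLU and final Sigmoid follow the description above, while the kernel sizes, strides and channel widths are assumptions.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Sketch of the discriminator of FIG. 5: 6 convolution layers, InstanceNorm + ReLU
    after each of the first 5, Sigmoid after the last; outputs an n x n patch score map."""
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        layers, ch = [], in_ch
        for width in [base, base * 2, base * 4, base * 8, base * 8]:   # first 5 conv layers
            layers += [nn.Conv2d(ch, width, 4, stride=2, padding=1),
                       nn.InstanceNorm2d(width),
                       nn.ReLU(inplace=True)]
            ch = width
        layers += [nn.Conv2d(ch, 1, 4, padding=1), nn.Sigmoid()]       # last conv layer + Sigmoid
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Each entry of the returned map scores one image patch as real (close to 1) or fake.
        return self.net(x)
```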
Specifically, in step 2, the loss function of the discriminator is:
L_D = E_{x̃∼Pg}[D(x̃)] − E_{x∼Pr}[D(x)] + λ·E_{x̂∼Px̂}[(‖∇x̂ D(x̂)‖₂ − 1)²]
wherein the first two terms measure the Wasserstein distance between the distribution of samples produced by the generator and the distribution of real pictures, x̃ = G(x) ∼ Pg indicates that the picture is taken from the set of generated pictures output by the generator network, x ∼ Pr indicates that the picture is taken from the set of normal-illumination images in the training set, x̂ ∼ Px̂ indicates that the picture is sampled from the region between the generated sample and the real sample, G(x) represents the image generated by the generator from the original input image x, D(x) represents the discriminator's evaluation of the input image x, E denotes the expected value, λ is the constant coefficient of the gradient penalty term, ∇x̂ D(x̂) is the gradient of the discriminator, and the last term is the WGAN-GP gradient penalty, whose aim is to keep the gradient of the discriminator from exceeding 1, which resolves the problem of gradient instability and at the same time accelerates convergence.
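In code, the gradient penalty term of this loss could be computed as in the following PyTorch sketch; the penalty coefficient and the batch handling are assumptions.

```python
import torch

def gradient_penalty(discriminator, real, fake, lam=10.0):
    # WGAN-GP term of the formula above: sample points between real and generated images
    # and penalise discriminator gradient norms that deviate from 1.
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (eps * real + (1.0 - eps) * fake).requires_grad_(True)   # region between real and fake samples
    d_hat = discriminator(x_hat)
    grads = torch.autograd.grad(outputs=d_hat, inputs=x_hat,
                                grad_outputs=torch.ones_like(d_hat),
                                create_graph=True, retain_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lam * ((grad_norm - 1.0) ** 2).mean()

def discriminator_loss(discriminator, real, fake):
    # Wasserstein term plus gradient penalty, mirroring L_D above.
    return discriminator(fake).mean() - discriminator(real).mean() + \
           gradient_penalty(discriminator, real, fake)
```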
The total loss of the generator is the sum of the conditional loss and the optimized content loss, expressed as follows:
L_G = L_condition + λ·L_content
λ is a correction coefficient, fixed at 100 in this embodiment. L_condition is the conditional loss, which emphasizes keeping the generator output dependent on the input low-illumination image, and L_content is the content loss, which emphasizes the authenticity of the generated picture.
The objective of the conditional loss is to maximize the probability that the discriminator judges the generated image to be a real image; its loss function is:
wherein B represents the input low-illumination picture, the term inside the expectation expresses the conditional probability that the generated output image corresponds to the originally input low-illumination image, and the expectation is the average of the discriminator's judgements of whether the pictures produced by the generator are real;
Because the classical content losses, L1 loss and L2 loss, take averages over the pixel space, the generated image becomes blurred when they are optimized as the sole objective. The invention therefore uses perceptual loss as the content loss. The perceptual loss is the difference between the feature maps obtained after the real clear image and the image produced by the generator are passed through the activated conv3-3 layer of a VGG-19 network, which keeps the edge information of the enhanced image sharper. The perceptual loss is calculated as:
L_content = (1 / (W_{i,j}·H_{i,j})) · Σ_{x=1..W_{i,j}} Σ_{y=1..H_{i,j}} (φ_{i,j}(S)_{x,y} − φ_{i,j}(G(B))_{x,y})²
where φ_{i,j} denotes the feature map obtained after an image passes through the i-th convolution layer of the VGG-19 network and before the j-th max-pooling layer, W_{i,j} and H_{i,j} are the width and height of that feature map, S is the clear image, and G(B) is the image produced by the generator.
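A PyTorch sketch of this perceptual loss, using torchvision's pretrained VGG-19 as a fixed feature extractor, might look as follows. The slicing index used to approximate the activated conv3-3 output, the plain mean-squared error, and the assumption that inputs are already normalized as VGG expects are all illustrative choices.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

class PerceptualLoss(nn.Module):
    """Sketch of the content (perceptual) loss above: mean squared difference between
    VGG-19 feature maps of the clear image S and the generated image G(B)."""
    def __init__(self, layer_index=16):
        super().__init__()
        features = vgg19(weights="IMAGENET1K_V1").features[:layer_index]
        for p in features.parameters():
            p.requires_grad_(False)        # VGG-19 is used only as a fixed feature extractor
        self.features = features.eval()
        self.mse = nn.MSELoss()

    def forward(self, generated, clear):
        # Inputs are assumed to be ImageNet-normalized RGB tensors of identical size.
        return self.mse(self.features(generated), self.features(clear))
```

Under these assumptions, the total generator loss of the formula above would be computed as L_condition plus 100 times PerceptualLoss()(G(B), S).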
And step 3, inputting the real low-illumination image to be processed into the trained model obtained in step 2 to obtain the enhanced image.
In order to verify the feasibility and effectiveness of the method of the invention, the procedure and results of the test performed in step 3 using the test set are given:
The same training set is used to train two deep learning models, DC-GAN and Cycle-GAN, to obtain trained network models; the same test set is then used to run a comparison experiment against the SRIE and LIME algorithms and the trained DC-GAN and Cycle-GAN networks, showing that the proposed method clearly improves the image enhancement effect.
Specifically, the Adam optimizer is adopted for training; the generator is updated once after every 3 gradient-descent steps of the discriminator, and training runs for 300 epochs in total. The initial learning rate of both the generator and the discriminator is set to 1×10⁻⁴. The comparison results are shown in FIG. 9 and FIG. 10; from left to right they are the low-illumination image and the results of SRIE, the DC-GAN network, the Cycle-GAN network, the LIME algorithm, and finally the method of the invention.
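A skeleton of this training schedule in PyTorch is sketched below. The generator, discriminator, data loader and the two loss helpers are assumed to be defined elsewhere (for example along the lines of the earlier sketches), and the update pattern simply mirrors the "3 discriminator steps per generator step" rule stated above.

```python
import torch

def train(generator, discriminator, train_loader, generator_loss, discriminator_loss,
          epochs=300, d_steps=3, lr=1e-4):
    # Adam for both networks with initial learning rate 1e-4; the generator is updated
    # once for every `d_steps` discriminator updates, for 300 epochs in total.
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)
    for epoch in range(epochs):
        for step, (low_light, ground_truth) in enumerate(train_loader):
            fake = generator(low_light)

            # discriminator update (Wasserstein term + gradient penalty)
            opt_d.zero_grad()
            d_loss = discriminator_loss(discriminator, ground_truth, fake.detach())
            d_loss.backward()
            opt_d.step()

            # generator update, once per `d_steps` discriminator updates
            if (step + 1) % d_steps == 0:
                opt_g.zero_grad()
                g_loss = generator_loss(discriminator, fake, low_light, ground_truth)
                g_loss.backward()
                opt_g.step()
```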
Enhancement results for different illumination levels are shown in FIG. 11 and FIG. 12; from left to right they are the enhancement results for the three kinds of low-illumination pictures generated by the gamma function, the camera response model and Photoshop.
As can be seen from FIG. 9 and FIG. 10, compared with images enhanced by other classical algorithms, the method of the invention preserves the original colors of the image well, restores the detail information of the image to the greatest extent, improves the brightness and contrast, and keeps the colors soft, which shows that the disclosed model effectively improves the visual effect of night-time image enhancement. FIG. 11 and FIG. 12 reflect the enhancement effect on low-illumination images of three different illumination levels, simulating the differing brightness of images captured under the complex light sources of night scenes.
In addition, to demonstrate the superiority of the method over existing approaches, the same data set was used to test the SRIE and LIME algorithms, the trained DC-GAN and Cycle-GAN networks, and the model of the invention; the average performance indicators of these algorithms are shown in the table below.
Table 1 Enhancement results of the different algorithms
The PSNR of the improved multi-scale fusion generative adversarial network image enhancement algorithm reaches 23.43 and its SSIM reaches 0.82; compared with traditional image enhancement and enhancement with an ordinary GAN network, PSNR is improved by 17.12% and SSIM by 17.56%, which lays a good foundation for the subsequent use of the pictures.
Claims (8)
1. An image enhancement method based on an improved multi-scale fusion generative adversarial network, characterized by comprising the following steps:
step 1, establishing image data sets with different brightness;
step 2, establishing an improved generative adversarial network image enhancement model, substituting the training set obtained in step 1 into the improved generative adversarial network image enhancement model for training to obtain a trained model, and testing it on the test set, wherein the improved generative adversarial network image enhancement model comprises a generator and a discriminator;
The processing performed by the generator on each image of the low-illumination image data set input to it comprises the following sub-steps:
step 21, extracting shallow feature information from a low-illumination image of the training set using a 7×7 convolution layer;
step 22, passing the shallow feature information through two down-sampling layers in sequence to obtain image information at scales different from the original image;
Step 23, sending the image information obtained in step 22 into a depth feature extraction module, wherein the module comprises two parallel branches that obtain deep information of different receptive fields through convolution kernels of different sizes; the two parallel branches have the same composition and each comprises a convolution layer, an activation function, an alternating module layer and a Concat layer connected in sequence, the output of the activation function and the output of the alternating module being jointly connected to the Concat layer, wherein the alternating module comprises three residual dense blocks and three channel attention modules connected alternately;
Step 24, fusing the deep information of the different receptive fields obtained by the two branches to obtain a fused feature map;
Step 25, sequentially performing convolution, activation, two up-sampling operations and a final convolution on the fused feature map to obtain an output feature map, wherein the first convolution further extracts feature information from the fused features, the two up-sampling operations restore the image size, and the final convolution is used for image restoration;
Step 26, connecting the output feature map obtained in step 25 and the corresponding original image of the training set through a Concat layer to obtain the final output result;
and step 3, inputting the real low-illumination image to be processed into the trained model obtained in the step 2, and obtaining the enhanced image.
2. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein step 1 comprises the following sub-steps:
Step 11, processing the original image I by using gamma correction to obtain a first low-illumination image I 1, wherein the calculation formula is as follows:
wherein I represents the original image (ground truth), I₁ represents the first low-illumination image, and γ₁ represents the gamma value, taken as 0.2;
Step 12, processing the original image I with a camera response function to reduce the image brightness, obtaining a second low-illumination image I₂, the calculation formula of which is as follows:
wherein I₂ represents the second low-illumination image, k represents the virtual exposure rate, k = -5.33, and a and b are the two parameters of the camera response function;
Step 13, manually adjusting the brightness of the original image I with image processing software to obtain a third low-illumination image I₃;
Step 14, dividing the data set into a test set and a training set.
3. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein in step 23 the residual dense block comprises a dense connection block and a residual connection block; the dense connection block transmits the information extracted by each convolution layer to every subsequent convolution layer, and after the dense connections corresponding to the three convolution layers, the shallow and deep features are fused through a concatenation function and the number of feature-map channels is then restored with a 1×1 convolution layer; the residual connection block uses a skip connection to add, pixel by pixel, the features input to the residual dense block and the output of the dense connection block after convolution and activation.
4. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein in step 23 the convolution kernel sizes of the residual dense blocks of the two parallel branches are 3×3 and 5×5, respectively.
5. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein in step 23 the processing performed by the channel attention module on the feature map obtained from the residual dense block comprises the following sub-steps:
step 231, the feature graphs output by the residual dense blocks positioned in front of the channel attention module enter two branches of the channel attention module respectively;
Step 232, the first branch uses global average pooling to convert the feature map information obtained from the residual dense block into a channel descriptor, converting the C×H×W feature image into a C×1×1 feature map;
Step 233, the second branch uses global maximum pooling, taking the maximum value of all features in each channel of the feature map obtained from the residual dense block as the representative feature of that channel, to obtain the feature map;
Step 234, the channel number is amplified and reduced by the same factor through two convolution layers to obtain the feature weights corresponding to the different channels, wherein an activation function layer follows the first convolution layer to prevent the extracted feature data from diverging during transmission; the channel features of different sizes are then fused to obtain the channel feature information, which is then activated through a Sigmoid activation function;
Step 235, multiplying the feature map obtained by the previous residual dense block by the weights of all channels in the feature map obtained in step 234 by using an Element-wise product module to obtain a weighted feature map.
6. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein in step 2 the discriminator comprises 6 convolution layers; an instance normalization layer and a ReLU activation function are provided after each of the first 5 convolution layers to prevent the gradient from vanishing, and a Sigmoid activation function is provided after the last convolution layer.
7. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein in step 2 the loss function of the discriminator is:
L_D = E_{x̃∼Pg}[D(x̃)] − E_{x∼Pr}[D(x)] + λ·E_{x̂∼Px̂}[(‖∇x̂ D(x̂)‖₂ − 1)²]
wherein the first two terms measure the Wasserstein distance between the distribution of samples produced by the generator and the distribution of real pictures, x̃ = G(x) ∼ Pg indicates that the picture is taken from the set of generated pictures output by the generator network, x ∼ Pr indicates that the picture is taken from the set of normal-illumination images in the training set, x̂ ∼ Px̂ indicates that the picture is sampled from the region between the generated sample and the real sample, G(x) represents the image generated by the generator from the original input image x, D(x) represents the discriminator's evaluation of the input image x, E denotes the expected value, λ is the constant coefficient of the gradient penalty term, ∇x̂ D(x̂) is the gradient of the discriminator, and the last term is the WGAN-GP gradient penalty, whose aim is to keep the gradient of the discriminator from exceeding 1, which resolves the problem of gradient instability and at the same time accelerates convergence.
8. The image enhancement method based on an improved multi-scale fusion generative adversarial network of claim 1, wherein in step 2 the total loss of the generator is:
L_G = L_condition + λ·L_content
wherein λ is a correction coefficient taken as 100, L_condition is the conditional loss, and L_content is the content loss;
the loss function of the conditional loss is:
wherein B represents the input low-illumination picture, the term inside the expectation expresses the conditional probability that the generated output image corresponds to the originally input low-illumination image, the expectation is the average of the discriminator's judgements of whether the pictures produced by the generator are real, and the content loss adopts the perceptual loss.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210692241.1A CN115223004B (en) | 2022-06-17 | 2022-06-17 | Image enhancement method for a generative adversarial network based on improved multi-scale fusion |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210692241.1A CN115223004B (en) | 2022-06-17 | 2022-06-17 | Image enhancement method for a generative adversarial network based on improved multi-scale fusion |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN115223004A CN115223004A (en) | 2022-10-21 |
| CN115223004B true CN115223004B (en) | 2025-06-10 |
Family
ID=83608823
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210692241.1A Active CN115223004B (en) | 2022-06-17 | 2022-06-17 | Image enhancement method for a generative adversarial network based on improved multi-scale fusion |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115223004B (en) |
Families Citing this family (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115713469A (en) * | 2022-11-08 | 2023-02-24 | 大连海事大学 | Underwater image enhancement method for generating countermeasure network based on channel attention and deformation |
| CN115661001B (en) * | 2022-12-14 | 2023-04-07 | 临沂大学 | Single-channel coal rock image enhancement method based on generation of countermeasure network |
| CN115797750B (en) * | 2023-02-02 | 2023-04-25 | 天津滨海迅腾科技集团有限公司 | Large-size image rapid transmission method based on deep learning algorithm |
| CN116342409B (en) * | 2023-03-08 | 2025-10-03 | 清华大学深圳国际研究生院 | A low-light image enhancement method based on transformer |
| CN116029947B (en) * | 2023-03-30 | 2023-06-23 | 之江实验室 | A complex optical image enhancement method, device and medium for harsh environments |
| CN116433517A (en) * | 2023-03-31 | 2023-07-14 | 中国人民解放军国防科技大学 | Low-illumination image enhancement method and system based on noise autoregressive model |
| CN116433770B (en) * | 2023-04-27 | 2024-01-30 | 东莞理工学院 | Positioning method, positioning device and storage medium |
| CN116797485B (en) * | 2023-06-30 | 2024-07-16 | 中国人民解放军军事科学院系统工程研究院 | Low-illumination image enhancement method and device based on data synthesis |
| CN117408893B (en) * | 2023-12-15 | 2024-04-05 | 青岛科技大学 | An underwater image enhancement method based on shallow neural network |
| CN117893413B (en) * | 2024-03-15 | 2024-06-11 | 博创联动科技股份有限公司 | Vehicle-mounted terminal man-machine interaction method based on image enhancement |
| CN117911798B (en) * | 2024-03-19 | 2024-05-28 | 青岛奥克生物开发有限公司 | Stem cell quality classification method and system based on image enhancement |
| CN118469842B (en) * | 2024-05-07 | 2025-01-28 | 广东工业大学 | A remote sensing image dehazing method based on generative adversarial network |
| CN119067900B (en) * | 2024-08-12 | 2025-05-13 | 哈尔滨师范大学 | Image enhancement network based on multi-scale feature fusion |
| CN119296735B (en) * | 2024-09-27 | 2025-09-09 | 大连医科大学 | Multi-mode tumor image feature extraction and fusion analysis system and method thereof |
| CN119444610B (en) * | 2024-11-04 | 2025-09-23 | 西安电子科技大学 | Feature fusion network structure, method and related reconstructed spectrum image system |
| CN119559072A (en) * | 2025-01-27 | 2025-03-04 | 华东交通大学 | Polarization image fusion method and system based on multi-scale feature fusion |
| CN120339136A (en) * | 2025-06-16 | 2025-07-18 | 天津理工大学 | A speckle image restoration method and system based on generative adversarial network |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220004709A1 (en) * | 2018-11-14 | 2022-01-06 | North Carolina State University | Deep neural network with compositional grammatical architectures |
| CN111739635A (en) * | 2020-06-10 | 2020-10-02 | 四川大学华西医院 | A diagnostic auxiliary model and image processing method for acute ischemic stroke |
| CN111915526B (en) * | 2020-08-05 | 2024-05-31 | 湖北工业大学 | Photographing method of low-illumination image enhancement algorithm based on brightness attention mechanism |
-
2022
- 2022-06-17 CN CN202210692241.1A patent/CN115223004B/en active Active
Non-Patent Citations (1)
| Title |
|---|
| "融合多尺度密集块的低照度交通图像增强模型";王炜昊等;《计算机工程与应用》;20230912;第1-10页 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN115223004A (en) | 2022-10-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN115223004B (en) | Image enhancement method for a generative adversarial network based on improved multi-scale fusion | |
| Wang et al. | Low-light image enhancement based on virtual exposure | |
| Lee et al. | Deep chain hdri: Reconstructing a high dynamic range image from a single low dynamic range image | |
| CN112288658A (en) | An underwater image enhancement method based on multi-residual joint learning | |
| CN115393225B (en) | A low-light image enhancement method based on multi-level feature extraction and fusion | |
| CN107798661B (en) | An Adaptive Image Enhancement Method | |
| Bi et al. | Haze removal for a single remote sensing image using low-rank and sparse prior | |
| CN118302788A (en) | High dynamic range view synthesis from noisy raw images | |
| KR102277005B1 (en) | Low-Light Image Processing Method and Device Using Unsupervised Learning | |
| CN114638764B (en) | Multi-exposure image fusion method and system based on artificial intelligence | |
| Dhara et al. | Exposedness-based noise-suppressing low-light image enhancement | |
| CN108416745A (en) | Image self-adaptive defogging enhancement method with color constancy | |
| Shutova et al. | NTIRE 2023 challenge on night photography rendering | |
| CN115661012B (en) | A multi-exposure image fusion system based on global-local aggregation learning | |
| CN116681627B (en) | Cross-scale fusion self-adaptive underwater image generation countermeasure enhancement method | |
| CN118195980B (en) | A dark detail enhancement method based on grayscale transformation | |
| CN113284061B (en) | Underwater image enhancement method based on gradient network | |
| CN105513015B (en) | A kind of clearness processing method of Misty Image | |
| Parihar et al. | A comprehensive analysis of fusion-based image enhancement techniques | |
| CN115861113B (en) | A semi-supervised dehazing method based on fusion of depth map and feature mask | |
| CN115018717B (en) | Improved Retinex-Net low-illumination and scotopic vision image enhancement method | |
| Tang et al. | A local flatness based variational approach to retinex | |
| Yuan et al. | Image dehazing based on a transmission fusion strategy by automatic image matting | |
| Lv et al. | Two adaptive enhancement algorithms for high gray-scale RAW infrared images based on multi-scale fusion and chromatographic remapping | |
| CN115035011B (en) | A low-light image enhancement method based on adaptive RetinexNet under a fusion strategy |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |