
CN116645569B - A method and system for colorizing infrared images based on generative adversarial networks


Info

Publication number: CN116645569B
Authority: CN (China)
Prior art keywords: image, module, model, network, infrared
Legal status: Active
Application number: CN202211680002.0A
Other languages: Chinese (zh)
Other versions: CN116645569A (en)
Inventors: 陈宇, 詹伟达, 于永吉, 洪洋, 韩登, 李国宁
Current Assignee: Changchun University of Science and Technology
Original Assignee: Changchun University of Science and Technology
Application filed by Changchun University of Science and Technology
Priority to CN202211680002.0A
Publication of CN116645569A
Application granted
Publication of CN116645569B

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention belongs to the field of image processing technology, and in particular provides a method for colorizing infrared images based on a generative adversarial network. The specific steps are as follows: building a network model: constructing a generative adversarial network comprising a generator and a discriminator; preparing a dataset: pre-training the generative adversarial network on a first infrared image dataset; training the network model: training the network model with the first infrared image dataset until a preset threshold is reached; fine-tuning the model: retraining and fine-tuning the network model with a second infrared image dataset to obtain the final model; saving the model: freezing the parameters of the final model and saving it. The network structure adopted by the present invention is a generative adversarial network; by exploiting the adversarial game between the generator and the discriminator, it strengthens the extraction of deep image information, enhances the naturalness and realism of the colorized image, and dynamically improves colorization quality.

Description

A method and system for colorizing infrared images based on generative adversarial networks

Technical Field

The present invention relates to the technical field of image processing, and in particular to an infrared image colorization method and system based on a generative adversarial network.

Background Art

Infrared images are important in many complex scenes. In haze, rain, or fog, infrared radiation penetrates dense air better than visible light, so infrared imaging can still capture clear images; in military operations, infrared images also make it easier to identify camouflaged targets, improving the odds of success. However, infrared images are poorly suited to direct human viewing: they reduce a viewer's ability to recognize objects and make images harder to interpret, since the human eye distinguishes color scenes far better than grayscale-only scenes. Converting infrared images into colorized images is therefore both practically meaningful and broadly applicable, offering solutions for many complex scenarios. Unlike grayscale-image colorization algorithms, which only estimate chrominance, infrared image colorization must estimate both the luminance and the chrominance of the image. Moreover, the thermal characteristics of an infrared image are not necessarily correlated with the features of the corresponding visible-light image, so colorization results suffer from severe detail blurring and texture distortion. These inherent problems make infrared image colorization especially difficult.

Chinese patent application CN202210199669.2, entitled "Multi-scale neural network infrared image colorization method based on attention mechanism", first uses a two-dimensional convolutional neural network to extract features from the input infrared image pair at different resolutions; the processed images are fed into a backbone network, which encodes them to obtain input features; the extracted high-dimensional feature information is then condensed through an attention mechanism; finally, the multi-scale information is fused to obtain a predicted color infrared image. The colorized images obtained by this method cannot accurately retain structural information and semantic details; they suffer from poor image quality, low contrast, and unnatural colors that do not match human visual perception, while the method's computational complexity is high and its efficiency low. How to overcome these defects is therefore a problem that those skilled in the art urgently need to solve.

Summary of the Invention

(I) Technical problem to be solved

In view of the shortcomings of the prior art, the present invention provides an infrared image colorization method and system based on a generative adversarial network, which solves the problems of poor image quality, low contrast, unnatural colors, and low efficiency in existing infrared image colorization methods.

(II) Technical solution

To achieve the above purpose, the present invention specifically adopts the following technical solution:

A method for colorizing infrared images based on a generative adversarial network, comprising the following steps:

Building a network model: constructing a generative adversarial network comprising a generator and a discriminator;

Preparing a dataset: pre-training the generative adversarial network on a first infrared image dataset;

Training the network model: training the network model with the first infrared image dataset until a preset threshold is reached;

Fine-tuning the model: retraining and fine-tuning the network model with a second infrared image dataset to obtain the final model;

Saving the model: freezing the parameters of the final model and saving it.
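The five steps above can be outlined as a training-script skeleton. This is a hedged sketch only: every function name, key, and value below is illustrative, as the patent does not specify an API.

```python
# Hypothetical outline of the five-step workflow described above.
# All names here are illustrative, not from the patent.

def build_model():
    """Step 1: construct a GAN with a generator and a discriminator."""
    return {"generator": "G", "discriminator": "D"}

def train(model, dataset, threshold_epochs=200):
    """Steps 2-3: (pre-)train on the first dataset until a preset threshold."""
    model["trained_on"] = dataset
    model["epochs"] = threshold_epochs
    return model

def fine_tune(model, dataset):
    """Step 4: retrain/fine-tune on the second dataset for the final model."""
    model["fine_tuned_on"] = dataset
    return model

def save_model(model):
    """Step 5: freeze the final parameters and persist them."""
    model["frozen"] = True
    return model

model = save_model(fine_tune(train(build_model(), "KAIST"), "FLIR"))
```

The dataset names match the KAIST and FLIR datasets named later in the description.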

Furthermore, the generator comprises a multi-layer dense module, a downsampling module, a feature fusion module, a color-aware attention module, an upsampling module, and an image reconstruction module.

The multi-layer dense module extracts features from the image using convolution blocks.

The downsampling module, composed of multi-layer dense modules and convolution blocks, progressively reduces the feature map size and extracts deep semantic information from the image.

The feature fusion module reduces the information loss caused by downsampling in the encoder.

The color-aware attention module effectively improves the colorization quality of the network by enhancing important features and suppressing unnecessary ones; it also strengthens perceptual information, captures more semantic details, and attends to more key objects.

The upsampling module progressively restores the feature map size.

The image reconstruction module reconstructs the colorized infrared image using convolution blocks and a T-type function.
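The encoder-decoder shape flow (downsampling stages halving the feature map, upsampling stages restoring it) can be sketched with numpy. Average pooling and nearest-neighbor doubling stand in for the patent's modules; the 256x256 input size and the three stages match the description elsewhere in the document, but the operations themselves are illustrative.

```python
import numpy as np

def downsample(x):
    # Stand-in for a downsampling module: 2x2 average pooling halves H and W.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    # Stand-in for an upsampling module: nearest-neighbor doubling of H and W.
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

x = np.random.rand(256, 256)      # input infrared image
enc = x
for _ in range(3):                # three downsampling modules
    enc = downsample(enc)         # 256 -> 128 -> 64 -> 32
dec = enc
for _ in range(3):                # three upsampling modules
    dec = upsample(dec)           # 32 -> 64 -> 128 -> 256
assert enc.shape == (32, 32)
assert dec.shape == x.shape       # output size matches the input image
```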

Furthermore, the discriminator comprises multiple convolution blocks, a color-aware attention module, a normalization layer, and an L-type function, which strengthen perceptual information and facilitate fast convergence.

Furthermore, the first infrared image dataset is the KAIST dataset.

Furthermore, the preset thresholds used when training the network model include a preset loss-function value and a preset number of training iterations.

Furthermore, the loss function is a composite loss function: the generator's loss combines a synthesis loss, an adversarial loss, a feature loss, a total variation loss, and a gradient loss; the discriminator adopts an adversarial loss.
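Three of the generator's loss terms have simple pixel-level forms that can be sketched with numpy. The patent gives no formulas or weights, so the definitions below (L1 synthesis loss, forward-difference total variation and gradient losses) and all weight values are assumptions; the adversarial and feature terms need a discriminator and a feature network, so they are passed in precomputed.

```python
import numpy as np

def l1_synthesis_loss(pred, target):
    # Pixel-wise L1 between the colorized output and the ground-truth image.
    return np.abs(pred - target).mean()

def total_variation_loss(img):
    # Penalizes differences between neighboring pixels (smoothness prior).
    dh = np.abs(img[1:, :] - img[:-1, :]).mean()
    dw = np.abs(img[:, 1:] - img[:, :-1]).mean()
    return dh + dw

def gradient_loss(pred, target):
    # L1 distance between forward-difference gradients of output and target.
    gp_h, gt_h = np.diff(pred, axis=0), np.diff(target, axis=0)
    gp_w, gt_w = np.diff(pred, axis=1), np.diff(target, axis=1)
    return np.abs(gp_h - gt_h).mean() + np.abs(gp_w - gt_w).mean()

def generator_loss(pred, target, adv=0.0, feat=0.0,
                   w_syn=1.0, w_adv=0.01, w_feat=0.1, w_tv=1e-4, w_grad=0.5):
    # Weighted sum of the five terms; all weights are illustrative.
    return (w_syn * l1_synthesis_loss(pred, target)
            + w_adv * adv + w_feat * feat
            + w_tv * total_variation_loss(pred)
            + w_grad * gradient_loss(pred, target))

flat = np.ones((8, 8))
assert total_variation_loss(flat) == 0.0   # constant image: no variation
assert generator_loss(flat, flat) == 0.0   # perfect reconstruction
```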

Furthermore, the process of training the network model also includes evaluating the quality of the colorization results and the degree of image distortion through evaluation metrics.

Furthermore, the second infrared image dataset is the FLIR dataset.

A system for colorizing infrared images based on a generative adversarial network, the system comprising:

an image acquisition module for acquiring training data and a deep neural network, the training data including input image information and output image information;

an image processing module for preprocessing each image in the training data, the preprocessing including geometric correction, image enhancement, and image filtering;

a feature extraction module for extracting features from each image to obtain a feature map set for each image, the feature map set including multiple feature maps of different sizes;

a model training module for performing supervised model training on feature maps of a specified size in the feature map set of each image to obtain an infrared image colorization model;

an image colorization module for inputting an infrared image to be colorized into the infrared image colorization model for colorization processing;

and a storage medium for storing the infrared image colorization system.

Furthermore, the image processing module is also used for data augmentation of the training dataset; common methods include image cropping, image flipping, and image translation.

Furthermore, the feature extraction module selects feature maps of a specified size from the feature map set of each image; the model to be trained uses no fewer than 2 feature maps of the specified size, the specified size being 1/8 of the original image size.

Furthermore, the model training module resizes each image in the dataset from its arbitrary size to a fixed size of 256×256. Both the generator and the discriminator are trained for 200 epochs with a batch size of 1. The learning rate is set to 0.0002 for the first 100 epochs and decays linearly to 0 over the following 100 epochs. The number of filters in the first convolutional layer of both the generator and the discriminator is set to 64. The Adam optimizer is used with β1 = 0.5 and β2 = 0.999. The discriminator and the generator are trained alternately until the model converges.
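The learning-rate schedule described above (constant for 100 epochs, then linear decay to 0 over 100 more) can be written as a small function. The exact decay formula is an assumption, since the text only states that the rate "decays linearly to 0":

```python
def learning_rate(epoch, base_lr=0.0002, warm_epochs=100, decay_epochs=100):
    # Constant base_lr for the first `warm_epochs` epochs, then linear
    # decay to 0 over the following `decay_epochs` epochs.
    if epoch < warm_epochs:
        return base_lr
    remaining = warm_epochs + decay_epochs - epoch
    return base_lr * max(remaining, 0) / decay_epochs

assert learning_rate(0) == 0.0002                # start of training
assert learning_rate(99) == 0.0002               # end of the constant phase
assert abs(learning_rate(150) - 0.0001) < 1e-12  # halfway through the decay
assert learning_rate(200) == 0.0                 # fully decayed
```

This is the schedule commonly used by pix2pix-style image translation GANs, which matches the 200-epoch, two-phase description in the text.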

Furthermore, the image colorization module is also used to acquire an image to be colorized, input it into the infrared image colorization model, and generate the colorization result via a forward pass.

(III) Beneficial effects

Compared with the prior art, the present invention provides an infrared image colorization method and system based on a generative adversarial network with the following beneficial effects:

The network structure adopted in the present invention is a generative adversarial network. By exploiting the adversarial game between the generator and the discriminator, it strengthens the extraction of deep image information, enhances the naturalness and realism of the colorized image, and dynamically improves colorization quality.

The present invention adopts a dense residual structure in multi-layer dense module 1, downsampling modules 1 to 3, upsampling modules 1 to 3, and multi-layer dense module 2 to extract features from the feature maps. It captures local and contextual information with fewer parameters and lower computational cost, enhancing feature extraction and detail recovery and producing higher-quality color infrared images.

The present invention employs a color-aware attention module in upsampling modules 1, 2, and 3 to improve the network's colorization quality by enhancing important features and suppressing unnecessary ones; the module also strengthens perceptual information, captures more semantic details, and attends to more key objects.

The feature fusion module proposed in the present invention is applied between the encoder and the decoder; it reduces the information loss caused by downsampling in the encoder and improves the prediction of subtle image colors.

The present invention proposes a composite loss function composed of a synthesis loss, an adversarial loss, a feature loss, a total variation loss, and a gradient loss, which improves the quality of colorized images, generates fine local details, and restores semantic and texture information.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart of an infrared image colorization method based on a generative adversarial network;

FIG. 2 is a structural diagram of the generator and the discriminator of the generative adversarial network in the method;

FIG. 3 is a schematic diagram of the composition of each of downsampling modules 1, 2, and 3 of the present invention;

FIG. 4 is a schematic diagram of the composition of each of upsampling modules 1, 2, and 3 of the present invention;

FIG. 5 is a schematic diagram of the composition of all multi-layer dense modules of the present invention;

FIG. 6 is a schematic diagram of the composition of the feature fusion module of the present invention;

FIG. 7 is a schematic diagram of the composition of the color-aware attention module of the present invention;

FIG. 8 is a schematic diagram of the structure of all convolution blocks of the present invention;

FIG. 9 is a schematic diagram of the structure of all transposed convolution blocks of the present invention;

FIG. 10 is a comparison of relevant metrics for the method proposed by the present invention;

FIG. 11 is a schematic diagram of the main modules of the infrared image colorization system of the present invention;

FIG. 12 is a schematic diagram of the internal structure of an electronic device implementing the infrared image colorization method of the present invention.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Examples

Example 1

As shown in FIG. 1, the infrared image colorization method based on a generative adversarial network specifically includes the following steps:

Step 1, build the network model. The generative adversarial network as a whole comprises a generator and a discriminator. The generator consists of multi-layer dense module 1, downsampling modules 1, 2, and 3, a feature fusion module, upsampling modules 1, 2, and 3, a concatenation operation, multi-layer dense module 2, and an image reconstruction module. The discriminator consists of convolutional layers 1 to 5, a color-aware attention module, a normalization layer, and an L-type function; the color-aware attention module is inserted at the middle convolutional layer to strengthen perceptual information and facilitate fast convergence. The colorized infrared image produced by the generator and the visible-light color image from the dataset are fed into the discriminator, which outputs a real/fake probability indicating whether the input image is real. Multi-layer dense module 1 performs shallow feature extraction on the image; downsampling modules 1 to 3 progressively reduce the feature map size and extract deep semantic information; the feature fusion module reduces the information loss caused by downsampling in the encoder; upsampling modules 1 to 3 use the color-aware attention module to make the network attend more closely to the semantic and color information of the image, capture more semantic details, and progressively restore the feature map size; the concatenation operation reuses shallow features together with deep features; multi-layer dense module 2 and the image reconstruction module reconstruct the colorized infrared image. Each downsampling module is composed of a multi-layer dense module and a convolution block. Each upsampling module is composed of a multi-layer dense module, a transposed convolution block, a convolution block, and a color-aware attention module. The multi-layer dense module is composed of convolution blocks 1 to 8 and a concatenation operation. The feature fusion module is composed of convolution blocks 1 to 4, a concatenation operation, an addition operation, and a bilinear-interpolation upsampling operation. The color-aware attention module is composed of convolution blocks 1 to 9, an addition operation, a multiplication operation, a Softmax operation, an average pooling operation, and an expansion operation. Each transposed convolution block consists of a transposed convolution layer, a normalization layer, and an L-type function, with a uniform kernel size of n×n. The image reconstruction module consists of convolutional layers 1 and 2 and a T-type function. Each convolution block consists of a convolution layer, a normalization layer, and an L-type function, with a uniform kernel size of n×n. The size of the final feature map is the same as that of the input image.
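The dense connectivity of the multi-layer dense module (each convolution block sees the concatenation of the module input and all earlier block outputs) can be sketched with numpy arrays standing in for feature maps. The growth rate and channel counts below are illustrative, not from the patent.

```python
import numpy as np

def conv_block(x, growth=16):
    # Stand-in for one convolution block: maps any number of input
    # channels to `growth` output channels (here, random features).
    rng = np.random.default_rng(0)
    return rng.standard_normal((growth,) + x.shape[1:])

def dense_module(x, num_blocks=8, growth=16):
    # Dense connectivity: each block's input is the channel-wise
    # concatenation of the module input and all previous block outputs.
    features = [x]
    for _ in range(num_blocks):
        out = conv_block(np.concatenate(features, axis=0), growth)
        features.append(out)
    return np.concatenate(features, axis=0)

x = np.zeros((64, 32, 32))         # 64-channel feature map
y = dense_module(x)
assert y.shape[0] == 64 + 8 * 16   # channels grow by `growth` per block
```

This channel-growth pattern is what lets dense modules capture local and contextual information with relatively few parameters, as the beneficial-effects section claims.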

Step 2, prepare the dataset. The entire generative adversarial network is first trained on infrared image dataset 1; the KAIST dataset is used for pre-training, with supervised training performed on the images in the dataset.

Step 3, train the network model. To train the infrared image colorization model, preprocess the dataset prepared in step 2, resize each image in the dataset to fix the input image size, and feed the processed dataset into the network model built in step 1 for training.

Step 4, minimize the loss function and select the evaluation metrics. The loss function between the network's output image and the label is minimized; once the number of training iterations reaches a set threshold or the loss value falls within a set range, the model parameters are considered pre-trained and are saved. Appropriate evaluation metrics are chosen to measure the algorithm's accuracy and the system's performance. During training, a composite loss function is used: the generator adopts a synthesis loss, an adversarial loss, a feature loss, a total variation loss, and a gradient loss, while the discriminator adopts an adversarial loss. The choice of loss function determines how well the model performs: it must genuinely reflect the difference between the predicted values and the ground truth and correctly feed back the quality of the model. Suitable evaluation metrics, namely peak signal-to-noise ratio (PSNR), structural similarity (SSIM), learned perceptual image patch similarity (LPIPS), and the natural image quality evaluator (NIQE), can effectively assess the quality of the colorization results and the degree of image distortion and measure the effect of the colorization network.
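Of the listed metrics, PSNR has a closed form that is easy to sketch (SSIM, LPIPS, and NIQE require more machinery). This assumes 8-bit images with a peak value of 255:

```python
import numpy as np

def psnr(pred, target, peak=255.0):
    # Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.full((16, 16), 100.0)
b = a + 10.0                  # constant offset of 10 -> MSE = 100
assert abs(psnr(a, b) - 10 * np.log10(255**2 / 100)) < 1e-9
assert psnr(a, a) == float("inf")
```

Higher PSNR means less distortion relative to the reference visible-light image; for LPIPS and NIQE, lower is better.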

Step 5, fine-tune the model. The model is trained and fine-tuned on infrared image dataset 2, the FLIR dataset, to obtain stable, usable model parameters and further improve the model's infrared colorization capability, so that the model colorizes infrared images more effectively.

Step 6, save the model. The finalized model parameters are frozen.

To perform infrared image colorization, the image is simply fed into the network, which directly produces the final colorized image.

The present invention also provides an infrared image colorization system, the system comprising:

an image acquisition module for acquiring training data and a deep neural network, the training data including input image information and output image information;

an image processing module for preprocessing each image in the training data, the preprocessing including geometric correction, image enhancement, and image filtering;

a feature extraction module for extracting features from each image to obtain a feature map set for each image, the feature map set including multiple feature maps of different sizes;

a model training module for performing supervised model training on feature maps of a specified size in the feature map set of each image to obtain an infrared image colorization model;

an image colorization module for inputting an infrared image to be colorized into the infrared image colorization model for colorization processing;

and a storage medium for storing the infrared image colorization system.

The present invention also provides an electronic device for infrared image colorization, the device comprising: one or more processors; and a storage system for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the infrared image colorization method provided by the present invention.

The present invention also provides a computer-readable storage medium storing a computer program which, when run, executes the infrared image colorization method provided by the present invention.

Example 2

As shown in Figure 1, a method for colorizing infrared images based on a generative adversarial network specifically includes the following steps:

Step 1: build the network model;

As shown in Figure 2, the generator consists of multi-layer dense module 1, downsampling modules 1 to 3, a feature fusion module, upsampling modules 1 to 3, a concatenation operation, multi-layer dense module 2 and an image reconstruction module. Each downsampling module consists of a dense module and a convolution block; its specific structure is shown in Figure 3. Each upsampling module consists of a multi-layer dense module, a transposed convolution block, a convolution block and a color-aware attention module; its specific structure is shown in Figure 4. Each multi-layer dense module consists of convolution blocks 1 to 8 and a concatenation operation, with a 3×3 kernel and a stride of 1; its specific structure is shown in Figure 5. Each feature fusion module consists of convolution blocks 1 to 4, a concatenation operation, an addition operation and a bilinear-interpolation upsampling operation, with 3×3 kernels and strides of 2 and 1 respectively; its specific structure is shown in Figure 6. Each color-aware attention module consists of convolution blocks 1 to 9, addition, multiplication, Softmax, average pooling and expansion operations, with kernel sizes of 1×1 and 3×3, a stride of 1, and a dilation rate of 3 for convolution blocks 3 and 4; its specific structure is shown in Figure 7. Each transposed convolution block consists of a transposed convolution layer, a normalization layer and an L-type function, with a 4×4 kernel and a stride of 2; its specific structure is shown in Figure 9. The image reconstruction module consists of convolution layers 1 and 2 and a T-type function, with a 1×1 kernel and a stride of 1. Each convolution block consists of a convolution layer, a normalization layer and an L-type function, with kernel size and stride chosen as needed; its specific structure is shown in Figure 8;

The discriminator consists of convolution layers 1 to 5, a color-aware attention module, a normalization layer and an L-type function. The color-aware attention module is inserted at the middle convolution layer to strengthen perceptual information, which promotes fast convergence. The infrared colorized image produced by the generator and the visible-light image from the dataset are fed into the discriminator, which outputs a real/fake probability to judge whether the input image is real;

In summary, the colorization process feeds in an infrared image, extracts features through three downsampling operations, restores the feature-map size through three upsampling operations to reconstruct the colorized infrared image, and finally feeds the output colorized image together with the visible-light image into the discriminator to judge whether it is real;
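As a sanity check on the geometry just described (three downsampling operations, then three upsampling operations), a minimal sketch, assuming each downsampling halves the spatial side and each stride-2 transposed convolution doubles it, consistent with the strides stated for the convolution and transposed-convolution blocks:

```python
def spatial_size_through_network(size, n_down=3, n_up=3):
    """Track the feature-map side length through stride-2 down/upsampling."""
    for _ in range(n_down):
        size //= 2          # each stride-2 downsampling halves the side
    bottleneck = size       # 1/8 of the input side after 3 downsamplings
    for _ in range(n_up):
        size *= 2           # each stride-2 transposed conv doubles it
    return bottleneck, size

# A 256x256 input reaches a 32x32 bottleneck and is restored to 256x256.
bottleneck, out = spatial_size_through_network(256)
```

This matches the 1/8-of-original feature-map size mentioned later for the feature extraction module.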

To ensure the robustness of the network, retain more structural information and fully extract image features, the present invention uses three activation functions: the L-type, T-type and S-type functions. The last layer of the generator uses the T-type function and the last layer of the discriminator uses the S-type function; all remaining activation functions in the generator and discriminator are L-type functions. The L-type, T-type and S-type functions are defined as follows:

Step 2: prepare the dataset. The infrared image dataset used is the KAIST dataset. The KAIST pedestrian dataset contains 95,328 images in total, each available in two versions, an RGB color image and an infrared image. The dataset captures various regular traffic scenes, including campus, street and countryside scenes, both during the day and at night; the image size is 640×480. Although the KAIST dataset is described as registered, careful inspection shows that some infrared images still deviate slightly from their visible-light counterparts; moreover, because the dataset is taken from consecutive video frames, adjacent images differ little. After a cleaning pass, 4,755 daytime images and 2,846 nighttime images were selected for the training set, and 1,455 daytime images and 797 nighttime images for the test set; only the daytime training set is used for training. The 4,755 images are resized to 256×256 as the input to the whole network. Adversarial training on the KAIST dataset yields a set of initialization parameters that speeds up subsequent network training;

In step 3, the dataset images are augmented: each image undergoes a random affine transformation and is cropped to the input size to serve as the input to the whole network, while the annotated images in the dataset serve as labels. The random size and position are realized by a software algorithm. Using the annotated dataset images as labels lets the network learn stronger feature extraction and ultimately achieve a better colorization effect;

In step 4, a loss function is computed between the network output and the label, and a better result is achieved by minimizing it. A composite loss function is used during training: the generator uses a synthesis loss, an adversarial loss, a feature loss, a total variation loss and a gradient loss, while the discriminator uses an adversarial loss;

The synthesis loss is in fact an L1 loss. Adding it effectively minimizes the brightness and contrast differences between the colorized image and the ground truth; however, if the GAN attends too strongly to the synthesis loss, brightness and contrast beyond what the infrared image carries will be lost. To prevent the generator from over-fitting pixel-to-pixel relationships, the synthesis loss is given an appropriate weight. The synthesis loss can be expressed as:

where W and H denote the height and width of the infrared image, I_ir denotes the input infrared image, I_vis denotes the ground truth, G(·) denotes the generator, and ||·||_1 denotes the L1 norm;

A colorization result driven by the synthesis loss alone loses some detail. To encourage the network to output color results with more realistic detail, an adversarial loss is adopted; it is used to make the colorized image indistinguishable from the ground truth and is defined as:

where the input infrared image I_ir is fed not only to the generator but also to the discriminator, as a conditioning term;

L_synthesized and L_adv sometimes fail to keep perceptual quality consistent with objective metrics. To alleviate this, a feature loss is used, which compares feature maps extracted by a dedicated pre-trained model. Here, a pre-trained VGG-16 CNN model is used to identify and extract features of the input images: it extracts the feature maps of the colorized result and of the ground truth, and the Euclidean distance between the pair of feature maps is then computed. The feature loss is expressed as:

where φ_n(·) denotes the feature map of the n-th layer of the VGG-16 network, and C_n, H_n and W_n denote the number of channels, the height and the width of that layer, respectively;

In addition to using the feature loss to recover high-level content, a total variation loss is adopted to enhance the spatial smoothness of the colorized infrared image; it is defined as:

where |·| denotes the element-wise absolute value of the given input;
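The total variation loss can be illustrated with a minimal anisotropic-TV sketch over a single-channel image, using forward differences with the element-wise absolute value just defined:

```python
def total_variation_loss(img):
    """Anisotropic total variation: sum of absolute horizontal and
    vertical forward differences over an H x W image."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                tv += abs(img[i][j + 1] - img[i][j])   # horizontal difference
            if i + 1 < h:
                tv += abs(img[i + 1][j] - img[i][j])   # vertical difference
    return tv
```

A constant image has zero total variation; a checkerboard, the least spatially smooth pattern, maximizes it.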

A gradient loss is adopted to enhance the quality of the colorized image. The network is constrained by the gradient loss with respect to the visible-light image and the gradient loss with respect to the infrared image, capturing the texture and brightness information of the image; the gradient loss also helps the network converge faster. The gradient losses for the visible-light image and the infrared image are defined as:

L_G2 = L_gradient1 + L_gradient2

where ∇ denotes the gradient operator;
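Since the gradient-loss formulas themselves are rendered as images in the source, the following sketch only illustrates the general shape such a loss takes: forward-difference gradients and an L1 comparison between them. The exact formulation in the patent may differ:

```python
def image_gradient(img):
    """Forward-difference gradient maps (dx, dy) of an H x W image,
    zero-padded at the right/bottom border."""
    h, w = len(img), len(img[0])
    dx = [[img[i][j + 1] - img[i][j] if j + 1 < w else 0 for j in range(w)]
          for i in range(h)]
    dy = [[img[i + 1][j] - img[i][j] if i + 1 < h else 0 for j in range(w)]
          for i in range(h)]
    return dx, dy

def gradient_l1(a, b):
    """L1 distance between the gradient maps of two images (hypothetical
    form of a gradient loss; not the patent's exact formula)."""
    (ax, ay), (bx, by) = image_gradient(a), image_gradient(b)
    return sum(abs(p - q)
               for ra, rb in zip(ax + ay, bx + by)
               for p, q in zip(ra, rb))
```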

The total loss of the generator is therefore defined as:

L_total = λ_adv·L_adv + λ_feature·L_feature + λ_synthesized·L_synthesized + λ_tv·L_tv + λ_G2·L_G2

where λ_adv, λ_feature, λ_synthesized, λ_tv and λ_G2 are the weights controlling the share of each term in the full loss function; the weights are set based on preliminary experiments on the training dataset;
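The weighted total can be assembled as below. The numeric weights are hypothetical placeholders; the patent only states that they were chosen by preliminary experiments on the training dataset:

```python
# Hypothetical weights -- the patent does not publish the actual values.
WEIGHTS = {"adv": 1.0, "feature": 1.0, "synthesized": 100.0, "tv": 1e-4, "g2": 1.0}

def generator_total_loss(losses, weights=WEIGHTS):
    """Weighted sum L_total = sum of lambda_k * L_k over the five generator terms."""
    return sum(weights[k] * losses[k] for k in weights)
```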

The loss function of the discriminator is defined as:

Optimizing the generator and discriminator loss functions helps the network learn sharper edges and more detailed textures, so that the colorized images have natural colors, higher realism and better visual quality;

In step 4, the evaluation metrics chosen are peak signal-to-noise ratio (PSNR), structural similarity (SSIM), learned perceptual image patch similarity (LPIPS) and the natural image quality evaluator (NIQE). PSNR is based on the error between corresponding pixels, i.e. an error-sensitive measure of image quality. SSIM measures image similarity in terms of luminance, contrast and structure, and is an index of how similar two digital images are. LPIPS learns a reverse mapping from generated images to the ground truth, forcing the generator to learn to reconstruct real images from fakes while prioritizing their perceptual similarity. NIQE is based on a set of "quality-aware" features fitted to a multivariate Gaussian model; the quality-aware features derive from a simple but highly regularized natural scene statistics model, and the NIQE score of a given test image is then the distance between the multivariate Gaussian model of the statistical features extracted from that image and the multivariate Gaussian model of the quality-aware features extracted from a corpus of natural images. PSNR, SSIM, LPIPS and NIQE are defined as follows:

where μ_x and μ_y denote the means of images x and y, σ_x and σ_y their standard deviations, σ_xy their covariance, C_1 and C_2 are constants, d is the distance between x_0 and x, w_l is a trainable weight parameter, and v_1, v_2, Σ_1 and Σ_2 denote the mean vectors and covariance matrices of the natural multivariate Gaussian model and of the distorted-image multivariate Gaussian model, respectively;
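Of the four metrics, PSNR has the simplest closed form, 10·log10(MAX²/MSE); a minimal sketch over single-channel images:

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio: 10 * log10(MAX^2 / MSE)."""
    sq_errors = [(a - b) ** 2
                 for ra, rb in zip(img_a, img_b)
                 for a, b in zip(ra, rb)]
    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0:
        return float("inf")   # identical images: no noise at all
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Identical images give infinite PSNR; the worst case for 8-bit images (every pixel off by 255) gives 0 dB.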

The number of training epochs is set to 200; the learning rate is set to 0.0002 for the first 100 epochs and decreases gradually from 0.0002 to 0 over the last 100 epochs. The upper limit on the number of images fed to the network at a time is determined mainly by GPU performance; a batch size in the range of 8 to 16 generally makes training more stable, yields better results and ensures fast fitting. The Adam optimizer is chosen as the network parameter optimizer; its main advantages are simple implementation, computational efficiency, low memory requirements, and parameter updates that are unaffected by rescaling of the gradients, which keeps the parameters stable. When the discriminator's ability to detect fakes balances the generator's ability to fool the discriminator, the network is considered essentially trained;
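The two-phase learning-rate schedule (constant for the first 100 epochs, then linear decay to zero over the last 100) can be written as a small helper; the 0-based epoch indexing is an assumption:

```python
def learning_rate(epoch, base_lr=0.0002, total_epochs=200, decay_start=100):
    """Constant base_lr before decay_start, then linear decay to 0
    over the remaining (total_epochs - decay_start) epochs."""
    if epoch < decay_start:
        return base_lr
    remaining = total_epochs - epoch          # epochs left until lr hits 0
    return base_lr * remaining / (total_epochs - decay_start)
```

Halfway through the decay phase (epoch 150) the rate is half of 0.0002, and it reaches exactly 0 at epoch 200.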

Step 5: fine-tune the model. The model is retrained and fine-tuned on the second infrared image dataset, the FLIR dataset. The FLIR dataset has 8,862 unaligned visible-light/infrared image pairs covering rich scenes such as roads, vehicles and pedestrians; these images are highly representative scenes from FLIR videos. For this dataset, feature points selected with the cpselect algorithm form a feature-point matrix used to register the FLIR pairs manually; after screening and registration, 3,918 image pairs are used for training and 428 image pairs for testing;

In step 6, after network training is complete, all network parameters are saved; thereafter, feeding an infrared image to be colorized into the network yields the colorized image. The network imposes no requirement on the input image size; any size is acceptable;

The implementations of convolution, activation functions, concatenation and batch normalization are algorithms well known to those skilled in the art; the specific procedures and methods can be found in the corresponding textbooks or technical literature;

By constructing an infrared image colorization method based on a generative adversarial network, the present invention generates a colorized image directly from an infrared image without intermediate steps, avoiding manually designed colorization rules. Under identical conditions, computing the relevant metrics for images produced by this method and by existing methods further verifies its feasibility and superiority; the comparison of the relevant metrics of the prior art and the proposed method is shown in Figure 10;

As can be seen from Figure 10, the proposed method achieves a higher peak signal-to-noise ratio, higher structural similarity, lower perceptual image similarity, lower natural image quality score and fewer generator parameters than existing methods; these metrics further show that the proposed method delivers better colorization quality at lower computational complexity;

As shown in Figure 11, the present invention also provides an infrared image colorization system, mainly comprising an image acquisition module, an image processing module, a feature extraction module, a model training module, an image colorization module and a storage medium;

an image acquisition module, used to acquire training data and a deep neural network, the training data including input image information and output image information;

an image processing module, used to preprocess each image in the training data, the preprocessing including geometric correction, image enhancement and image filtering;

a feature extraction module, used to extract features from each of the images to obtain a feature map set for each image, the feature map set including a plurality of feature maps of different sizes;

a model training module, used to perform supervised model training on feature maps of a specified size in the feature map set of each image to obtain an infrared image colorization model;

an image colorization module, used to feed the image to be colorized into the infrared image colorization model for infrared image colorization processing;

a storage medium, used to store the infrared image colorization system;

Further, the image processing module is also used for data augmentation of the training dataset; common methods include image cropping, image flipping and image translation;

Further, the feature extraction module also selects, from the feature map set of each image, the feature maps of a specified size; the model to be trained has no fewer than 2 feature maps of the specified size, the specified size being 1/8 of the original image size;

Further, the model training module also resizes each image from its original size in the dataset to a fixed 256×256. Both the generator and the discriminator are trained for 200 epochs with a batch size of 1; initially, the learning rate is set to 0.0002 for the first 100 epochs and decays linearly to 0 over the next 100 epochs. The number of filters in the first convolution layer of both the generator and the discriminator is set to 64. The Adam optimizer is used with β1 = 0.5 and β2 = 0.999, and the discriminator and generator are trained alternately until the model converges;

Further, the image colorization module also acquires the image to be colorized, feeds its information into the infrared image colorization model, and generates the infrared image colorization result through a forward-propagation pass;

As shown in Figure 12, the present invention also provides an infrared image colorization electronic device, mainly comprising a memory, a processor, a communication interface and a bus, where the memory, processor and communication interface communicate with one another over the bus;

The memory may be a ROM, a static storage device, a dynamic storage device or a RAM; it may store a program, and when the program stored in the memory is executed by the processor, the processor and the communication interface perform the steps of the training method for the infrared image colorization network of the embodiments of the present invention;

The processor may be a CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, used to execute the relevant programs so as to implement the functions required of the units of the infrared image colorization training system of the present invention, or to perform the infrared image colorization training method of the present invention;

The processor may also be an integrated circuit chip with signal-processing capability; during implementation, the steps of the infrared image colorization training method of the present invention may be completed by integrated hardware logic circuits in the processor or by instructions in software form. The processor may further be a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or execute the infrared image colorization methods, steps and logic block diagrams of the present invention; a general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the infrared image colorization method of the present invention may be embodied directly as executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules within a decoding processor; the software module may reside in a storage medium mature in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or register. The storage medium resides in the memory; the processor reads the information in the memory and, together with its hardware, performs the functions required of the units included in the infrared image colorization training system of the present invention, or performs the infrared image colorization training method of the present invention;

The communication interface uses a transceiver system, such as but not limited to a transceiver, to communicate between the system and other devices or communication networks; for example, the image to be processed, or the initial feature map of the image to be processed, may be obtained through the communication interface;

The bus may comprise a pathway for transferring information between the components of the system (e.g. the memory, processor and communication interface);

The present invention also provides a computer-readable storage medium for infrared image colorization, which may be the computer-readable storage medium contained in the system of the above embodiment, or a stand-alone computer-readable storage medium not assembled into a device; the computer-readable storage medium stores one or more programs, used by one or more processors to execute the method described in the present invention;

It should be noted that although the electronic device shown in Figure 12 shows only a memory, a processor and a communication interface, those skilled in the art will understand that in a concrete implementation the system also includes the other components necessary for normal operation; likewise, depending on specific needs, the system may include hardware for additional functions, or may include only the components necessary to implement the embodiments of the present invention rather than all the components shown in Figure 12.

Finally, it should be noted that the above are merely preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or substitute equivalents for some of their technical features; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A method for colorizing infrared images based on a generative adversarial network, characterized in that the specific steps are:

building the network model: constructing a generative adversarial network comprising a generator and a discriminator;

the generator comprises multi-layer dense modules, downsampling modules, a feature fusion module, color-aware attention modules, upsampling modules and an image reconstruction module;

the multi-layer dense module is used to extract image features using convolution blocks; it consists of convolution blocks 1 to 8 and a concatenation operation, with a 3×3 kernel and a stride of 1;

the downsampling module consists of a multi-layer dense module and a convolution block, and is used to progressively reduce the feature-map size and extract deep semantic information from the image;

the feature fusion module is used to reduce the information loss caused by encoder downsampling;

the color-aware attention module is used to effectively improve the colorization quality of the network by enhancing important features and suppressing unnecessary ones, while strengthening perceptual information, capturing more semantic detail and attending to more key objects; it consists of convolution blocks 1 to 9, addition, multiplication, Softmax, average pooling and expansion operations, with kernel sizes of 1×1 and 3×3, a stride of 1, and a dilation rate of 3 for convolution blocks 3 and 4;

the upsampling module is used to progressively restore the feature-map size;

a convolution block consists of a convolution layer, a normalization layer and an L-type function;

the image reconstruction module is used to reconstruct the infrared colorized image using a convolution block and a T-type function;

the discriminator comprises a plurality of convolution blocks, a color-aware attention module, a normalization layer and an L-type function, used to strengthen perceptual information and promote fast convergence;

preparing the dataset: pre-training the generative adversarial network on a first infrared image dataset;

training the network model: training the network model on the first infrared image dataset until a preset threshold is reached;

fine-tuning the model: retraining and fine-tuning the network model on a second infrared image dataset to obtain the final model;

saving the model: freezing the parameters of the final model and saving the model.

2. The method for colorizing infrared images based on a generative adversarial network according to claim 1, characterized in that the first infrared image dataset is the KAIST dataset.

3. The method for colorizing infrared images based on a generative adversarial network according to claim 1, characterized in that the preset threshold used in training the network model includes a preset loss-function value and a preset number of training iterations.

4. The method for colorizing infrared images based on a generative adversarial network according to claim 1, characterized in that the loss function is a composite loss function; the loss function used by the generator includes a synthesis loss, an adversarial loss, a feature loss, a total variation loss and a gradient loss, and the discriminator uses an adversarial loss.

5. The method for colorizing infrared images based on a generative adversarial network according to claim 1, characterized in that training the network model further includes evaluating, by means of evaluation metrics, the quality of the algorithm's colorization results and the degree of image distortion.

6. The method for colorizing infrared images based on a generative adversarial network according to claim 1, characterized in that the second infrared image dataset is the FLIR dataset.

7. An application system of the method for colorizing infrared images based on a generative adversarial network according to claim 1, characterized in that the system comprises:

an image acquisition module, used to acquire training data and a deep neural network, the training data including input image information and output image information;

an image processing module, used to preprocess each image in the training data, the preprocessing including geometric correction, image enhancement and image filtering;

a feature extraction module, used to extract features from each of the images to obtain a feature map set for each image, the feature map set including a plurality of feature maps of different sizes;

a model training module, used to perform supervised model training on feature maps of a specified size in the feature map set of each image to obtain an infrared image colorization model;

an image colorization module, used to feed the infrared image to be colorized into the infrared image colorization model for infrared image colorization processing;

a storage medium, used to store the infrared image colorization system;

the image processing module being further used for data augmentation of the training dataset, common methods including image cropping, image flipping and image translation;

the feature extraction module being further used to select, from the feature map set of each image, feature maps of a specified size, the model to be trained having no fewer than 2 feature maps of the specified size, the specified size being 1/8 of the original image size;
所述模型训练模块还用于将每个图像从数据集的任意大小调整到固定大小 256×256;生成器和鉴别器都用200个epoch,批量大小为1的策略来训练;最初,在前100个时期,学习率被设置为 0.0002,并且在接下来的100个时 期,学习率线性下降到0;生成器和鉴别器中的第一卷积层中的滤波器数量被设置为64;我们使用Adam优化器,;鉴别器和发生器被交 替训练,直到模型收敛;The model training module is also used to resize each image from an arbitrary size in the dataset to a fixed size of 256×256; both the generator and the discriminator are trained with a batch size of 1 for 200 epochs; initially, the learning rate is set to 0.0002 in the first 100 epochs, and the learning rate is linearly decreased to 0 in the next 100 epochs; the number of filters in the first convolutional layer in the generator and the discriminator is set to 64; we use the Adam optimizer, , ;The discriminator and generator are trained alternately until the model converges; 所述图像彩色化模块还用于获取待彩色化图像,将待彩色化图像信息输入到红外图像彩色化模型,利用前向传播算法生成红外图像彩色化结果。The image colorization module is also used to obtain the image to be colored, input the information of the image to be colored into the infrared image colorization model, and generate the infrared image colorization result by using the forward propagation algorithm.
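The training schedule in claim 7 (200 epochs, learning rate 0.0002 held for the first 100 epochs, then linearly decayed to 0 over the next 100) can be sketched as a plain learning-rate rule. The function name and the exact per-epoch decay formula below are illustrative assumptions; the claim states only the endpoints of the schedule.

```python
def lr_at(epoch, base_lr=2e-4, constant_epochs=100, decay_epochs=100):
    """Learning rate for a given epoch under the claimed schedule:
    constant for the first 100 epochs, then linear decay to 0 over
    the next 100 (the per-epoch decay step is an assumption)."""
    if epoch < constant_epochs:
        return base_lr
    # Linear decay that reaches exactly 0 at the final epoch (epoch 199).
    frac = (epoch - constant_epochs + 1) / decay_epochs
    return base_lr * max(0.0, 1.0 - frac)

# Full 200-epoch schedule, one learning rate per epoch.
schedule = [lr_at(e) for e in range(200)]
```

In a PyTorch implementation this rule would typically be wrapped in `torch.optim.lr_scheduler.LambdaLR` around an Adam optimizer and stepped once per epoch.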
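The shape bookkeeping implied by claims 1 and 7 can be checked with simple arithmetic: 3×3 stride-1 convolutions with concatenation grow the channel count inside the dense module, while stride-2 convolutions halve the spatial size, reaching the 1/8-of-original feature maps that the feature extraction module specifies after three halvings. The growth rate and padding values below are illustrative assumptions not stated in the patent.

```python
def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def dense_module_channels(in_channels, growth, num_blocks=8):
    """Channel count after a dense module in which each of the eight
    convolution blocks emits `growth` channels and the outputs are
    concatenated (the growth rate itself is an assumption)."""
    channels = in_channels
    for _ in range(num_blocks):
        channels += growth  # concatenation appends each block's output
    return channels

same = conv_out_size(256, stride=1)  # 3x3, stride 1, pad 1 keeps 256
half = conv_out_size(256, stride=2)  # stride 2 halves to 128
size = 256
for _ in range(3):                   # three halvings -> 1/8 of original
    size = conv_out_size(size, stride=2)
```

With a hypothetical growth rate of 32 channels per block, an input of 64 channels would leave the dense module with 64 + 8×32 = 320 channels.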
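Of the five generator loss terms listed in claim 4, the total variation term has a well-known concrete form: it penalizes absolute differences between neighboring pixels of the generated image, encouraging spatial smoothness. The pure-Python sketch below, operating on a 2-D list of grayscale values, assumes the standard anisotropic formulation; the patent does not give its exact formula.

```python
def total_variation(img):
    """Anisotropic total variation of a 2-D image (list of rows):
    sum of absolute vertical and horizontal neighbor differences."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:
                tv += abs(img[i + 1][j] - img[i][j])  # vertical neighbor
            if j + 1 < w:
                tv += abs(img[i][j + 1] - img[i][j])  # horizontal neighbor
    return tv
```

A perfectly flat image has zero total variation, so minimizing this term alongside the adversarial and feature losses trades smoothness against detail.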
CN202211680002.0A 2022-12-27 2022-12-27 A method and system for colorizing infrared images based on generative adversarial networks Active CN116645569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211680002.0A CN116645569B (en) 2022-12-27 2022-12-27 A method and system for colorizing infrared images based on generative adversarial networks


Publications (2)

Publication Number Publication Date
CN116645569A CN116645569A (en) 2023-08-25
CN116645569B true CN116645569B (en) 2024-09-27

Family

ID=87621797

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211680002.0A Active CN116645569B (en) 2022-12-27 2022-12-27 A method and system for colorizing infrared images based on generative adversarial networks

Country Status (1)

Country Link
CN (1) CN116645569B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252936A (en) * 2023-10-04 2023-12-19 长春理工大学 Infrared image colorization method and system adapting to multiple training strategies
CN117592548A (en) * 2023-11-28 2024-02-23 广东工业大学 A training method for infrared image coloring network and an infrared image coloring method
CN117808685B (en) * 2024-02-29 2024-05-07 广东琴智科技研究院有限公司 A method and device for infrared image data enhancement
CN117876530B (en) * 2024-03-12 2024-05-17 长春理工大学 A method for colorization of infrared images based on reference images
CN118429473B (en) * 2024-07-04 2024-09-24 长春理工大学 Characteristic information guided infrared image colorization method and system

Citations (1)

Publication number Priority date Publication date Assignee Title
CN114067018A (en) * 2021-11-19 2022-02-18 长春理工大学 Infrared image colorization method for generating countermeasure network based on expansion residual error

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN109712203B (en) * 2018-12-29 2020-11-17 福建帝视信息科技有限公司 Image coloring method for generating antagonistic network based on self-attention
EP3818472A1 (en) * 2019-09-11 2021-05-12 Google LLC Image colorization using machine learning
CN115100435B (en) * 2022-07-21 2024-09-06 西北工业大学 Image coloring method and system based on finite data multi-scale target learning

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN114067018A (en) * 2021-11-19 2022-02-18 长春理工大学 Infrared image colorization method for generating countermeasure network based on expansion residual error

Non-Patent Citations (2)

Title
AUE-Net: Automated Generation of Ultrasound Elastography Using Generative Adversarial Network; Qingjie Zhang et al.; Diagnostics; 2022-01-20; vol. 12, no. 2; main text page 1 paragraph 1 to page 12 paragraph 1, figures 2-4 *
Exploring efficient and effective generative adversarial network for thermal infrared image colorization; Yu Chen et al.; Complex & Intelligent Systems; 2023-06-13; no. 9; pages 7015-7036 *

Also Published As

Publication number Publication date
CN116645569A (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN116645569B (en) A method and system for colorizing infrared images based on generative adversarial networks
Huang et al. Deep hyperspectral image fusion network with iterative spatio-spectral regularization
CN114943876A (en) Cloud and cloud shadow detection method and device for multi-level semantic fusion and storage medium
CN116485934A (en) A Colorization Method of Infrared Image Based on CNN and ViT
CN114639002A (en) Infrared and visible light image fusion method based on multi-mode characteristics
CN117876530B (en) A method for colorization of infrared images based on reference images
CN115641391A (en) Infrared image colorizing method based on dense residual error and double-flow attention
CN113298744B (en) End-to-end infrared and visible light image fusion method
CN116137043B Infrared image colorization method based on convolution and Transformer
CN112241939B (en) Multi-scale and non-local-based light rain removal method
CN110738622A (en) Lightweight neural network single image defogging method based on multi-scale convolution
CN116503502A (en) Unpaired infrared image colorization method based on contrast learning
Chen et al. Dual degradation image inpainting method via adaptive feature fusion and U-net network
CN117151990B (en) Image defogging method based on self-attention coding and decoding
CN116912110A (en) Low-light image enhancement algorithm based on Retinex-Net
CN116109510A (en) A Face Image Inpainting Method Based on Dual Generation of Structure and Texture
CN115511722A (en) Remote Sensing Image Denoising Method Based on Deep and Shallow Feature Fusion Network and Joint Loss Function
CN117994133A (en) License plate image super-resolution reconstruction model construction method and license plate image reconstruction method
CN112037139A (en) Image dehazing method based on RBW-CycleGAN network
CN117252936A (en) Infrared image colorization method and system adapting to multiple training strategies
CN116258645A (en) Low-illumination color image enhancement method based on image decomposition
CN118918019A (en) Infrared and visible light image fusion method based on illumination adaptation and attention guidance
CN118230131A (en) Image recognition and target detection method
CN118115378A (en) Low-light image enhancement method of image hierarchical structure network based on stream learning
CN117853805A (en) Method, system and device for generating long-wave infrared image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant