CN118470440A

CN118470440A - An early tumor recognition system based on deep learning and hyperspectral images

Info

Publication number: CN118470440A
Application number: CN202410916942.8A
Authority: CN
Inventors: 李玮; 张延冰; 雷晟暄; 刘尚明; 刘洪彬; 孟密密; 姜浩; 王立言; 王伟; 宋峻林; 赵晗竹; 韩浩宇; 吴世豪; 韩景泓; 张彦霖; 党广虹; 顾夏铭
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2024-07-10
Filing date: 2024-07-10
Publication date: 2024-08-09
Anticipated expiration: 2044-07-10
Also published as: CN118470440B

Abstract

The invention discloses an early tumor recognition system based on deep learning and hyperspectral images, and relates to the technical field of early tumor recognition. The system comprises: an image dimension reduction module for acquiring a hyperspectral image and extracting the spectral information in the hyperspectral image; a characteristic band selection module for performing difference analysis on the extracted spectral information and determining the characteristic band according to the analysis result; a deep learning model module for performing preliminary tumor recognition on the spectral information in the selected characteristic band using a tumor recognition model; and an ensemble learning module for performing ensemble learning on the preliminary tumor recognition results, optimizing the tumor recognition model according to the ensemble learning result, and recognizing the hyperspectral image under test with the optimized tumor recognition model to obtain the final tumor recognition result. The invention combines the spectral information inherent in hyperspectral images with the integration of the prediction results of multiple deep learning models to achieve early and accurate tumor recognition.

Description

An early tumor recognition system based on deep learning and hyperspectral images

Technical Field

The present invention relates to the technical field of early tumor recognition, and in particular to an early tumor recognition system based on deep learning and hyperspectral images.

Background Art

The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

Cancer is one of the most common diseases in the world. In recent years its incidence and mortality have continued to rise, seriously endangering human health, and cancer has become the "number one killer" of human health. As early cancer screening has advanced year by year, several problems have emerged in practice. First, early tumor screening lacks comprehensiveness and standardization: some screening programs target only specific cancer types, so screening is not thorough and can easily miss the chance to detect other cancers early. In addition, accurate screening measures for common malignant tumors and a full-course health management system for early diagnosis and early treatment have not yet been established. Second, current early tumor recognition mostly relies on tumor markers, which is costly and involves long waiting times; the method also requires labeling, which can easily contaminate cells. There is therefore an urgent need for a comprehensive, rapid, and label-free early tumor recognition system.

Summary of the Invention

In view of the shortcomings of the prior art, the purpose of the present invention is to provide an early tumor recognition system based on deep learning and hyperspectral images that combines the spectral information inherent in hyperspectral images with the integration of the prediction results of multiple deep learning models to achieve accurate early tumor recognition.

To achieve the above purpose, the present invention is implemented through the following technical solutions:

A first aspect of the present invention provides an early tumor recognition system based on deep learning and hyperspectral images, comprising:

an image dimension reduction module, used to acquire a hyperspectral image, extract the spectral information in the hyperspectral image, and convert the three-dimensional hyperspectral image into one-dimensional data;

a characteristic band selection module, used to perform difference analysis on the extracted spectral information and determine the characteristic band according to the analysis result;

a deep learning model module, used to construct a tumor recognition model and use it to perform preliminary tumor recognition on the spectral information in the selected characteristic band;

an ensemble learning module, used to perform ensemble learning on the preliminary tumor recognition results, optimize the tumor recognition model according to the ensemble learning result, and recognize the hyperspectral image under test with the optimized tumor recognition model to obtain the final tumor recognition result.

Further, in the image dimension reduction module, the specific steps of extracting the spectral information in the hyperspectral image are:

selecting a subset of pixels by equal-interval sampling;

extracting the spectral information of the selected pixels.

Further still, a reduction factor is selected based on the proportion of the original image occupied by the different tumor regions, thereby determining the sampling interval.

Further still, the length and width of the tumor region are taken into account when selecting the reduction factor, thereby determining the sampling intervals in the horizontal and vertical directions.

Further, in the characteristic band selection module, the specific steps of performing difference analysis on the extracted spectral information are:

extracting features from the spectral information separately with multiple feature selection methods;

selecting, for each feature selection method, the band with the largest difference in spectral information as a candidate characteristic band;

taking the intersection of all selected candidate characteristic bands as the final characteristic band.

Further still, the multiple feature selection methods include two linear feature selection methods and two nonlinear feature selection methods.

Further, in the deep learning model module, the tumor recognition model consists of two layers. The first-layer model includes a one-dimensional convolutional neural network and a two-dimensional convolutional neural network, and the second-layer model includes a self-attention residual network. The spectral information in the selected characteristic band passes through the one-dimensional convolutional neural network, the two-dimensional convolutional neural network, and the self-attention residual network in sequence.

Further still, the specific steps of using the tumor recognition model to perform preliminary tumor recognition on the spectral information in the selected characteristic band are:

using the one-dimensional convolutional neural network to select the spectral information of the corresponding number of pixels according to the characteristic band range and arrange it into a two-dimensional image;

using the two-dimensional convolutional neural network to capture the contextual information in the two-dimensional image and obtain a feature-enhanced two-dimensional image;

using the self-attention mechanism in the self-attention residual network to perform a self-attention calculation on the two-dimensional image and obtaining the preliminary tumor recognition result from the calculation result.

Further still, the specific steps of the self-attention calculation are:

mapping the two-dimensional image into three different spaces and performing the self-attention calculation;

assigning different weights to different parts of the two-dimensional image according to the attention scores;

weighting the different regions of the two-dimensional image according to the assigned weights.

Further, in the ensemble learning module, the specific steps of performing ensemble learning on the preliminary tumor recognition results are:

training the one-dimensional convolutional neural network and the two-dimensional convolutional neural network separately using the preliminary tumor recognition results;

combining the two trained models to form a first ensemble model, and aggregating the predictions of the two models by weighted voting as the first-layer prediction result;

training the self-attention residual network with the same training set as the two-dimensional convolutional neural network, aggregating the predictions of the trained self-attention residual network model by weighted voting, and integrating the output of the self-attention residual network model with the two outputs of the first ensemble model to obtain the optimized tumor recognition model.

One or more of the above technical solutions have the following beneficial effects:

The present invention discloses an early tumor recognition system based on deep learning and hyperspectral images, which achieves accurate early tumor recognition by accurately extracting the spectral information in hyperspectral images and constructing a two-layer deep learning network. The present invention further optimizes the tumor recognition model formed by the two-layer deep learning network through ensemble learning, which improves the accuracy of tumor recognition and overcomes the high cost, long waiting time, and contamination risk associated with tumor marker labeling in the prior art.

Advantages of additional aspects of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the present invention.

Brief Description of the Drawings

The accompanying drawings, which form a part of the present invention, are provided for a further understanding of the present invention. The exemplary embodiments of the present invention and their descriptions serve to explain the present invention and do not unduly limit it.

FIG. 1 is a framework diagram of the recognition process of the early tumor recognition system based on deep learning and hyperspectral images according to the present invention.

Detailed Description

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the present invention. Unless otherwise specified, all technical and scientific terms used herein have the same meanings as commonly understood by those of ordinary skill in the art to which the present invention belongs.

It should also be noted that the embodiments of the present invention involve data such as hyperspectral images. When the above embodiments are applied to specific products or technologies, user permission or consent must be obtained, and the collection, use, and processing of the relevant data must comply with the applicable laws, regulations, and standards of the relevant countries and regions.

The terms used herein are only for describing specific embodiments and are not intended to limit the exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprise" and/or "include", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.

Embodiment 1:

Embodiment 1 of the present invention provides an early tumor recognition system based on deep learning and hyperspectral images, including an image dimension reduction module, a characteristic band selection module, a deep learning model module, and an ensemble learning module.

As shown in FIG. 1, the image dimension reduction module is used to acquire a hyperspectral image, extract the spectral information in the hyperspectral image, and convert the three-dimensional hyperspectral image into one-dimensional data.

The characteristic band selection module is used to perform difference analysis on the extracted spectral information and determine the characteristic band according to the analysis result.

The deep learning model module is used to construct a tumor recognition model and use it to perform preliminary tumor recognition on the spectral information in the selected characteristic band.

In a specific implementation, in the image dimension reduction module, the specific steps of extracting the spectral information in the hyperspectral image are:

A subset of pixels is selected by equal-interval sampling, where a reduction factor is selected based on the proportion of the original image occupied by the different tumor regions, thereby determining the sampling interval. The length and width of the tumor region are taken into account when selecting the reduction factor, thereby determining the sampling intervals in the horizontal and vertical directions.

Specifically, let the size of the original image be W×H and the number of pixels occupied by the tumor region be N. Reduction factors for the width and the length, denoted k1 and k2 respectively, are selected for each tumor type such that the total number of pixels selected inside the tumor region is not less than 3N/4. The sampling intervals are then W/k1 and H/k2: one pixel is taken every W/k1 pixels in the horizontal direction of the original image and every H/k2 pixels in the vertical direction, so that the sampled pixels represent the features of the original image. The spectral information of the selected pixels is then extracted for characteristic band selection.
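The equal-interval sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sample_grid` and `extract_spectra`, the nested-list cube layout `cube[y][x]`, the use of integer division for the intervals, and taking the vertical interval from the image height are all choices made here, not details given in the patent.

```python
def sample_grid(width, height, k1, k2):
    """Return (x, y) pixel coordinates sampled at equal intervals.

    Assumption: the horizontal step is width // k1 and the vertical step is
    height // k2, each clamped to at least one pixel.
    """
    step_x = max(1, width // k1)
    step_y = max(1, height // k2)
    return [(x, y)
            for y in range(0, height, step_y)
            for x in range(0, width, step_x)]


def extract_spectra(cube, coords):
    """Collect the spectrum of every sampled pixel.

    Assumption: cube[y][x] holds the list of band intensities of one pixel,
    so the result is a flat list of one-dimensional spectra.
    """
    return [cube[y][x] for x, y in coords]
```

For an 8×6 image with k1 = 4 and k2 = 3, `sample_grid(8, 6, 4, 3)` picks every second pixel in both directions. Checking that at least 3N/4 of the tumor pixels are covered would be a separate step that depends on a tumor mask, which this sketch does not model.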

In a specific implementation, in the characteristic band selection module, the specific steps of performing difference analysis on the extracted spectral information are:

Features are extracted from the spectral information separately with multiple feature selection methods, where the multiple feature selection methods include two linear feature selection methods and two nonlinear feature selection methods.

This embodiment uses two traditional linear feature selection methods, principal component analysis and t-test analysis, and two newer nonlinear feature selection methods, kernel principal component analysis and a genetic algorithm, to select the characteristic bands. For each method, the bands with the largest difference in spectral information are selected as that method's characteristic bands; for example, under the t-test, bands with a p-value < 0.05 are selected as the most suitable characteristic bands for that test. The intersection of the four characteristic band results from the two linear and two nonlinear methods is taken as the most suitable characteristic band, and the spectral information in that band is fed as input to the subsequent deep learning model to improve its accuracy.
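The intersection step can be illustrated with a short sketch. It assumes each selection method has already produced a set of candidate band indices; the function names, the dictionary keys, and the example values are hypothetical, and only the p < 0.05 criterion comes from the text.

```python
def t_test_candidates(p_values, alpha=0.05):
    """Keep the band indices whose t-test p-value is below alpha."""
    return {band for band, p in enumerate(p_values) if p < alpha}


def select_feature_bands(candidates):
    """Intersect the candidate band sets produced by every selection method.

    `candidates` maps a method name to an iterable of band indices.
    """
    band_sets = [set(bands) for bands in candidates.values()]
    if not band_sets:
        return []
    return sorted(set.intersection(*band_sets))


# Hypothetical candidate sets for the four methods named in the embodiment.
example = {
    "pca": {1, 2, 3, 5},
    "t_test": t_test_candidates([0.5, 0.01, 0.02, 0.2, 0.9, 0.03]),
    "kernel_pca": {2, 3, 5},
    "genetic_algorithm": {0, 2, 5},
}
selected = select_feature_bands(example)
```

With these made-up inputs only bands 2 and 5 are chosen by all four methods. How principal component analysis, kernel principal component analysis, and the genetic algorithm rank bands is not specified in the text and is left out here.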

In a specific implementation, in the deep learning model module, the tumor recognition model consists of two layers. The first-layer model includes a one-dimensional convolutional neural network and a two-dimensional convolutional neural network, and the second-layer model includes a self-attention residual network. The one-dimensional spectral information in the selected characteristic band passes through the one-dimensional convolutional neural network; the one-dimensional spectral information is then processed into a two-dimensional image and fed to the two-dimensional convolutional neural network and the self-attention residual network, after which ensemble learning is performed to improve robustness and reduce overfitting.

The specific steps of using the tumor recognition model to perform preliminary tumor recognition on the spectral information in the selected characteristic band are:

The one-dimensional convolutional neural network is used to select the spectral information of the corresponding number of pixels according to the characteristic band range and arrange it into a two-dimensional image. Specifically, the spectral information of the pixels extracted by the image dimension reduction module that falls within the selected characteristic band range is scaled to the image grayscale range: taking the maximum value in the spectral information as I_max, any spectral value I is scaled to P = I × 255 / I_max. The extracted pixels are then arranged into a two-dimensional image. For example, if 1024 pixels are extracted in total and the selected characteristic band range is 400-700 nm, the resulting two-dimensional image has a size of 1024×300.
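The grayscale scaling P = I × 255 / I_max and the row-wise arrangement can be sketched as follows. The function name `spectra_to_image` and the half-open slice `[band_start, band_end)` used to pick the characteristic bands are assumptions for illustration; the text does not say how bands are indexed.

```python
def spectra_to_image(spectra, band_start, band_end):
    """Scale the selected bands of each spectrum to 0-255 and stack them as rows.

    Each input spectrum becomes one image row, so N spectra with B selected
    bands give an N x B image.
    """
    selected = [spectrum[band_start:band_end] for spectrum in spectra]
    i_max = max(max(row) for row in selected)
    if i_max <= 0:
        raise ValueError("spectral values must contain a positive maximum")
    # P = I * 255 / I_max, applied to every spectral value.
    return [[value * 255 / i_max for value in row] for row in selected]
```

Called on 1024 spectra with 300 selected bands, this returns a 1024×300 nested list, matching the size in the example above if one band per nanometre is assumed.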

The two-dimensional convolutional neural network is used to capture the contextual information in the two-dimensional image and obtain a feature-enhanced two-dimensional image. Specifically, contextual information in a convolutional neural network can be understood as the feature information extracted when the convolution kernel slides over the image and performs the convolution operation. An ordinary convolution kernel is a contiguous n×n kernel, whereas a dilated convolution inserts gaps (holes) between the kernel elements, with the number of gaps controlled by the dilation rate parameter. The two-dimensional convolutional neural network introduces dilated convolution into its convolutional layers so that the receptive field has a variable scale; a larger receptive field is obtained without increasing the number of parameters or the computational complexity of the network, which allows the contextual information in the spectral image to be captured better.
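To make the effect of the dilation rate concrete, the sketch below computes the span covered by a dilated kernel, k + (k − 1)(d − 1), and applies a valid-mode dilated convolution to a nested-list image. It is a plain-Python illustration that assumes a square kernel and stride 1, not the network layer used in the patent.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """Valid-mode 2-D dilated convolution (cross-correlation), stride 1.

    Assumption: `kernel` is square; `image` is a rectangular nested list.
    """
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(rows - span + 1):
        out_row = []
        for c in range(cols - span + 1):
            total = 0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    total += kernel[i][j] * image[r + i * dilation][c + j * dilation]
            out_row.append(total)
        output.append(out_row)
    return output
```

A 3×3 kernel with dilation 2 covers a 5×5 window while still using only nine weights, which is the sense in which the receptive field grows without adding parameters.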

For the self-attention residual network, the specific steps of the self-attention calculation are:

The two-dimensional image is mapped into three different spaces and the self-attention calculation is performed; the three spaces are the Q, K, and V spaces of the self-attention mechanism.

Different weights are assigned to different parts of the two-dimensional image according to the attention scores.

The different regions of the two-dimensional image are weighted according to the assigned weights to improve the classification accuracy of the model.

Specifically, applying the self-attention mechanism yields self-attention scores for the different positions of the two-dimensional image, and the median of the scores over all positions is taken as the threshold T. When the model performs classification, features in regions with a score below T/2 are discarded, features in regions with a score between T/2 and 3T/2 are given a weight of 1, and features in regions with a score above 3T/2 are given a weight of 2, so that the features used by the model are more representative.
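The median-based weighting rule can be written as a small function. The name `region_weights` is hypothetical, a weight of 0 stands for a discarded region, and treating scores exactly equal to T/2 or 3T/2 as weight 1 is an assumption, since the text does not say which side the boundaries fall on. Scores are assumed to be non-negative.

```python
from statistics import median


def region_weights(scores):
    """Map per-region attention scores to weights 0, 1, or 2.

    T is the median score. Regions below T/2 are discarded (weight 0),
    regions from T/2 to 3T/2 get weight 1, regions above 3T/2 get weight 2.
    """
    threshold = median(scores)
    weights = []
    for score in scores:
        if score < threshold / 2:
            weights.append(0)
        elif score <= 3 * threshold / 2:
            weights.append(1)
        else:
            weights.append(2)
    return weights
```

For example, scores of 0.1, 0.4, 0.5, 0.6, and 0.9 have a median of 0.5, so the first region is dropped, the middle three keep weight 1, and the last is doubled.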

In a specific implementation, the ensemble learning module adopts a hierarchical ensemble learning strategy, which combines the prediction results of the base models hierarchically through a series of ensemble modules. Each ensemble module can combine its inputs according to different rules or weights, and the final prediction result is obtained by weighted averaging across multiple levels. The specific steps of performing ensemble learning on the preliminary tumor recognition results are:

The preliminary tumor recognition results are divided into a training set and a validation set at a ratio of 7:3, and a two-level hierarchy is used. At the first level, the one-dimensional convolutional neural network and the two-dimensional convolutional neural network are trained separately, the two trained models are combined to form the first ensemble model, and the predictions of the two models are aggregated by weighted voting as the first-layer prediction result.
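A 7:3 split can be sketched as below. The text gives only the ratio, so the shuffling, the fixed seed, and the function name `split_train_validation` are assumptions made for this illustration.

```python
import random


def split_train_validation(samples, train_parts=7, total_parts=10, seed=0):
    """Shuffle the samples and split them train_parts : (total - train_parts).

    Integer arithmetic is used for the cut point to avoid float rounding.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // total_parts
    return shuffled[:cut], shuffled[cut:]
```

In practice a stratified split that preserves the class balance may be preferable for tumor data; that refinement is not described in the text and is omitted here.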

The specific steps of aggregating the predictions of the two models by weighted voting as the first-layer prediction result are as follows. The first ensemble model includes 6 one-dimensional convolutional neural networks and 4 two-dimensional convolutional neural networks as base learners, and the result of each base learner counts as one vote. The weight of each one-dimensional convolutional neural network is 1 and the weight of each two-dimensional convolutional neural network is 2. Each vote is multiplied by the weight of the learner that cast it, the weighted votes for each class are summed, and the classes with the largest and second-largest weighted vote totals are output as output A and output B of the first ensemble model, respectively. The initial weight of output A is set to 2 and the initial weight of output B is set to 1, and ensemble learning is then performed again together with the self-attention residual network.
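The first-level weighted vote can be sketched as follows, using the weights stated above (1 per one-dimensional network, 2 per two-dimensional network). The function name, the class labels, and the tie-breaking rule (the class seen first wins) are assumptions; the text does not say how ties are resolved. At least one prediction is assumed to be present.

```python
from collections import defaultdict


def first_level_vote(cnn1d_preds, cnn2d_preds, weight_1d=1, weight_2d=2):
    """Return (output_a, output_b, tally) for the first ensemble model.

    output_a is the class with the largest weighted vote total and output_b
    the class with the second largest (None if only one class received votes).
    """
    tally = defaultdict(int)
    for label in cnn1d_preds:
        tally[label] += weight_1d
    for label in cnn2d_preds:
        tally[label] += weight_2d
    # sorted() is stable, so ties keep first-seen order.
    ranked = sorted(tally.items(), key=lambda item: item[1], reverse=True)
    output_a = ranked[0][0]
    output_b = ranked[1][0] if len(ranked) > 1 else None
    return output_a, output_b, dict(tally)
```

For instance, if 4 of the 6 one-dimensional networks predict "normal" but 3 of the 4 two-dimensional networks predict "tumor", the weighted totals are 8 for "tumor" and 6 for "normal", so output A is "tumor" and output B is "normal".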

At the second level, the self-attention residual network is trained with the same training set as the two-dimensional convolutional neural network, the predictions of the trained self-attention residual network model are aggregated by weighted voting, and the output of the self-attention residual network model is integrated with the two outputs of the first ensemble model to obtain the second ensemble model, which serves as the optimized tumor recognition model.

The specific steps of aggregating the predictions of the trained self-attention residual network model by weighted voting are as follows. The second ensemble model includes the two outputs A and B of the first ensemble model and 3 self-attention residual network models as base learners. The weight of each self-attention residual network is set to 1; if its output matches output A or output B of the first ensemble model, its vote is added to A or B accordingly, and the result with the most votes is selected as the final result of the overall model.
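The second-level vote can be sketched in the same style. Output A starts with weight 2 and output B with weight 1, as stated earlier, and each self-attention residual network adds 1 to whichever of A or B it agrees with. Ignoring a prediction that matches neither A nor B, and preferring A on a tie, are assumptions made here because the text does not cover those cases.

```python
def second_level_vote(output_a, output_b, sarn_preds,
                      weight_a=2, weight_b=1, weight_sarn=1):
    """Combine outputs A and B with the self-attention residual network votes.

    Returns (final_label, votes). Predictions matching neither A nor B are
    ignored (assumption); ties go to A because it is inserted first.
    """
    votes = {output_a: weight_a}
    if output_b is not None and output_b != output_a:
        votes[output_b] = weight_b
    for label in sarn_preds:
        if label in votes:
            votes[label] += weight_sarn
    final_label = max(votes, key=votes.get)
    return final_label, votes
```

With A = "tumor" and B = "normal", three self-attention networks that all predict "normal" overturn the first-level result (4 votes against 2), while a single dissenting network does not.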

The validation set is then used to evaluate the performance of the model after the two levels of ensembling. Once training and evaluation are complete, the final tumor recognition model is applied to unseen data for prediction.

Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort still fall within the scope of protection of the present invention.

Claims

1. An early tumor recognition system based on deep learning and hyperspectral images, comprising:

an image dimension reduction module, configured to acquire a hyperspectral image, extract spectral information from the hyperspectral image, and convert the three-dimensional hyperspectral image into one-dimensional data;

a characteristic band selection module, configured to perform difference analysis on the extracted spectral information and determine a characteristic band according to the analysis result;

a deep learning model module, configured to construct a tumor recognition model and use the tumor recognition model to perform preliminary tumor recognition on the spectral information in the selected characteristic band;

and an ensemble learning module, configured to perform ensemble learning on the preliminary tumor recognition results, optimize the tumor recognition model according to the ensemble learning result, and recognize the hyperspectral image under test with the optimized tumor recognition model to obtain a final tumor recognition result.

2. The early tumor recognition system based on deep learning and hyperspectral images according to claim 1, wherein the specific steps of extracting the spectral information from the hyperspectral image in the image dimension reduction module are:

selecting a subset of pixels by equal-interval sampling;

extracting the spectral information of the selected pixels.

3. The early tumor recognition system based on deep learning and hyperspectral images according to claim 2, wherein a reduction factor is selected based on the proportion of the original image occupied by the different tumor regions, thereby determining the sampling interval.

4. The early tumor recognition system based on deep learning and hyperspectral images according to claim 3, wherein the length and width of the tumor region are taken into account when selecting the reduction factor, thereby determining the sampling intervals in the horizontal direction and the vertical direction.

5. The early tumor recognition system based on deep learning and hyperspectral images according to claim 1, wherein the specific steps of performing difference analysis on the extracted spectral information in the characteristic band selection module are:

extracting features from the spectral information separately with a plurality of feature selection methods;

selecting, for each feature selection method, the band with the largest difference in spectral information as a candidate characteristic band;

and taking the intersection of all selected candidate characteristic bands as the final characteristic band.

6. The early tumor recognition system based on deep learning and hyperspectral images according to claim 5, wherein the plurality of feature selection methods includes two linear feature selection methods and two nonlinear feature selection methods.

7. The early tumor recognition system based on deep learning and hyperspectral images according to claim 1, wherein in the deep learning model module, the tumor recognition model consists of two layers, the first-layer model comprises a one-dimensional convolutional neural network and a two-dimensional convolutional neural network, the second-layer model comprises a self-attention residual network, and the spectral information in the selected characteristic band passes through the one-dimensional convolutional neural network, the two-dimensional convolutional neural network, and the self-attention residual network in sequence.

8. The early tumor recognition system based on deep learning and hyperspectral images according to claim 7, wherein the specific steps of using the tumor recognition model to perform preliminary tumor recognition on the spectral information in the selected characteristic band are:

using the one-dimensional convolutional neural network to select the spectral information of the corresponding number of pixels according to the characteristic band range and arrange it into a two-dimensional image;

using the two-dimensional convolutional neural network to capture contextual information in the two-dimensional image to obtain a feature-enhanced two-dimensional image;

and using the self-attention mechanism in the self-attention residual network to perform a self-attention calculation on the two-dimensional image, and obtaining the preliminary tumor recognition result from the calculation result.

9. The early tumor recognition system based on deep learning and hyperspectral images according to claim 8, wherein the specific steps of the self-attention calculation are:

mapping the two-dimensional image into three different spaces and performing the self-attention calculation;

assigning different weights to different parts of the two-dimensional image according to the attention scores;

and weighting the different regions of the two-dimensional image according to the assigned weights.

10. The early tumor recognition system based on deep learning and hyperspectral images according to claim 9, wherein the specific steps of performing ensemble learning on the preliminary tumor recognition results in the ensemble learning module are:

training the one-dimensional convolutional neural network and the two-dimensional convolutional neural network separately using the preliminary tumor recognition results;

combining the two trained models to form a first ensemble model, and aggregating the prediction results of the two models by weighted voting as a first-layer prediction result;

and training the self-attention residual network with the same training set as the two-dimensional convolutional neural network, aggregating the prediction results of the trained self-attention residual network model by weighted voting, and integrating the output of the self-attention residual network model with the two outputs of the first ensemble model to obtain the optimized tumor recognition model.