CN118606855A

CN118606855A - Soil contamination detection method and device based on artificial intelligence

Info

Publication number: CN118606855A
Application number: CN202410648396.4A
Authority: CN
Inventors: 王峰; 李婷婷
Original assignee: Shandong Guangyufeng Agriculture And Forestry Technology Co ltd
Current assignee: Shandong Guangyufeng Agriculture And Forestry Technology Co ltd
Priority date: 2024-05-23
Filing date: 2024-05-23
Publication date: 2024-09-06

Abstract

The invention provides a soil pollution degree detection method and device based on artificial intelligence, and relates to the technical field of data processing. The method comprises: acquiring sample data of sample soil and labeling the sample data; performing data expansion on the labeled sample data to obtain expanded data; performing data processing on the expanded data to obtain training data; inputting the training data into a classifier to obtain a final classification model; after target data corresponding to target soil are acquired, performing data processing on the target data to obtain target data after feature dimensionality reduction; and inputting the target data after feature dimensionality reduction into the final classification model to obtain the pollution degree corresponding to the target soil. A soil pollution classification model is trained on the training data obtained by labeling, expanding and processing the sample data and is used to detect the soil pollution state, so that the soil pollution detection cost is reduced, the detection time is shortened, and the application range is expanded.

Description

Soil contamination detection method and device based on artificial intelligence

Technical Field

The present invention relates to the field of data processing technology, and in particular to a soil contamination detection method and device based on artificial intelligence.

Background Art

With the rapid development of industrialization and urbanization, soil pollution has become a major environmental problem worldwide. The accumulation of heavy metals, organic pollutants and other harmful chemicals in soil poses a serious threat to human health and ecosystems. It is therefore particularly important to develop a method that can determine the degree of soil pollution quickly and accurately. Traditional soil pollution detection methods mostly rely on chemical analysis. Although accurate, these methods are usually time-consuming and costly, and are not suitable for large-scale or real-time monitoring. They are also complicated to operate and require professional technicians for sample processing and data interpretation, which limits their application in real-time and wide-area environmental monitoring.

Summary of the Invention

In view of this, the purpose of the present invention is to provide a soil contamination detection method and device based on artificial intelligence. A soil pollution classification model is trained on training data obtained by labeling, expanding and processing sample data and is used to detect the soil pollution state, thereby reducing the cost of soil pollution detection, shortening the detection time, and expanding the scope of application.

In a first aspect, the present invention provides a soil contamination detection method based on artificial intelligence, comprising: obtaining sample data of sample soil, and labeling the sample data, wherein the labeled category characterizes the degree of contamination of the sample soil; performing data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data; performing data processing on the expanded data to obtain training data, wherein the data processing includes feature extraction processing and feature dimensionality reduction processing; inputting the training data into a classifier to obtain a final classification model, wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment; after obtaining target data corresponding to target soil, performing the data processing on the target data to obtain target data after feature dimensionality reduction; and inputting the target data after feature dimensionality reduction into the final classification model to obtain the degree of contamination corresponding to the target soil.

In some preferred embodiments of the present invention, the step of training the generative adversarial network based on variational quantum coding includes: constructing a first quantum latent space through variational quantum coding; generating data, by a generator of the generative adversarial network, from features sampled in the first quantum latent space; distinguishing, by a discriminator of the generative adversarial network, between the data generated by the generator and real data; and iteratively optimizing the parameters of the generator and the parameters of the discriminator through adversarial training until the diversity of the data generated by the generator meets a preset diversity condition.

In some preferred embodiments of the present invention, the generated data is constrained by an incremental loss function during adversarial training, wherein the incremental loss function is defined in terms of the loss function of the generator, a first adjustment coefficient of the penalty term during iteration, and a second adjustment coefficient of the penalty term during iteration.

In some preferred embodiments of the present invention, the diversity of the data generated by the generator is evaluated based on the generated data, the mean of the generated data, the number of generated data samples, and the dimension of the generated data.
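As an illustration only, a diversity measure built from exactly these four quantities could be the mean squared deviation of the generated samples from their mean, normalized by the sample count N and the dimension d. The formula, the function names `diversity_score` and `diversity_satisfied`, and the threshold test are assumptions made for this sketch; this section does not state the actual formula.

```python
from typing import Sequence


def diversity_score(samples: Sequence[Sequence[float]]) -> float:
    """Mean squared deviation of generated samples from their mean.

    Assumed form (not taken from the source):
        D = (1 / (N * d)) * sum_i sum_j (x_ij - mean_j) ** 2
    where N is the number of samples and d is their dimension.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one generated sample")
    d = len(samples[0])
    # Per-dimension mean of the generated data.
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    total = sum((row[j] - means[j]) ** 2 for row in samples for j in range(d))
    return total / (n * d)


def diversity_satisfied(samples: Sequence[Sequence[float]], threshold: float) -> bool:
    """Hypothetical stopping test: diversity must reach a preset threshold."""
    return diversity_score(samples) >= threshold
```

Under this assumed definition, a generator that has collapsed to a single output scores 0, so adversarial training would continue until the score rises to the preset threshold.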

In some preferred embodiments of the present invention, the step of performing data processing on the expanded data to obtain training data includes: performing feature extraction on the expanded data through a neural network model optimized based on a submodular function to obtain feature-extracted data, wherein the optimization of the submodular function is subject to strategy constraints; and inputting the feature-extracted data into a feature dimensionality reduction model to obtain dimensionality-reduced data, and determining the dimensionality-reduced data as the training data, wherein the feature dimensionality reduction model is determined based on an autoencoder neural network algorithm with latent quantum coding.

In some preferred embodiments of the present invention, the step of training the neural network model optimized based on the submodular function includes: initializing the parameters of the neural network model; setting the strategy constraints for the optimization process, wherein the strategy constraint function is defined in terms of the number of constraints, the weight corresponding to each constraint, and the calculation formula of each constraint; and iterating a training operation until a preset maximum number of iterations is reached. The training operation includes: performing feature extraction on the input data with the neural network model, wherein the objective function of the feature extraction is defined in terms of the loss function of the input data, a sparsity constraint term, a hyperparameter that adjusts the sparsity, and a hyperparameter that sets the influence of the strategy constraints; and evaluating the validity of the feature-extracted data and adjusting the strategy constraints and the parameters of the neural network model based on the evaluation result, wherein the evaluation function is defined in terms of the number of evaluation indicators and the calculation formula of each evaluation indicator.
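The way these terms might combine can be sketched as follows. The additive form of the objective, the weighted-sum form of the strategy constraint function, the L1 norm as the sparsity term, and all names (`strategy_penalty`, `l1_sparsity`, `extraction_objective`, `lam`, `mu`) are assumptions chosen for illustration; this section names the ingredients but not how they are combined.

```python
from typing import Callable, Sequence

# A constraint maps a feature vector to a non-negative violation value.
Constraint = Callable[[Sequence[float]], float]


def strategy_penalty(
    features: Sequence[float],
    constraints: Sequence[Constraint],
    weights: Sequence[float],
) -> float:
    """Assumed form: weighted sum over the K constraints, sum_k w_k * c_k(f)."""
    return sum(w * c(features) for c, w in zip(constraints, weights))


def l1_sparsity(features: Sequence[float]) -> float:
    """Assumed sparsity term: L1 norm of the extracted features."""
    return sum(abs(f) for f in features)


def extraction_objective(
    data_loss: float,
    features: Sequence[float],
    constraints: Sequence[Constraint],
    weights: Sequence[float],
    lam: float,  # hyperparameter adjusting sparsity
    mu: float,   # hyperparameter for the influence of the strategy constraints
) -> float:
    """Assumed objective: loss + lam * sparsity + mu * strategy penalty."""
    return (
        data_loss
        + lam * l1_sparsity(features)
        + mu * strategy_penalty(features, constraints, weights)
    )
```

For example, a single hypothetical constraint that penalizes feature vectors whose L1 norm exceeds 2 can be written as `lambda f: max(0.0, sum(abs(v) for v in f) - 2.0)` and passed with a weight of 0.5.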

In some preferred embodiments of the present invention, the step of adjusting the parameters of the neural network model based on the evaluation result includes: adjusting the parameters of the neural network model based on a dynamic feature-aware network adjustment mechanism, wherein the adjustment function is defined in terms of the current parameters of the neural network model, the gradient of the evaluation function with respect to the current parameters, a network structure adjustment function, the learning rate, and a dynamic adjustment factor; the gradient of the evaluation function with respect to the current parameters is defined in terms of the extracted feature values, the target feature values, and the total number of features; and the network structure adjustment function is defined in terms of a hyperparameter controlling the magnitude of the network adjustment, a hyperparameter controlling the sensitivity of the network, and the optimization direction for the generator.

In some preferred embodiments of the present invention, the step of training the feature dimensionality reduction model includes: initializing the parameters of the feature dimensionality reduction model; and iteratively performing a data dimensionality reduction operation until a preset maximum number of iterations is reached. The data dimensionality reduction operation includes: propagating the original data input to the feature dimensionality reduction model forward through an encoder into a second quantum latent space to obtain qubit data; reconstructing the qubit data back into the original space with a decoder to obtain reconstructed data; and updating the parameters of the encoder and the parameters of the decoder with an asynchronous strategy based on the reconstructed data and the original data, wherein the states of the qubits are adjusted based on a restrictive optimization strategy.

In some preferred embodiments of the present invention, the step of inputting the training data into the classifier to obtain the final classification model includes: dividing the training data into a training set and a validation set; and training the classifier on the training set and validating the trained classifier on the validation set until the loss function on the training set reaches a preset threshold, and determining the trained classifier as the final classification model, wherein the learning rate of the classifier is adjusted during training by an adaptive learning rate adjustment mechanism.
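The control flow of this step (split, train until the training loss reaches a threshold, adapt the learning rate, report validation loss) can be sketched as below. A small logistic regression stands in for the classifier, and the learning rate rule (grow by 5% after an improving step, halve otherwise) is one common heuristic; both are assumptions for illustration and are not the probabilistic neural network or the adjustment mechanism of the invention.

```python
import math
import random


def split(rows, val_fraction=0.2, seed=0):
    """Shuffle and divide labeled rows into a training set and a validation set."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def mean_loss(w, b, rows):
    """Mean binary cross-entropy over (features, label) rows."""
    eps = 1e-12
    total = 0.0
    for x, y in rows:
        p = predict(w, b, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(rows)


def train(train_rows, val_rows, loss_threshold=0.1, lr=0.5, max_epochs=500):
    """Gradient descent with an adaptive learning rate (stand-in model)."""
    dim = len(train_rows[0][0])
    w, b = [0.0] * dim, 0.0
    best = mean_loss(w, b, train_rows)
    n = len(train_rows)
    for _ in range(max_epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in train_rows:
            err = predict(w, b, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        cand_w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        cand_b = b - lr * grad_b / n
        cand_loss = mean_loss(cand_w, cand_b, train_rows)
        if cand_loss < best:
            # Step helped: accept it and raise the learning rate slightly.
            w, b, best = cand_w, cand_b, cand_loss
            lr *= 1.05
        else:
            # Step hurt: reject it and halve the learning rate.
            lr *= 0.5
        if best <= loss_threshold:
            break  # training loss reached the preset threshold
    return w, b, best, mean_loss(w, b, val_rows)
```

A call such as `train(*split(rows))` returns the fitted weights together with the final training and validation losses, so the validation loss can be inspected before the model is accepted as final.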

In a second aspect, the present invention provides a soil contamination detection device based on artificial intelligence, comprising: a training data acquisition module, configured to acquire sample data of sample soil and label the sample data, wherein the labeled category characterizes the degree of contamination of the sample soil; a data expansion module, configured to perform data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data; a data processing module, configured to perform data processing on the expanded data to obtain training data, wherein the data processing includes feature extraction processing and feature dimensionality reduction processing; a data classification module, configured to input the training data into a classifier to obtain a final classification model, wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment; a target data acquisition module, configured to, after acquiring target data corresponding to target soil, perform the data processing on the target data to obtain target data after feature dimensionality reduction; and a data evaluation module, configured to input the target data after feature dimensionality reduction into the final classification model to obtain the degree of contamination corresponding to the target soil.

The present invention brings the following beneficial effects:

The present invention provides a soil contamination detection method and device based on artificial intelligence. The method comprises: obtaining sample data of sample soil, and labeling the sample data, wherein the labeled category characterizes the degree of contamination of the sample soil; performing data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data; performing data processing on the expanded data to obtain training data, wherein the data processing includes feature extraction processing and feature dimensionality reduction processing; inputting the training data into a classifier to obtain a final classification model, wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment; after obtaining target data corresponding to target soil, performing the data processing on the target data to obtain target data after feature dimensionality reduction; and inputting the target data after feature dimensionality reduction into the final classification model to obtain the degree of contamination corresponding to the target soil. A soil pollution classification model is trained on the training data obtained by labeling, expanding and processing the sample data and is used to detect the soil pollution state, thereby reducing the cost of soil pollution detection, shortening the detection time, and expanding the scope of application.

Brief Description of the Drawings

In order to explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a soil contamination detection method based on artificial intelligence provided by an embodiment of the present invention;

FIG. 2 is a flowchart of training a generative adversarial network based on variational quantum coding provided by an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a soil contamination detection device based on artificial intelligence provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.

Reference numerals: 310 - training data acquisition module; 320 - data expansion module; 330 - data processing module; 340 - data classification module; 350 - target data acquisition module; 360 - data evaluation module; 400 - memory; 401 - processor; 402 - bus; 403 - communication interface.

Detailed Description

In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. The components of the embodiments generally described and shown in the drawings herein can be arranged and designed in a variety of different configurations.

Therefore, the following detailed description of the embodiments of the present invention provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

It should be noted that similar reference numerals and letters denote similar items in the following drawings. Therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.

In the description of the present invention, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inside" and "outside" are based on the orientations or positional relationships shown in the drawings, or on the orientation or position in which the product of the invention is usually placed in use. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation. They therefore cannot be understood as limiting the present invention. In addition, the terms "first", "second", "third" and the like are used only to distinguish descriptions, and cannot be understood as indicating or implying relative importance.

In addition, terms such as "horizontal", "vertical" and "overhanging" do not require a component to be absolutely horizontal or overhanging; it may be slightly tilted. For example, "horizontal" only means that the direction is more horizontal than "vertical", and does not mean that the structure must be completely horizontal.

In the description of the present invention, it should also be noted that, unless otherwise clearly specified and limited, the terms "arranged", "installed", "coupled" and "connected" should be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be internal communication between two elements. A person of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.

Some technical means are disclosed in the prior art. In one such scheme, a central control unit divides the acquisition path of an acquisition unit, which helps ensure the accuracy of the soil samples and reduces the error in soil pollution monitoring, and thereby its effect on water pollution monitoring. A detection unit compares the heavy metal content of water samples and riverbed samples with that of the soil samples to confirm the source of the soil pollution. The scheme can compute the detection path and the number of soil samples automatically, determine the pollution source, and locate anomalies promptly so that they can be investigated quickly, which saves considerable time. The rapid division of the path improves the efficiency of soil detection and further avoids crop contamination resulting from soil pollution that is itself caused by water pollution.

However, in the prior art, the training samples used during training are often limited in authenticity and diversity, which makes it difficult to effectively improve the generalization ability of the model.

Based on this, the present invention provides a soil contamination detection method and device based on artificial intelligence. Some embodiments of the present invention are described in detail below with reference to the drawings. The following embodiments and the features in the embodiments can be combined with each other where they do not conflict.

Embodiment 1

An embodiment of the present invention provides a soil contamination detection method based on artificial intelligence. Referring to the flowchart shown in FIG. 1, the method includes:

Step S102, obtaining sample data of the sample soil, and labeling the sample data, wherein the labeled category characterizes the degree of contamination of the sample soil.

Specifically, sample soil is collected and analyzed in a laboratory to obtain the sample data. The data may include chemical composition, physical properties, biological indicators and the like. Each sample is then labeled with its degree of pollution, which can be done using expert knowledge or by reference to existing environmental standards. The labeled results serve as the basis for the subsequent training of the machine learning model.

In some preferred embodiments of the present invention, the soil pollution data is collected from soil samples in multiple geographical areas covering different pollution levels, including industrial, agricultural and residential areas. Random sampling is used to ensure that the samples are representative and diverse.

In some preferred embodiments of the present invention, automated sampling equipment is used to collect soil samples at different locations. The equipment can be fitted with a GPS positioning system to record the sampling locations accurately. The soil samples are sent to a laboratory for detailed analysis, including determination of parameters such as organic pollutant content, inorganic content and heavy metal content. Experts then classify and label the degree of contamination of each sample according to current environmental standards and soil quality guidelines.

In some preferred embodiments of the present invention, the collected data attributes include: a1: pH value; a2: organic matter content (%); a3: lead content (ppm); a4: mercury content (ppm); a5: arsenic content (ppm); a6: cadmium content (ppm); a7: nitrogen content (%); a8: phosphorus content (%); a9: potassium content (%); and a10: total heavy metal content (ppm). It should be emphasized that this embodiment only illustrates one data format and set of attributes for the present invention. In practical applications the data usually has more than 10 attributes, and the number may reach dozens or even hundreds. The collected data is then labeled manually. In some preferred embodiments of the present invention, the labeled categories are "unpolluted", "mildly polluted", "moderately polluted" and "severely polluted".
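The ten attributes and four labels above map naturally onto a small record type. The following is a minimal sketch; the class and field names (`SoilSample`, `PollutionLevel`, `ph`, `lead_ppm` and so on) and the integer encoding of the labels are hypothetical choices made for illustration.

```python
from dataclasses import astuple, dataclass
from enum import IntEnum
from typing import List


class PollutionLevel(IntEnum):
    """The four manually assigned categories (integer codes are an assumption)."""
    UNPOLLUTED = 0
    MILDLY_POLLUTED = 1
    MODERATELY_POLLUTED = 2
    SEVERELY_POLLUTED = 3


@dataclass(frozen=True)
class SoilSample:
    ph: float                      # a1: pH value
    organic_matter_pct: float      # a2: organic matter content (%)
    lead_ppm: float                # a3: lead content (ppm)
    mercury_ppm: float             # a4: mercury content (ppm)
    arsenic_ppm: float             # a5: arsenic content (ppm)
    cadmium_ppm: float             # a6: cadmium content (ppm)
    nitrogen_pct: float            # a7: nitrogen content (%)
    phosphorus_pct: float          # a8: phosphorus content (%)
    potassium_pct: float           # a9: potassium content (%)
    total_heavy_metals_ppm: float  # a10: total heavy metal content (ppm)
    label: PollutionLevel

    def features(self) -> List[float]:
        """Return a1..a10 as a feature vector, without the label."""
        return list(astuple(self)[:-1])
```

A labeled sample is then created by passing the ten measured values and a `PollutionLevel` member, and `features()` yields the vector handed to the later processing steps.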

Step S104, performing data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data.

Specifically, in order to improve the generalization ability of the model, a generative adversarial network based on variational quantum coding is used to expand the existing data. This technique can generate new, diverse soil sample data that follows the real distribution, thereby increasing the size and diversity of the dataset.

In some preferred embodiments of the present invention, concepts from quantum computing are introduced: data is encoded with quantum bits (qubits), and the principles of quantum entanglement and superposition are used to improve the efficiency and capability of data processing.

Step S106, performing data processing on the expanded data to obtain training data, wherein the data processing includes feature extraction processing and feature dimensionality reduction processing.

Specifically, the data processing consists of two parts, feature extraction and feature dimensionality reduction. Feature extraction extracts information from the raw data that is useful for the classification task, while feature dimensionality reduction reduces the complexity of the data, removes redundant information, and improves computational efficiency.

In some preferred embodiments of the present invention, advanced data processing techniques such as principal component analysis (PCA), wavelet transforms or deep learning feature extractors are applied to extract key information from the raw data.

Step S108, inputting the training data into the classifier to obtain a final classification model, wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment.

Specifically, the processed data is input into a classifier for training. The classifier used here is a probabilistic neural network classifier based on adaptive learning rate adjustment. An adaptive learning rate helps the model converge to the optimal solution faster and improves training efficiency.
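For orientation, the classical probabilistic neural network can be sketched in a few lines: a pattern layer of Gaussian kernels centred on the training samples, a summation layer that averages the kernels per class, and a decision layer that picks the class with the largest score. This sketch uses a fixed smoothing parameter `sigma` and does not show the adaptive learning rate adjustment of the invention; the class name and method names are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


class ProbabilisticNeuralNetwork:
    """Classical PNN with a single fixed smoothing parameter (a sketch)."""

    def __init__(self, sigma: float = 1.0) -> None:
        self.sigma = sigma
        self._patterns: Dict[Hashable, List[List[float]]] = defaultdict(list)

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[Hashable]):
        # "Training" stores each sample as a pattern unit under its class.
        for x, label in zip(X, y):
            self._patterns[label].append(list(x))
        return self

    def class_scores(self, x: Sequence[float]) -> Dict[Hashable, float]:
        scores = {}
        for label, patterns in self._patterns.items():
            total = 0.0
            for p in patterns:
                sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
                # Pattern layer: Gaussian kernel centred on a stored sample.
                total += math.exp(-sq_dist / (2.0 * self.sigma ** 2))
            # Summation layer: average kernel response for the class.
            scores[label] = total / len(patterns)
        return scores

    def predict(self, x: Sequence[float]) -> Hashable:
        # Decision layer: class with the largest averaged response.
        scores = self.class_scores(x)
        return max(scores, key=scores.get)
```

In this classical form the only tunable quantity is `sigma`, which controls how far each stored sample's influence extends.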

Step S110, after obtaining the target data corresponding to the target soil, performing the data processing on the target data to obtain target data after feature dimensionality reduction.

Specifically, when the degree of contamination of a new target soil sample needs to be detected, the target data must first go through the same data processing flow as the training data to ensure that the features are consistent.
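The consistency requirement amounts to fitting the processing once on the training data and reusing the fitted state, unchanged, on every target sample. The sketch below shows that pattern with per-feature standardization as a stand-in for the invention's feature extraction and dimensionality reduction models; the class name `FittedPreprocessor` and its methods are hypothetical.

```python
import math
from typing import List, Sequence


class FittedPreprocessor:
    """Stand-in transform: standardize each feature with training statistics."""

    def __init__(self) -> None:
        self.means: List[float] = []
        self.stds: List[float] = []

    def fit(self, rows: Sequence[Sequence[float]]) -> "FittedPreprocessor":
        n, d = len(rows), len(rows[0])
        self.means = [sum(r[j] for r in rows) / n for j in range(d)]
        self.stds = []
        for j in range(d):
            var = sum((r[j] - self.means[j]) ** 2 for r in rows) / n
            # Guard against constant features to avoid division by zero.
            self.stds.append(math.sqrt(var) or 1.0)
        return self

    def transform(self, row: Sequence[float]) -> List[float]:
        if len(row) != len(self.means):
            raise ValueError("target data must have the training feature layout")
        # Reuse the statistics learned from the training data; never refit here.
        return [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]
```

Refitting on the target data would shift the feature scale away from what the classifier saw during training, which is the inconsistency this step is meant to avoid.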

Step S112, inputting the target data after feature dimensionality reduction into the final classification model to obtain the degree of contamination corresponding to the target soil.

Specifically, the processed target data is input into the trained final classification model, and the model outputs the degree of contamination of the target soil. The result can be used in fields such as environmental monitoring and agricultural management.

In some preferred embodiments of the present invention, after model training is completed, newly collected soil pollution detection samples are processed through the feature extraction and feature dimensionality reduction steps, and the processed features are input into the classifier to classify the degree of soil pollution. In one embodiment, the classification categories are "unpolluted", "mildly polluted", "moderately polluted" and "severely polluted".

The present invention provides a soil contamination detection method based on artificial intelligence, comprising: obtaining sample data of sample soil, and labeling the sample data, wherein the labeled category characterizes the degree of contamination of the sample soil; performing data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data; performing data processing on the expanded data to obtain training data, wherein the data processing includes feature extraction processing and feature dimensionality reduction processing; inputting the training data into a classifier to obtain a final classification model, wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment; after obtaining target data corresponding to target soil, performing the data processing on the target data to obtain target data after feature dimensionality reduction; and inputting the target data after feature dimensionality reduction into the final classification model to obtain the degree of contamination corresponding to the target soil. A soil pollution classification model is trained on the training data obtained by labeling, expanding and processing the sample data and is used to detect the soil pollution state, thereby reducing the cost of soil pollution detection, shortening the detection time, and expanding the scope of application. With artificial intelligence technology, a large number of soil samples can be analyzed in a relatively short time, and the prediction results of the model are highly reliable. In addition, the method scales well: the model parameters can be adjusted for different application scenarios to meet different needs.

实施例二Embodiment 2

在上述实施例的基础上，本发明实施例提供了另一种基于人工智能的土壤污染度检测方法，重点描述数据扩充、特征提取、特征降维和分类器训练等步骤。On the basis of the above embodiments, the embodiments of the present invention provide another soil contamination detection method based on artificial intelligence, focusing on describing the steps of data expansion, feature extraction, feature dimensionality reduction and classifier training.

可以理解的是，训练数据的采集获取、标注及预处理是耗时耗力的，且训练样本不足容易导致模型泛化能力差，同时影响模型的精度。本发明实施例提出一种基于变分量子编码的生成对抗网络算法，利用量子计算的优势来增强生成模型的能力，同时通过增量式损失函数实现更精细的生成效果控制，不仅考虑了生成数据与真实数据之间的全局差异，还通过逐渐增加的惩罚项来引导模型关注生成过程中的细节，从而在多次迭代后获得更加逼真的数据，以提升模型对小样本数据的学习能力和生成数据的多样性与真实性。It is understandable that the collection, acquisition, labeling and preprocessing of training data are time-consuming and labor-intensive, and insufficient training samples can easily lead to poor generalization ability of the model, while affecting the accuracy of the model. The embodiment of the present invention proposes a generative adversarial network algorithm based on variational quantum coding, which uses the advantages of quantum computing to enhance the ability of the generative model, and at the same time achieves more refined control of the generation effect through an incremental loss function. It not only considers the global difference between the generated data and the real data, but also guides the model to pay attention to the details of the generation process through gradually increasing penalty terms, so as to obtain more realistic data after multiple iterations, so as to improve the model's learning ability for small sample data and the diversity and authenticity of the generated data.

具体的，在本发明一些较佳的实施例中，参见图2所示的本发明实施例提供的一种训练基于变分量子编码的生成对抗网络的流程图，具体包括：Specifically, in some preferred embodiments of the present invention, referring to FIG. 2 , a flowchart of training a generative adversarial network based on variational quantum coding provided by an embodiment of the present invention specifically includes:

Step S202, constructing a first quantum latent space by variational quantum coding.

具体的，首先，利用变分量子编码技术构建一个量子潜在空间。这个空间是高维的，允许存储和处理大量的量子态信息。通过量子电路设计，将土壤样本数据映射到量子位(qubits)上，形成量子态。这些量子态包含了原始数据的复杂关系和模式。使用量子机器学习算法，如变分量子特征提取，从量子态中提取有用的特征，并构建出第一量子潜在空间。Specifically, first, a quantum latent space is constructed using variational quantum coding technology. This space is high-dimensional and allows the storage and processing of a large amount of quantum state information. Through quantum circuit design, soil sample data is mapped to quantum bits (qubits) to form quantum states. These quantum states contain complex relationships and patterns of the original data. Using quantum machine learning algorithms, such as variational quantum feature extraction, useful features are extracted from the quantum states and the first quantum latent space is constructed.

在本发明一些较佳的实施例中，在初始阶段，通过变分量子算法构建潜在空间，所述潜在空间具备高维度特性，为后续生成提供更丰富的数据表示基础；具体包括，定义量子潜在空间的构建过程，通过变分量子编码实现。考虑到变分量子编码的目标是最小化量子态与目标态之间的距离，本发明实施例使用以下方式来表示这一过程：In some preferred embodiments of the present invention, in the initial stage, a latent space is constructed by a variational quantum algorithm, and the latent space has high-dimensional characteristics, providing a richer data representation basis for subsequent generation; specifically, it includes defining the construction process of the quantum latent space, which is implemented by variational quantum coding. Considering that the goal of variational quantum coding is to minimize the distance between the quantum state and the target state, the embodiment of the present invention uses the following method to represent this process:

The quantum state is denoted $|\psi(b_x)\rangle$, where $b_x$ is the feature vector extracted from the data set.

The target state is denoted $|\varphi(b_y)\rangle$, where $b_y$ is the target feature vector, that is, data preprocessed by some form of normalization.

The distance between the quantum state and the target state is measured by the fidelity $F$, calculated as formula (1):

$$F = \left|\langle \psi(b_x) \mid \varphi(b_y) \rangle\right|^2 \tag{1}$$

The goal is to maximize $F$ by adjusting the parameters $b_\theta$ of the quantum encoding process.
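The fidelity in formula (1) can be illustrated with a small classical simulation. The sketch below assumes that both states are given as lists of amplitudes in the same basis; the helper names `normalize` and `fidelity` are illustrative and not part of the patent.

```python
import math


def normalize(amplitudes):
    """Scale an amplitude list so that its squared magnitudes sum to 1."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0:
        raise ValueError("state vector must be non-zero")
    return [a / norm for a in amplitudes]


def fidelity(psi, phi):
    """F = |<psi|phi>|^2 for two pure states given as amplitude lists."""
    if len(psi) != len(phi):
        raise ValueError("states must have the same dimension")
    psi, phi = normalize(psi), normalize(phi)
    # Inner product <psi|phi>: conjugate the first argument.
    overlap = sum(a.conjugate() * b for a, b in zip(psi, phi))
    return abs(overlap) ** 2
```

Identical states give a fidelity of 1 and orthogonal states give 0, so maximizing $F$ pulls the encoded state toward the target state.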

Furthermore, the construction of the quantum state $|\psi(b_x)\rangle$ can be expressed as formula (2):

$$|\psi(b_x)\rangle = U(\theta)\,|0\rangle^{\otimes n} \tag{2}$$

where $U(\theta)$ is a parameterized quantum circuit applied to the initial qubit state $|0\rangle^{\otimes n}$ ($n$ is the number of qubits), and $\theta$ denotes the circuit parameters, which are adjusted by a classical optimization algorithm to minimize the objective function. In some preferred embodiments of the present invention, the number of qubits $n$ is set to 5.
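The patent does not specify the structure of the circuit $U(\theta)$. As a minimal sketch, assume the simplest possible ansatz: one RY rotation per qubit acting on $|0\rangle$, with no entangling gates. This choice, the function name `ry_product_state`, and the bit ordering are assumptions made purely for illustration.

```python
import math


def ry_product_state(thetas):
    """Amplitudes of the state obtained by applying RY(theta_q) to qubit q of |0...0>.

    Assumed ansatz (not from the patent): no entangling gates, so the result
    is a product state. Qubit 0 is treated as the most significant bit.
    """
    n = len(thetas)
    amplitudes = []
    for k in range(2 ** n):
        amp = 1.0
        for q, theta in enumerate(thetas):
            bit = (k >> (n - 1 - q)) & 1
            # RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
            amp *= math.sin(theta / 2) if bit else math.cos(theta / 2)
        amplitudes.append(amp)
    return amplitudes
```

With $n = 5$ the state has $2^5 = 32$ amplitudes. A practical circuit would normally add entangling layers, which this sketch omits.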

The construction of the target state $|\varphi(b_y)\rangle$ can be expressed as formula (3):

$$|\varphi(b_y)\rangle = \sum_{k=1}^{K} c_k\,|k\rangle \tag{3}$$

where $c_k$ is a coefficient calculated from the target feature $b_y$, $K$ is the number of basis states, and $|k\rangle$ is the $k$-th quantum basis state.
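The patent states only that the coefficients $c_k$ are calculated from the target feature $b_y$. One common choice, assumed here for illustration, is amplitude encoding: zero-pad the feature vector to $K$ entries and divide by its Euclidean norm so that the squared coefficients sum to 1. The function name is hypothetical.

```python
import math


def amplitude_coefficients(features, num_basis_states):
    """Return c_k as the L2-normalized, zero-padded feature vector (assumed encoding)."""
    if len(features) > num_basis_states:
        raise ValueError("more features than basis states")
    padded = list(features) + [0.0] * (num_basis_states - len(features))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0:
        raise ValueError("feature vector must be non-zero")
    return [v / norm for v in padded]
```

With five qubits there are 32 basis states, so up to 32 feature values fit into one target state under this assumption.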

进一步的，在构建的量子潜在空间基础上，使用生成对抗网络的框架进行训练。生成器负责从量子潜在空间中采样并生成数据，判别器则尝试区分生成数据与真实数据。通过对抗训练，不断优化生成器和判别器的参数，以提升生成数据的真实性和多样性。Furthermore, based on the constructed quantum latent space, the framework of generative adversarial network is used for training. The generator is responsible for sampling and generating data from the quantum latent space, while the discriminator attempts to distinguish the generated data from the real data. Through adversarial training, the parameters of the generator and discriminator are continuously optimized to improve the authenticity and diversity of the generated data.

步骤S204，生成对抗网络的生成器基于在第一量子潜在空间采集的特征生成数据。Step S204: a generator of a generative adversarial network generates data based on features collected in the first quantum latent space.

Specifically, in the constructed first quantum latent space, latent features that represent the key information of the soil samples are collected through quantum sampling. The generator of the generative adversarial network (GAN) receives these latent features as input and learns to generate synthetic data similar to real soil sample data. The generator consists of a series of neural network layers that transform the latent features into synthetic data whose distribution resembles that of the real sample data.

In some preferred embodiments of the present invention, within the framework of the generative adversarial network based on variational quantum coding, the generator loss function $L_G$ is calculated as formula (4):

$$L_G = -\log D(G(b_z)) + \lambda\,(1 - F) \tag{4}$$

where $b_z$ is a noise vector sampled from the quantum latent space, and $\lambda$ is a balancing term that weights the adversarial loss against the quantum coding distance. In some preferred embodiments of the present invention, $\lambda$ is set to 0.1.
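The generator loss of formula (4) can be sketched for a single sample as follows. The arguments `d_fake` (the discriminator output $D(G(b_z))$) and `fidelity_value` (the fidelity $F$) are assumed to be computed elsewhere, and the small `eps` clamp is an added numerical safeguard that is not mentioned in the patent.

```python
import math


def generator_loss(d_fake, fidelity_value, lam=0.1, eps=1e-12):
    """L_G = -log D(G(b_z)) + lam * (1 - F), with lam = 0.1 as in the embodiment."""
    # Clamp to avoid log(0) when the discriminator is fully confident (assumption).
    adversarial = -math.log(max(d_fake, eps))
    coding_distance = 1.0 - fidelity_value
    return adversarial + lam * coding_distance
```

The loss reaches zero only when the discriminator is fully fooled (`d_fake` equal to 1) and the encoded state matches the target state (`fidelity_value` equal to 1).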

步骤S206，生成对抗网络的判别器判别生成器生成的数据和真实数据的区别。Step S206, the discriminator of the generative adversarial network distinguishes the difference between the data generated by the generator and the real data.

具体的，生成对抗网络的判别器的任务是区分生成器产生的合成数据和真实的土壤样本数据。判别器也是一个神经网络，它通过学习识别数据中的微小差异来评估生成数据的真实性。判别器的输出是一个概率值，表示数据是真实样本的可能性。Specifically, the task of the discriminator of the generative adversarial network is to distinguish between the synthetic data generated by the generator and the real soil sample data. The discriminator is also a neural network that evaluates the authenticity of the generated data by learning to identify small differences in the data. The output of the discriminator is a probability value that indicates the possibility that the data is a real sample.

In some preferred embodiments of the present invention, within the framework of the generative adversarial network based on variational quantum coding, the discriminator loss function $L_D$ is calculated as formula (5):

$$L_D = -\left[\log D(b_y) + \log\left(1 - D(G(b_z))\right)\right] \tag{5}$$
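The discriminator loss of formula (5) can be sketched for one real sample and one generated sample. Here `d_real` stands for $D(b_y)$ and `d_fake` for $D(G(b_z))$; both are assumed to be probabilities in $[0, 1]$, and the `eps` clamp is an added numerical safeguard rather than part of the patent.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """L_D = -[log D(b_y) + log(1 - D(G(b_z)))] for a single real/generated pair."""
    real_term = math.log(max(d_real, eps))
    fake_term = math.log(max(1.0 - d_fake, eps))
    return -(real_term + fake_term)
```

The loss is zero when the discriminator assigns probability 1 to the real sample and probability 0 to the generated one.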

进一步的，在对抗训练过程中，采用增量式损失函数来细化生成数据的质量。所述增量式损失函数通过在每次迭代中增加额外的惩罚项，使得模型更加关注于生成数据的细节和质量，进而实现高质量数据的生成。Furthermore, during adversarial training, an incremental loss function is used to refine the quality of generated data. The incremental loss function adds additional penalty terms in each iteration, so that the model pays more attention to the details and quality of generated data, thereby achieving the generation of high-quality data.

步骤S208，通过对抗训练迭代优化生成器的参数和判别器的参数，直至生成器生成的数据的多样性满足预设的多样性条件。Step S208, iteratively optimize the parameters of the generator and the parameters of the discriminator through adversarial training until the diversity of the data generated by the generator meets the preset diversity conditions.

具体的，在对抗训练过程中，生成器和判别器进行一个迭代的竞争和优化过程。生成器尝试产生越来越真实的数据以“欺骗”判别器，而判别器则努力提高区分真伪的能力。通过反复的训练迭代，生成器逐渐学会生成多样化且高度逼真的土壤样本数据。训练继续进行，直到生成的数据满足预定的多样性条件，即生成的数据足够丰富，能够覆盖真实数据的主要特征和分布。Specifically, during adversarial training, the generator and the discriminator undergo an iterative competition and optimization process. The generator tries to generate more and more realistic data to "cheat" the discriminator, while the discriminator strives to improve its ability to distinguish between true and false. Through repeated training iterations, the generator gradually learns to generate diverse and highly realistic soil sample data. Training continues until the generated data meets the predetermined diversity conditions, that is, the generated data is rich enough to cover the main features and distribution of real data.

In some preferred embodiments of the present invention, the incremental loss function is designed to gradually improve the detail quality of the generated data in each iteration, and the incremental loss function $L_{inc}$ is defined as formula (6):

where $\Delta L_n$ is the penalty term added at the $n$-th iteration, $\alpha$ is the first adjustment coefficient of the incremental loss function, $\beta_n$ is the second adjustment coefficient of the incremental loss function, and $N$ is the number of iterations. In some preferred embodiments of the present invention, $\alpha$ is set to 0.5, and the sequence $\beta_n$ increases exponentially from a starting value of 0.05.
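The text gives $\alpha = 0.5$ and an exponentially increasing $\beta_n$ starting at 0.05, but it does not state the growth factor or how the terms are combined. The sketch below therefore makes two labeled assumptions: a geometric schedule $\beta_n = 0.05 \cdot g^{n}$ with a hypothetical growth factor `growth`, and a penalty of the form $\alpha \sum_n \beta_n \Delta L_n$.

```python
def beta_schedule(num_iterations, start=0.05, growth=1.1):
    """Exponentially increasing beta_n; the growth factor is an assumed value."""
    return [start * growth ** n for n in range(num_iterations)]


def weighted_penalty(penalties, alpha=0.5, start=0.05, growth=1.1):
    """alpha * sum_n beta_n * delta_L_n (assumed combination of the penalty terms)."""
    betas = beta_schedule(len(penalties), start, growth)
    return alpha * sum(b * p for b, p in zip(betas, penalties))
```

Under this assumption later iterations contribute more to the penalty, which matches the stated intent of gradually steering the model toward fine detail.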

Furthermore, $\Delta L_n$ is calculated as formula (7):

where the two terms in formula (7) are the gradient of the generated data with respect to the input and the gradient of the real data with respect to the same input.

通过上述步骤，基于变分量子编码的生成对抗网络能够有效地扩充土壤样本数据集，为后续的模型训练提供更加全面和代表性的数据。这不仅有助于提高模型的泛化能力，还能够在一定程度上解决因实际采样困难或成本过高而造成的数据不足问题。Through the above steps, the generative adversarial network based on variational quantum coding can effectively expand the soil sample data set and provide more comprehensive and representative data for subsequent model training. This not only helps to improve the generalization ability of the model, but also can solve the problem of insufficient data caused by actual sampling difficulties or high costs to a certain extent.

进一步的，在本发明一些较佳的实施例中，在对抗训练中基于增量式损失函数约束生成数据；其中，增量式损失函数基于生成器的损失函数、迭代过程中的惩罚项第一调节系数和迭代过程中的惩罚项第二调节系数约束。Furthermore, in some preferred embodiments of the present invention, data is generated based on an incremental loss function constraint in adversarial training; wherein the incremental loss function is based on the loss function of the generator, a first adjustment coefficient of the penalty term in the iteration process, and a second adjustment coefficient constraint of the penalty term in the iteration process.

具体的，在每一轮训练结束后，根据生成数据与真实数据之间的差异进行反馈调整，本发明实施例使用以下方式来表示这一过程：Specifically, after each round of training, feedback adjustment is performed based on the difference between the generated data and the real data. The embodiment of the present invention uses the following method to represent this process:

The difference measure $D_{diff}$ between the generated data and the real data is calculated as formula (8):

$$D_{diff} = \left\| G(b_z) - b_y \right\|_2 \tag{8}$$

The quantum coding parameters $b_\theta$ and the loss function weights $\alpha$ and $\beta_n$ are then adjusted according to $D_{diff}$.
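The difference measure of formula (8) is the Euclidean distance between a generated sample and a real sample. A minimal sketch, assuming both samples are plain lists of numbers of equal length (the function name is illustrative):

```python
import math


def l2_difference(generated, real):
    """D_diff = ||G(b_z) - b_y||_2 for two vectors of equal length."""
    if len(generated) != len(real):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((g - r) ** 2 for g, r in zip(generated, real)))
```

How $D_{diff}$ is translated into new values of $b_\theta$, $\alpha$ and $\beta_n$ is not specified in the text, so that step is left out of the sketch.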

进一步的，在本发明一些较佳的实施例中，生成器生成的数据的多样性基于生成数据、生成数据的平均值、生成数据的数量和生成数据的维度评估。Furthermore, in some preferred embodiments of the present invention, the diversity of the data generated by the generator is evaluated based on the generated data, the average value of the generated data, the number of generated data, and the dimension of the generated data.

Specifically, to ensure the diversity of the generated data, the present invention evaluates it with a diversity index $D_{var}$, calculated as formula (9):

where $M$ is the number of generated samples, $d$ is the data dimension, and $\bar{x}$ is the mean vector of all generated data.

Furthermore, the mean vector $\bar{x}$ of the generated data is calculated as formula (10):

$$\bar{x} = \frac{1}{M} \sum_{i=1}^{M} x_i \tag{10}$$

where $x_i$ is the $i$-th generated data vector and $M$ is the total number of generated samples. Averaging all generated data vectors yields the mean vector $\bar{x}$, which is used to evaluate the diversity of the generated data. In some preferred embodiments of the present invention, $M$ is set to 1000.
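The mean vector is the element-wise average of all generated data vectors. The exact form of the diversity index is not recoverable from the text, so the sketch below assumes one plausible definition that uses the same four ingredients named above (the generated data, their mean, the count $M$ and the dimension $d$): the mean squared deviation from the mean vector, averaged over samples and dimensions. Both function names are hypothetical.

```python
def mean_vector(samples):
    """Element-wise average of M generated vectors, each of dimension d."""
    m = len(samples)
    d = len(samples[0])
    return [sum(s[j] for s in samples) / m for j in range(d)]


def diversity_index(samples):
    """Assumed diversity score: squared deviation from the mean, averaged over M and d."""
    m = len(samples)
    d = len(samples[0])
    mean = mean_vector(samples)
    total = sum((s[j] - mean[j]) ** 2 for s in samples for j in range(d))
    return total / (m * d)
```

Under this assumption a generator that has collapsed to a single output scores 0, and a larger score indicates a wider spread of generated samples.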

The artificial-intelligence-based soil contamination detection method provided by this embodiment of the present invention evaluates the diversity of the generated data. If the diversity is insufficient, the parameters of the quantum latent space are adjusted to widen the variation range of the generated data, ensuring that the final generated data is both realistic and diverse.

进一步的，现有技术在特征提取方面可能无法有效地过滤掉无关特征，影响模型的性能和准确度。传统特征降维技术可能在数据压缩过程中丢失关键信息，影响后续的数据处理和分析效果。在本发明一些较佳的实施例中，将扩充数据进行数据处理，得到训练数据包括下述步骤A1至A2：Furthermore, the prior art may not be able to effectively filter out irrelevant features in feature extraction, affecting the performance and accuracy of the model. Traditional feature dimensionality reduction technology may lose key information during data compression, affecting subsequent data processing and analysis. In some preferred embodiments of the present invention, the expanded data is processed to obtain training data, including the following steps A1 to A2:

步骤A1，通过基于次模函数优化的神经网络模型对扩充数据进行特征提取，得到特征提取后的数据；其中，基于策略约束次模函数的优化过程。Step A1, extracting features from the expanded data through a neural network model based on submodular function optimization to obtain feature-extracted data; wherein the optimization process of the submodular function is constrained based on a strategy.

具体的，首先，我们需要设计一个基于次模函数优化的神经网络模型，用于从扩充数据中提取有用的特征。次模函数是一种在优化问题中常用的数学工具，它可以保证在满足某些条件的情况下，解的组合仍然是一个解；其次，在神经网络的训练过程中，引入策略约束，这些约束是基于次模函数的，确保特征提取过程能够捕捉到数据中最有意义的信息，同时避免过拟合；再次，将扩充的数据输入到设计好的神经网络模型中，通过前向传播计算，提取出能够代表原始数据重要特性的特征。Specifically, first, we need to design a neural network model based on submodular function optimization to extract useful features from the augmented data. Submodular function is a mathematical tool commonly used in optimization problems. It can ensure that the combination of solutions is still a solution when certain conditions are met; secondly, in the training process of the neural network, policy constraints are introduced. These constraints are based on submodular functions to ensure that the feature extraction process can capture the most meaningful information in the data while avoiding overfitting; thirdly, the augmented data is input into the designed neural network model, and features that can represent the important characteristics of the original data are extracted through forward propagation calculation.

在本发明一些较佳的实施例中，训练基于次模函数优化的神经网络模型包括下述步骤B1至B3：In some preferred embodiments of the present invention, training the neural network model based on submodular function optimization includes the following steps B1 to B3:

步骤B1，初始化神经网络模型的参数。Step B1, initializing the parameters of the neural network model.

Specifically, before training starts, the parameters of the neural network, including the weights and bias terms, must be initialized. These parameters are generated randomly, usually from a specific distribution such as a normal or uniform distribution. Initialization gives the network a starting point from which learning can begin. A good initialization method helps the model converge faster and avoids common problems such as vanishing or exploding gradients.

In some preferred embodiments of the present invention, the parameters of the neural network model are initialized to an initial parameter set, in which the subscript 0 denotes the initial state before training begins.

步骤B2，设定优化过程中的策略约束条件；其中，策略约束的策略约束函数基于约束条件的数量、每个约束条件对应的权重和每个约束条件的计算公式约束。Step B2, setting the strategy constraints in the optimization process; wherein the strategy constraint function of the strategy constraint is based on the number of constraints, the weight corresponding to each constraint and the calculation formula constraint of each constraint.

具体的，在优化过程中，引入策略约束条件来指导神经网络的学习过程。这些约束条件是基于次模函数的，它们定义了优化问题的解决方案空间。每个约束条件都有一个对应的权重，这个权重决定了该约束在优化过程中的重要性。权重越高，对应的约束条件对模型的影响越大。约束条件的计算公式是预先定义的，它们可以是关于数据的某些性质，如稀疏性、平滑性或其他与问题相关的特征。Specifically, during the optimization process, policy constraints are introduced to guide the learning process of the neural network. These constraints are based on submodular functions, and they define the solution space of the optimization problem. Each constraint has a corresponding weight, which determines the importance of the constraint in the optimization process. The higher the weight, the greater the impact of the corresponding constraint on the model. The calculation formulas of the constraints are pre-defined, and they can be about certain properties of the data, such as sparsity, smoothness, or other characteristics related to the problem.

In some preferred embodiments of the present invention, strategy constraints are set for the optimization process. These constraints are formulated according to the specific requirements of soil contamination detection, to ensure that the extracted features accurately reflect the degree of soil contamination. Specifically, a strategy constraint $C(c_\theta)$ is set, where $c_\theta$ denotes the network parameters. The constraint function is formalized as formula (11):

$$C(c_\theta) = \sum_{i=1}^{N_c} c_i\, f_i(c_\theta) \tag{11}$$

where $N_c$ is the number of constraints, $c_i$ is the weight of the $i$-th constraint, and $f_i(c_\theta)$ is the calculation formula of the $i$-th constraint, which maps the network parameters to the degree to which that constraint is satisfied.
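The description above names a number of constraints, a weight per constraint and a calculation formula per constraint. A minimal sketch, assuming the strategy constraint is the weighted sum of the individual constraint terms; the function name and the use of Python callables for the $f_i$ are illustrative choices.

```python
def strategy_constraint(params, weights, constraint_fns):
    """Weighted sum of constraint terms: sum_i c_i * f_i(params) (assumed form)."""
    if len(weights) != len(constraint_fns):
        raise ValueError("need exactly one weight per constraint")
    return sum(w * f(params) for w, f in zip(weights, constraint_fns))
```

A larger weight makes the corresponding constraint dominate the total, which matches the statement that higher-weighted constraints influence the model more strongly.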

In some preferred embodiments of the present invention, each constraint $f_i(c_\theta)$ can be made concrete. For example, if a constraint is intended to keep the extracted features highly sensitive to excessive pH in soil contamination detection, the constraint can be embodied as formula (12):

where the weight term in formula (12) is the part of the network weights related to constraint $c_i$, $X$ is the input data, $Y_{crack}$ is the expected output for the excessive-pH feature, and $\sigma$ is a hyperparameter controlling the sensitivity.

步骤B3，迭代进行训练操作，直至达到预设的最大迭代次数；训练操作包括：基于神经网络模型对输入数据进行特征提取；其中，特征提取的目标函数基于输入数据的损失函数、稀疏性约束项、调节稀疏性的超参数和策略约束影响力的超参数约束；评估特征提取后的数据的有效性，基于评估结果调整策略约束条件和神经网络模型的参数；其中，评估函数基于评估指标点数量、评估指标的计算公式约束。Step B3, iteratively perform training operations until a preset maximum number of iterations is reached; the training operations include: extracting features from input data based on a neural network model; wherein the objective function of feature extraction is based on a loss function of the input data, a sparsity constraint term, a hyperparameter for adjusting sparsity, and a hyperparameter constraint for the influence of strategy constraints; evaluating the validity of the data after feature extraction, and adjusting strategy constraints and parameters of the neural network model based on the evaluation results; wherein the evaluation function is based on the number of evaluation indicator points and the calculation formula constraints of the evaluation indicators.

具体的，在达到预设的最大迭代次数之前，不断进行以下训练操作：Specifically, before reaching the preset maximum number of iterations, the following training operations are performed continuously:

Feature extraction: features are extracted from the input data using the neural network model with its current parameters. This step optimizes the objective function, which combines the loss function of the input data, the sparsity constraint term, the hyperparameter that adjusts sparsity, and the hyperparameter that adjusts the influence of the strategy constraints.

Effectiveness evaluation: after the features are extracted, their effectiveness is evaluated by computing the evaluation function, which depends on the number of evaluation indicators and on the calculation formula of each indicator.

Parameter adjustment: based on the evaluation results, the strategy constraints and the parameters of the neural network model are adjusted. This is done with the back-propagation algorithm, which computes the gradient of the loss function with respect to the model parameters and updates the parameters accordingly.

In some preferred embodiments of the present invention, a neural network is used to extract features from the input data, and a sparsity constraint is introduced into the feature extraction process. The network parameters are optimized iteratively and the extracted features are adjusted continuously until a predetermined sparsity level is reached. Specifically, the feature extraction objective function is set as formula (13):

$$O(c_\theta, C) = L(c_\theta) + \lambda_s \cdot S(c_\theta) + \gamma_s \cdot C(c_\theta) \tag{13}$$

where $L(c_\theta)$ is a data-based loss function used to evaluate the effect of feature extraction, $S(c_\theta)$ is the sparsity constraint term, $\lambda_s$ is a hyperparameter that adjusts sparsity, and $\gamma_s$ is a hyperparameter that adjusts the influence of the strategy constraints. In some preferred embodiments of the present invention, $\lambda_s$ and $\gamma_s$ are set to 0.2 and 0.8 respectively.

In some preferred embodiments of the present invention, the sparsity constraint term $S(c_\theta)$ can be expressed as formula (14):

$$S(c_\theta) = \left\| c_\theta \right\|_1 \tag{14}$$

where $\|c_\theta\|_1$ is the L1 norm of the parameters $c_\theta$, which promotes sparsity of the network parameters, thereby improving the interpretability of the model and reducing the risk of overfitting.
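The objective of formula (13) adds an L1 sparsity term and the strategy constraint term to the data loss. The sketch below assumes the data loss and the constraint value have already been computed as plain numbers and that the parameters are a flat list; the function names are illustrative.

```python
def l1_norm(params):
    """Sum of absolute values of the parameters."""
    return sum(abs(p) for p in params)


def objective(data_loss, params, constraint_value, lambda_s=0.2, gamma_s=0.8):
    """O = L + lambda_s * ||params||_1 + gamma_s * C, using the embodiment's 0.2 and 0.8."""
    return data_loss + lambda_s * l1_norm(params) + gamma_s * constraint_value
```

For a data loss of 1.0, parameters `[1.0, -2.0]` and a constraint value of 0.5, the three terms are 1.0, 0.6 and 0.4.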

通过上述步骤B1至B3，神经网络模型在策略约束的指导下，逐步学习并提取出有用的特征。这种基于次模函数优化的方法使得模型能够在满足特定约束条件的同时，有效地从数据中学习到有意义的特征表示。这不仅有助于提高模型的性能，还能够保证模型的解释性和泛化能力。Through the above steps B1 to B3, the neural network model gradually learns and extracts useful features under the guidance of policy constraints. This method based on submodular function optimization enables the model to effectively learn meaningful feature representations from data while meeting specific constraints. This not only helps to improve the performance of the model, but also ensures the interpretability and generalization ability of the model.

Furthermore, after each round of training, the effectiveness of the extracted features for the soil contamination detection task is evaluated, and the strategy constraints and network parameters are adjusted according to the evaluation results to further optimize the feature extraction process. Specifically, an evaluation function is used to adjust the network parameters and strategy constraints, and the evaluation function is given by formula (15):

where $M$ is the number of evaluation indicators and $g_j$ is the calculation formula of the $j$-th evaluation indicator, which depends on the output of the feature extraction objective function.

In some preferred embodiments of the present invention, each evaluation indicator $g_j$ can be embodied as a performance metric of the model on a specific task. For example, the accuracy indicator for the excessive-pH detection task can be formula (16):

$$g_j = \frac{TP}{TP + FP} \tag{16}$$

where $TP$ is the number of true positives, that is, the number of correctly identified excessive-pH features, and $FP$ is the number of false positives, that is, the number of features incorrectly identified as excessive pH. The identification is performed by a preset Softmax function.
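An indicator built only from true positives and false positives is usually called precision. The sketch below assumes the indicator is the ratio $TP / (TP + FP)$; returning 0.0 when nothing is flagged is a convention chosen here for illustration, not something the patent specifies.

```python
def precision(true_positives, false_positives):
    """TP / (TP + FP); returns 0.0 when no sample was flagged (assumed convention)."""
    flagged = true_positives + false_positives
    if flagged == 0:
        return 0.0
    return true_positives / flagged
```

For example, 8 correct and 2 incorrect excessive-pH identifications give a value of 0.8.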

Furthermore, during the iteration process, the parameters are updated according to formula (17):

$$c_\theta^{(t+1)} = c_\theta^{(t)} - \eta_s\, \nabla_{c_\theta} O\left(c_\theta^{(t)}, C\right) \tag{17}$$

where $\eta_s$ is the learning rate, $\nabla_{c_\theta} O$ is the gradient of the objective function with respect to the network parameters, and $t$ is the current iteration number. In some preferred embodiments of the present invention, the learning rate $\eta_s$ is set to 0.01.
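The update uses a learning rate and the gradient of the objective function with respect to the network parameters. A minimal sketch, assuming a plain gradient-descent step and a gradient that has already been computed (for example by back-propagation); the function name is illustrative.

```python
def gradient_step(params, grads, learning_rate=0.01):
    """One descent step: move each parameter against its gradient (assumed update rule)."""
    if len(params) != len(grads):
        raise ValueError("need one gradient entry per parameter")
    return [p - learning_rate * g for p, g in zip(params, grads)]
```

Repeating this step with the embodiment's learning rate of 0.01 moves the parameters in small increments toward lower values of the objective.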

本发明实施例提出一种基于策略约束次模函数优化的神经网络算法，采用策略约束来引导次模函数优化过程，确保特征提取过程既高效又能保留数据的核心信息，通过在优化过程中加入约束条件，能够有效地提取出对土壤污染度检测最具判别力的特征，同时避免了无关特征的干扰。此外，本发明在特征提取过程中，引入稀疏性约束，以减少特征空间的维度并提升模型的解释性，不仅有助于降低计算复杂度，而且在处理大规模数据时，能够显著提高特征提取的准确性和效率。The embodiment of the present invention proposes a neural network algorithm based on strategy-constrained submodular function optimization, which uses strategy constraints to guide the submodular function optimization process to ensure that the feature extraction process is both efficient and can retain the core information of the data. By adding constraints in the optimization process, the most discriminative features for soil contamination detection can be effectively extracted, while avoiding the interference of irrelevant features. In addition, the present invention introduces sparsity constraints in the feature extraction process to reduce the dimension of the feature space and improve the interpretability of the model, which not only helps to reduce the computational complexity, but also can significantly improve the accuracy and efficiency of feature extraction when processing large-scale data.

进一步的，在本发明一些较佳的实施例中，基于评估结果调整神经网络模型的参数的步骤，包括：基于动态特征感知网络调整机制调整神经网络模型的参数；其中，调整函数基于神经网络模型的当前的参数、评估函数相对于当前的参数的梯度、网络结构调整函数、学习率和动态调整因子约束；评估函数相对于当前的参数的梯度基于提取的特征值、目标特征值和特征的总数约束；网络结构调整函数基于控制网络调整幅度的超参数、控制网络敏感度的超参数和针对生成器的优化方向约束。Furthermore, in some preferred embodiments of the present invention, the step of adjusting the parameters of the neural network model based on the evaluation results includes: adjusting the parameters of the neural network model based on a dynamic feature-aware network adjustment mechanism; wherein the adjustment function is based on the current parameters of the neural network model, the gradient of the evaluation function relative to the current parameters, the network structure adjustment function, the learning rate and the dynamic adjustment factor constraints; the gradient of the evaluation function relative to the current parameters is based on the extracted eigenvalues, the target eigenvalues and the total number of features constraints; the network structure adjustment function is based on the hyperparameters for controlling the network adjustment amplitude, the hyperparameters for controlling the network sensitivity and the optimization direction constraints for the generator.

Specifically, in some preferred embodiments of the present invention, to implement the dynamic feature-aware network adjustment mechanism, the feature extraction effect of the model after the $t$-th iteration is evaluated by the function $D_t(c_\theta, Y_{target})$, where $Y_{target}$ is the target feature vector. The evaluation function is defined as formula (18):

where the two indexed quantities in formula (18) are the $k$-th extracted feature value and the $k$-th target feature value, and $K$ is the total number of features.

Furthermore, based on the value of $D_t(c_\theta, Y_{target})$, the dynamic feature-aware network adjustment mechanism adjusts the network parameters according to formula (19):

where the gradient term in formula (19) is the gradient of the evaluation function with respect to the network parameters, and $\mathrm{Adjust}(c_\theta, D_t)$ is the network structure adjustment function based on the feature extraction effect of the current iteration, defined as formula (20):

$$\mathrm{Adjust}(c_\theta, D_t) = \alpha \cdot \exp\left(-\beta \cdot D_t(c_\theta, Y_{target})\right) \cdot \delta c_\theta \tag{20}$$

where $\alpha$ is a hyperparameter that controls the magnitude of the network adjustment, $\beta$ is a hyperparameter that controls the sensitivity of the network, $\delta c_\theta$ is the optimization direction for $D_t$, which aims to reduce the difference between the feature extraction effect and the target, and $\zeta$ is a dynamic adjustment factor that balances the original gradient descent against the network structure adjustment based on the feature extraction effect.
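The adjustment function of formula (20) scales an optimization direction by a factor that decays exponentially with the evaluation value $D_t$. In the sketch below the direction $\delta c_\theta$ and the value of $D_t$ are assumed to be supplied by the caller, and no default values are given for $\alpha$ and $\beta$ because the text does not state them for this function.

```python
import math


def adjust(direction, d_t, alpha, beta):
    """Adjust = alpha * exp(-beta * D_t) * direction, applied element-wise."""
    scale = alpha * math.exp(-beta * d_t)
    return [scale * v for v in direction]
```

As written, the scale factor equals $\alpha$ when $D_t$ is zero and shrinks as $D_t$ grows.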

在训练过程中，通过动态特征感知网络调整模型，即，通过动态评估特征的变化来指导网络参数的更新，从而实现对土壤污染度检测任务更加敏感和适应性强的模型。具体的，在每次迭代后的评估与调整时，基于当前迭代的特征提取效果来优化网络结构，确保模型能够更加精确地捕捉对当前土壤污染度检测任务至关重要的特征。During the training process, the model is adjusted through a dynamic feature perception network, that is, the network parameters are updated by dynamically evaluating the changes in features, thereby achieving a model that is more sensitive and adaptable to the soil contamination detection task. Specifically, during the evaluation and adjustment after each iteration, the network structure is optimized based on the feature extraction effect of the current iteration to ensure that the model can more accurately capture the features that are critical to the current soil contamination detection task.

Step A2: input the feature-extracted data into the feature dimensionality reduction model to obtain dimension-reduced data, and determine the dimension-reduced data as the training data; wherein the feature dimensionality reduction model is determined based on an autoencoder neural network algorithm with latent quantum encoding.

Specifically, an autoencoder neural network based on latent quantum encoding is first constructed. An autoencoder is an unsupervised neural network that learns an effective representation (encoding) of the data and can reconstruct the data from that representation. Next, the feature-extracted data is input into the autoencoder network, which compresses it into a low-dimensional representation, i.e., the dimension-reduced data. This low-dimensional representation retains the most critical information of the original data while removing noise and redundancy. Finally, the dimension-reduced data is determined as the training data for the final classification model; it is better suited to training because it carries the necessary information at a lower computational cost.

进一步的，在本发明一些较佳的实施例中，训练特征降维模型包括步骤C1至C2：Furthermore, in some preferred embodiments of the present invention, training the feature dimensionality reduction model includes steps C1 to C2:

步骤C1，初始化特征降维模型的参数。Step C1, initializing the parameters of the feature dimensionality reduction model.

Step C2: iteratively perform the data dimensionality reduction operation until a preset maximum number of iterations is reached. The data dimensionality reduction operation includes: propagating the original data input to the feature dimensionality reduction model forward through the encoder into the second quantum latent space to obtain qubit data; reconstructing the qubit data back into the original space with the decoder to obtain reconstructed data; and updating the parameters of the encoder and the parameters of the decoder with an asynchronous strategy based on the reconstructed data and the original data; wherein the state of the qubits is adjusted based on a restrictive optimization strategy.

Specifically, the parameters of the autoencoder network are initialized according to the requirements of latent quantum encoding, including the weights and biases of the encoder and the decoder and the initial state of the qubits.

Specifically, the encoder weights are initialized according to formula (21):

Here, l is the layer index and σ² is the initialization variance. In some preferred embodiments of the present invention, σ² is set to 0.01.

The decoder weights are initialized according to formula (22):

The biases are all initialized to zero, as expressed in formulas (23) and (24):

Furthermore, the qubits are initialized; the initialization of the latent quantum encoding state can be expressed as formula (25):

其中，n为量子位的数量。在本发明一些较佳的实施例中，量子位的数量n被设置为8。Wherein, n is the number of qubits. In some preferred embodiments of the present invention, the number of qubits n is set to 8.
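The initialization described above can be sketched as follows. The variance σ² = 0.01, the zero biases and n = 8 come from the text. Drawing the weights from a zero-mean Gaussian and starting every qubit in an equal superposition are assumptions, since formulas (21), (22) and (25) are not reproduced here. The function names and layer sizes are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, variance=0.01, rng=None):
    """Return (weights, biases) for one encoder or decoder layer.

    Weights: ASSUMED zero-mean Gaussian with the stated variance (0.01).
    Biases: all zeros, as stated for formulas (23) and (24).
    """
    rng = rng or random.Random(0)
    std = math.sqrt(variance)
    weights = [[rng.gauss(0.0, std) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def init_qubits(n=8):
    """Return n qubit states as (amplitude_0, amplitude_1) pairs.

    ASSUMPTION: each qubit starts in an equal superposition, so both
    amplitudes are 1/sqrt(2) and the squared amplitudes sum to 1.
    """
    amp = 1.0 / math.sqrt(2.0)
    return [(amp, amp) for _ in range(n)]
```

For example, `init_layer(16, 8)` returns an 8 x 16 weight matrix with zero biases, and `init_qubits()` returns eight normalized qubit states.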

进一步的，输入降维模型的数据通过编码器进行前向传播，数据在经过层层变换后被编码到潜在空间中。在此过程中，数据被压缩成更低维度的表示，同时量子位的叠加和纠缠特性被用于增强数据的特征表达能力。Furthermore, the data input into the dimensionality reduction model is forward propagated through the encoder, and the data is encoded into the latent space after layer-by-layer transformation. In this process, the data is compressed into a lower-dimensional representation, and the superposition and entanglement properties of quantum bits are used to enhance the feature expression ability of the data.

具体的，编码器前向传播的参数计算可以表示为公式(26)：Specifically, the parameter calculation of the encoder forward propagation can be expressed as formula (26):

其中，eX为输入数据。Among them, eX is the input data.

且，解码器前向传播的激活函数可以表示为公式(27)：And, the activation function of the decoder forward propagation can be expressed as formula (27):

其中，Re()为ReLU激活函数。Among them, Re() is the ReLU activation function.

进一步地，潜在空间映射的方式可以表示为公式(28)：Furthermore, the latent space mapping method can be expressed as formula (28):

其中，φ()为量子编码映射函数，L为编码器的最后一层。Among them, φ() is the quantum coding mapping function, and L is the last layer of the encoder.

In some preferred embodiments of the present invention, the mapping function φ() converts the encoded feature vector into a qubit representation, and the calculation can be expressed as formula (29):

Here, eW_φ is the weight of the mapping into the latent quantum space, eb_φ is the bias of the mapping into the latent quantum space, and the softmax function converts the encoder output into a probability distribution, which allows the qubit state to be represented through probability amplitudes.
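A minimal sketch of this mapping follows. It assumes that formula (29) applies the softmax to an affine transform of the encoder's last-layer output, which is consistent with the definitions of eW_φ and eb_φ but is not confirmed by the text. The names `map_to_latent`, `w_phi` and `b_phi` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def map_to_latent(h, w_phi, b_phi):
    """ASSUMED form of formula (29): softmax(eW_phi * h + eb_phi).

    h     -- encoder output of the last layer (list of floats)
    w_phi -- mapping weights, one row per latent component
    b_phi -- mapping biases, one per latent component
    """
    logits = [sum(w * x for w, x in zip(row, h)) + b
              for row, b in zip(w_phi, b_phi)]
    return softmax(logits)
```

The output is non-negative and sums to 1, so it can be read as a probability distribution over the latent components.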

进一步的，在潜在空间中，通过引入限制性优化策略来调整量子位的状态，确保在压缩数据的同时，关键特征得到有效保留。Furthermore, in the latent space, a restrictive optimization strategy is introduced to adjust the state of the quantum bit to ensure that key features are effectively retained while compressing the data.

具体的，潜在空间优化的方式可以表示为公式(30)：Specifically, the latent space optimization method can be expressed as formula (30):

eL_quantum = Ψ(eQ_mapped, eTargets) (30)

其中，Ψ()为量子态的优化函数，eTargets为训练目标，用于调整量子位以最小化重构误差。Among them, Ψ() is the optimization function of the quantum state, and eTargets is the training target, which is used to adjust the qubit to minimize the reconstruction error.

In some preferred embodiments of the present invention, the quantum state optimization function Ψ(eQ_mapped, eTargets) adjusts the qubits to minimize the reconstruction error by introducing a quantum state similarity measure. This embodiment uses the quantum Jensen-Shannon divergence, which can be expressed as formula (31):

Ψ(eQ_mapped, eTargets) = QJSD(eQ_mapped ∥ eTargets) (31)

Here, the QJSD function measures how far the quantum state eQ_mapped is from the target state eTargets; a smaller divergence indicates greater similarity.
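As an illustration of formula (31), the sketch below computes the quantum Jensen-Shannon divergence for the special case in which both states are diagonal in the same basis, for example probability distributions produced by a softmax. In that case the von Neumann entropy reduces to the Shannon entropy of the diagonal entries. Restricting to diagonal states and using base-2 logarithms are assumptions made for this example; general density matrices would need an eigenvalue decomposition.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def qjsd_diagonal(p, q):
    """QJSD for two states that are diagonal in the same basis.

    QJSD(p || q) = S((p + q) / 2) - S(p) / 2 - S(q) / 2,
    where S is the entropy of the diagonal entries.
    """
    mix = [(a + b) / 2.0 for a, b in zip(p, q)]
    return entropy_bits(mix) - 0.5 * entropy_bits(p) - 0.5 * entropy_bits(q)
```

Identical states give a divergence of 0, and two perfectly distinguishable states such as `[1.0, 0.0]` and `[0.0, 1.0]` give the maximum value of 1 bit.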

进一步的，通过解码器将潜在空间的数据重构回原始数据空间，计算重构误差，并根据误差通过反向传播算法更新网络参数。本发明采用异步策略对编码器和解码器的参数进行更新，以提高训练效率。Furthermore, the data in the latent space is reconstructed back to the original data space through the decoder, the reconstruction error is calculated, and the network parameters are updated according to the error through the back propagation algorithm. The present invention adopts an asynchronous strategy to update the parameters of the encoder and decoder to improve the training efficiency.

具体的，在反向传播与参数更新的过程中，重构误差计算的方式可以表示为公式(32)、(33)和(34)：Specifically, in the process of back propagation and parameter updating, the reconstruction error calculation method can be expressed as formulas (32), (33) and (34):

Here, L_dec is the last layer of the decoder, and Sig() is the Sigmoid activation function.
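The reconstruction step can be sketched as follows: the decoder's last layer applies a Sigmoid, and the result is compared with the original input. The text does not state which error measure formulas (32) to (34) use, so the mean squared error here is an assumption, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Sigmoid activation, Sig() in the text."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_last_layer(latent, w_dec, b_dec):
    """Last decoder layer: Sigmoid of an affine transform of its input."""
    return [sigmoid(sum(w * q for w, q in zip(row, latent)) + b)
            for row, b in zip(w_dec, b_dec)]


def reconstruction_error(original, reconstructed):
    """ASSUMED error measure: mean squared error over the features."""
    return sum((x - y) ** 2
               for x, y in zip(original, reconstructed)) / len(original)
```

A perfect reconstruction gives an error of 0; the error grows with the squared distance between the input and its reconstruction.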

进一步地，参数更新的方式可以表示为公式(35)和(36)：Furthermore, the parameter updating method can be expressed as formulas (35) and (36):

Here, η_m is the learning rate of the autoencoder.

Furthermore, the two gradients used in the parameter updates can be computed as expressed in formula (37):

Here, the decoder-side gradient follows a similar chain rule, taken from the decoder's perspective. In some preferred embodiments of the present invention, the learning rate η_m is set to 0.001.

重复执行迭代训练，直到达到预设的迭代次数。The iterative training is repeated until the preset number of iterations is reached.

在本发明一些较佳的实施例中，更新偏置的方式同理。In some preferred embodiments of the present invention, the method of updating the bias is similar.

The feature-extracted data is input into the feature dimensionality reduction model for training. The embodiment of the present invention proposes an autoencoder neural network algorithm based on latent quantum encoding to perform feature dimensionality reduction. Compared with a conventional autoencoder structure, the embodiment encodes the latent space with a qubit representation and uses the superposition and entanglement properties of qubits to enhance feature expression, thereby achieving more efficient data compression and feature representation. In addition, the embodiment introduces restrictive conditions to guide the learning of the latent space, so that the reconstructed data remains highly similar to the original data while the dimension-reduced data retains more key information.

通过上述步骤A1和A2，我们得到了一个既包含了原始数据重要特征，又经过了有效降维处理的训练数据集。这个数据集为后续的分类器训练提供了坚实的基础，有助于提高分类模型的性能和准确性。Through the above steps A1 and A2, we obtain a training data set that contains important features of the original data and has undergone effective dimensionality reduction. This data set provides a solid foundation for subsequent classifier training and helps improve the performance and accuracy of the classification model.

进一步的，现有模型训练方法在学习率调整和模型优化方面可能不够灵活，影响模型的训练效率和分类准确性。在本发明一些较佳的实施例中，将训练数据输入分类器中，得到最终分类模型包括步骤D1至D2：Furthermore, the existing model training methods may not be flexible enough in terms of learning rate adjustment and model optimization, which affects the training efficiency and classification accuracy of the model. In some preferred embodiments of the present invention, inputting the training data into the classifier to obtain the final classification model includes steps D1 to D2:

步骤D1，将训练数据分为训练集和验证集。Step D1, divide the training data into a training set and a validation set.

具体的，在训练分类模型之前，首先需要将训练数据分割成训练集和验证集。训练集用于实际训练分类器，而验证集用于在训练过程中验证模型的性能。分割的比例可以根据具体情况而定，常见的比例有70％训练集和30％验证集，或者使用交叉验证等更复杂的分割策略。分割数据的目的是在训练过程中评估模型的泛化能力，防止过拟合，并确保模型能够在未见过的数据上做出准确预测。Specifically, before training the classification model, the training data must first be split into a training set and a validation set. The training set is used to actually train the classifier, while the validation set is used to verify the performance of the model during the training process. The split ratio can be determined based on the specific situation. Common ratios include 70% training set and 30% validation set, or more complex split strategies such as cross-validation can be used. The purpose of splitting the data is to evaluate the generalization ability of the model during the training process, prevent overfitting, and ensure that the model can make accurate predictions on unseen data.
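The split in step D1 can be sketched as follows, using the 70%/30% ratio mentioned above. The function name, the fixed seed and the choice to shuffle before splitting are illustrative assumptions.

```python
import random


def split_train_val(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    val = [samples[i] for i in indices[cut:]]
    return train, val
```

With ten samples and the default fraction, seven go to the training set and three to the validation set, and no sample appears in both.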

Step D2: train the classifier on the training set and validate the trained classifier on the validation set until the loss function on the training set reaches a preset threshold, and determine the trained classifier as the final classification model; wherein, during training, the learning rate of the classifier is adjusted by an adaptive learning rate adjustment mechanism.

具体的，使用训练集中的数据来训练分类器。分类器可以是任何类型的机器学习模型，如决策树、支持向量机、神经网络等，具体选择取决于问题的性质和数据的特点。在训练过程中，使用验证集来验证分类器的性能。这通常在每个训练迭代或几个迭代后进行，以监控分类器在未知数据上的表现。训练继续进行，直到训练集对应的损失函数达到预设的阈值。损失函数是衡量模型预测与实际标签之间差异的指标，当损失降低到足够低的水平时，可以认为模型已经学习到了有效的模式。在训练过程中，采用自适应学习率调整机制来调整分类器的学习率。这种机制允许学习率在训练过程中根据模型的性能动态调整，有助于提高收敛速度并减少波动。一旦满足停止条件，如损失函数达到预设阈值或验证集上的准确率不再提升，训练完成的分类器就被确定为最终分类模型。Specifically, the classifier is trained using the data in the training set. The classifier can be any type of machine learning model, such as decision trees, support vector machines, neural networks, etc. The specific choice depends on the nature of the problem and the characteristics of the data. During the training process, the validation set is used to verify the performance of the classifier. This is usually done after each training iteration or several iterations to monitor the performance of the classifier on unknown data. Training continues until the loss function corresponding to the training set reaches a preset threshold. The loss function is an indicator that measures the difference between the model prediction and the actual label. When the loss is reduced to a sufficiently low level, it can be considered that the model has learned an effective pattern. During the training process, an adaptive learning rate adjustment mechanism is used to adjust the learning rate of the classifier. This mechanism allows the learning rate to be dynamically adjusted according to the performance of the model during the training process, which helps to increase the convergence speed and reduce fluctuations. Once the stopping condition is met, such as the loss function reaches a preset threshold or the accuracy on the validation set no longer improves, the trained classifier is determined as the final classification model.

在本发明一些较佳的实施例中，基于自适应学习率调整的概率神经网络分类模型地训练流程如下：In some preferred embodiments of the present invention, the training process of the probabilistic neural network classification model based on adaptive learning rate adjustment is as follows:

Initialize the weights and biases of the probabilistic neural network. Let p_θ denote the parameter set of the probabilistic neural network; the initialization assigns p_θ its initial value, where the subscript 0 denotes the initial state.

设置自适应学习率机制，基于模型在验证集上的表现，动态调整学习率。如果模型表现改善，则增加学习率以加快收敛；如果模型表现下降，则减小学习率以避免过度拟合。Set up an adaptive learning rate mechanism to dynamically adjust the learning rate based on the model's performance on the validation set. If the model performance improves, increase the learning rate to speed up convergence; if the model performance decreases, decrease the learning rate to avoid overfitting.
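The rule just described can be sketched as follows: the learning rate grows when validation performance improves and shrinks when it declines. The scaling factors `up` and `down` and the bounds `lr_min` and `lr_max` are hypothetical values chosen for illustration; the text does not specify them.

```python
def adapt_learning_rate(lr, prev_val_acc, curr_val_acc,
                        up=1.05, down=0.5, lr_min=1e-6, lr_max=1.0):
    """Adjust the learning rate from the change in validation accuracy.

    Improvement -> multiply by `up`; decline -> multiply by `down`;
    no change -> keep the rate. The result is clamped to [lr_min, lr_max].
    """
    if curr_val_acc > prev_val_acc:
        lr *= up
    elif curr_val_acc < prev_val_acc:
        lr *= down
    return min(max(lr, lr_min), lr_max)
```

In a training loop, this function would be called once per validation pass with the previous and current validation accuracy.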

During training, the training strategy of the probabilistic neural network is adjusted to account for the manifold structure of the data in the high-dimensional space, so as to better capture the intrinsic characteristics of the data. Specifically, let the input feature vector be p_x, which is transformed into a low-dimensional manifold space by the manifold mapping function M(p_x). The mapping can be expressed as formula (38):

M(p_x) = f_manifold(p_x; p_φ) (38)

Here, p_φ denotes the parameters of the manifold mapping function.

In some preferred embodiments of the present invention, the mapping function f_manifold() is defined in detail using an autoencoder structure. The encoding part and the decoding part of the autoencoder are each represented by a function with its own parameter set, where z is the encoded low-dimensional representation.

Furthermore, the calculation of the mapping function f_manifold(p_x; p_φ) can be expressed as formula (39):

进一步地，进行反复迭代训练。在每次迭代中，根据当前的学习率更新模型参数，并评估模型在训练集和验证集上的表现，重复此过程，直至模型在验证集上的性能不再显著提升，或达到预设的最大迭代次数。Furthermore, repeated iterative training is performed. In each iteration, the model parameters are updated according to the current learning rate, and the performance of the model on the training set and the validation set is evaluated. This process is repeated until the performance of the model on the validation set is no longer significantly improved, or the preset maximum number of iterations is reached.

Specifically, in the manifold space, the training objective of the probabilistic neural network is to minimize the loss function L_PLNN, which can be expressed as formula (40):

Here, p_y is the true label, its predicted counterpart is the label probability output by the probabilistic neural network, and i is the sample index.

Furthermore, a restart strategy monitors performance during training. If the performance falls below a preset threshold, the learning rate η and the update of the parameters p_θ are adjusted dynamically, as expressed in formulas (41) and (42):

η^(t+1) = η^(t)·γ_restart (41)

Here, γ_restart is the learning rate adjustment factor, usually less than 1, and the gradient term in formula (42) is the gradient of the loss function.
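Formula (41) can be sketched as a single rule applied after each validation pass. The threshold and the default value of `gamma_restart` are hypothetical. The parameter update of formula (42) is not reproduced in the text and is therefore left out.

```python
def restart_learning_rate(lr, val_accuracy, threshold, gamma_restart=0.5):
    """Formula (41): scale the learning rate by gamma_restart on a restart.

    A restart is triggered when validation accuracy falls below the
    preset threshold; otherwise the learning rate is unchanged.
    """
    if val_accuracy < threshold:
        return lr * gamma_restart
    return lr
```

With `gamma_restart` below 1, each triggered restart lowers the learning rate, in line with the statement that the factor is usually less than 1.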

在本发明一些较佳的实施例中，性能的评价方式采用在验证集上的精度作为评价指标。In some preferred embodiments of the present invention, the performance is evaluated by using the accuracy on the validation set as an evaluation indicator.

选择在验证集上表现最佳的模型作为最终模型，用于新样本的分类任务。最终模型输出为公式(43)：The model with the best performance on the validation set is selected as the final model for the classification task of new samples. The final model output is formula (43):

其中，softmax()为Softmax函数，用于将神经网络的输出转换为概率分布。Among them, softmax() is the Softmax function, which is used to convert the output of the neural network into a probability distribution.

In some preferred embodiments of the present invention, the predicted label probability is computed from the manifold-mapped features and the probabilistic neural network parameters p_θ through a Softmax layer. Specifically, let h(z_i; p_θ) be the output layer of the probabilistic neural network. After receiving the mapped feature z_i, the output layer computes a linear transformation, i.e., formula (44):

h(z_i; p_θ) = Wz_i + b (44)

其中，W和b是概率神经网络输出层的权重和偏置。Where W and b are the weights and biases of the output layer of the probabilistic neural network.

进一步地，预测概率通过应用Softmax函数计算得到，可以表示为公式(45)：Furthermore, the predicted probability is calculated by applying the Softmax function and can be expressed as formula (45):
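The output layer of formula (44), followed by a Softmax, can be sketched as follows. The cross-entropy loss shown is an assumed form of L_PLNN in formula (40), which the text describes only in terms of true labels and predicted label probabilities. All function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(z, W, b):
    """Formula (44) followed by Softmax: softmax(W z + b)."""
    logits = [sum(w * x for w, x in zip(row, z)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)


def cross_entropy(one_hot_labels, probs, eps=1e-12):
    """ASSUMED loss: mean cross-entropy between labels and probabilities."""
    total = 0.0
    for y, p in zip(one_hot_labels, probs):
        total -= sum(yk * math.log(max(pk, eps)) for yk, pk in zip(y, p))
    return total / len(one_hot_labels)
```

The predicted pollution class for a sample is then the index of the largest probability returned by `predict_proba`.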

通过上述步骤D1和D2，我们得到了一个经过充分训练和验证的分类模型，该模型能够准确地预测土壤样本的污染程度。这种结合了数据预处理、特征提取、降维处理和自适应学习率调整的机器学习流程，不仅提高了模型的性能，还保证了模型的稳定性和可靠性。Through the above steps D1 and D2, we obtained a fully trained and validated classification model that can accurately predict the contamination level of soil samples. This machine learning process that combines data preprocessing, feature extraction, dimensionality reduction, and adaptive learning rate adjustment not only improves the performance of the model, but also ensures the stability and reliability of the model.

将降维后地数据输入到分类器中，本发明实施例提出一种基于自适应学习率调整的概率神经网络分类模型，与传统的概率神经网络不同，基于自适应学习率调整的概率神经网络不仅考虑了数据的流形结构，还引入自适应学习率调整机制，能够根据模型在训练过程中的表现动态调整学习率，从而优化模型训练过程，提高分类准确性，并加快收敛速度。The reduced-dimensional data is input into a classifier. An embodiment of the present invention proposes a probabilistic neural network classification model based on adaptive learning rate adjustment. Different from a traditional probabilistic neural network, the probabilistic neural network based on adaptive learning rate adjustment not only considers the manifold structure of the data, but also introduces an adaptive learning rate adjustment mechanism, which can dynamically adjust the learning rate according to the performance of the model during the training process, thereby optimizing the model training process, improving classification accuracy, and accelerating convergence speed.

The embodiment of the present invention provides a soil contamination detection method based on artificial intelligence. Data expansion uses a generative adversarial network based on variational quantum encoding, which draws on quantum computing to strengthen the generative model. A variational quantum algorithm constructs a high-dimensional latent space that gives data generation a rich representational basis; this improves the authenticity and diversity of the expanded data, and the more realistic and diverse generated data helps improve the generalization ability and accuracy of the model. A neural network algorithm with strategy-constrained submodular function optimization is applied to feature extraction; the strategy constraints optimize the extraction process so that the extracted features are efficient and retain the core information of the data. In addition, a sparsity constraint is introduced during feature extraction to reduce the dimension of the feature space and improve the interpretability of the model, so that the algorithm extracts features highly relevant to the soil contamination detection task and avoids interference from irrelevant features.
An autoencoder based on latent quantum encoding performs feature dimensionality reduction, using the superposition and entanglement properties of qubits to enhance feature expression; it retains more key information during dimensionality reduction and thereby achieves efficient data compression. A probabilistic neural network classification model based on adaptive learning rate adjustment dynamically adjusts the learning rate according to the model's performance during training, which optimizes the training process, accelerates convergence and improves classification accuracy.

实施例三Embodiment 3

On the basis of the above embodiments, an embodiment of the present invention provides a soil contamination detection device based on artificial intelligence. Referring to FIG. 3, a schematic structural diagram of the device, the device includes:

训练数据获取模块310，用于获取样本土壤的样本数据，对样本数据进行标注；其中，标注的类别表征样本土壤的污染程度。The training data acquisition module 310 is used to acquire sample data of sample soil and label the sample data; wherein the labeled category represents the pollution degree of the sample soil.

数据扩充模块320，用于通过基于变分量子编码的生成对抗网络对标注后的样本数据进行数据扩充，得到扩充数据。The data expansion module 320 is used to perform data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data.

数据处理模块330，用于将扩充数据进行数据处理，得到训练数据；其中，数据处理包括：特征提取处理和特征降维处理。The data processing module 330 is used to process the expanded data to obtain training data; wherein the data processing includes: feature extraction processing and feature dimension reduction processing.

数据分类模块340，用于将训练数据输入分类器中，得到最终分类模型；其中，分类器为基于自适应学习率调整的概率神经网络分类器。The data classification module 340 is used to input the training data into the classifier to obtain the final classification model; wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment.

目标数据获取模块350，用于获取目标土壤对应的目标数据后，将目标数据进行数据处理，得到特征降为后的目标数据。The target data acquisition module 350 is used to acquire the target data corresponding to the target soil, and then process the target data to obtain the target data after feature reduction.

数据评估模块360，用于将特征降为后的目标数据输入最终分类模型中，得到目标土壤对应的污染程度。The data evaluation module 360 is used to input the target data after feature reduction into the final classification model to obtain the pollution degree corresponding to the target soil.

In some preferred embodiments of the present invention, the data expansion module 320 is used to: construct a first quantum latent space through variational quantum encoding; have the generator of the generative adversarial network generate data based on features collected in the first quantum latent space; have the discriminator of the generative adversarial network distinguish the data generated by the generator from real data; and iteratively optimize the parameters of the generator and the parameters of the discriminator through adversarial training until the diversity of the data generated by the generator meets a preset diversity condition.

In some preferred embodiments of the present invention, the data processing module 330 is used to: perform feature extraction on the expanded data with a neural network model based on submodular function optimization to obtain feature-extracted data, wherein the optimization process of the submodular function is subject to strategy constraints; and input the feature-extracted data into the feature dimensionality reduction model to obtain dimension-reduced data, and determine the dimension-reduced data as the training data, wherein the feature dimensionality reduction model is determined based on an autoencoder neural network algorithm with latent quantum encoding.

In some preferred embodiments of the present invention, the data processing module 330 is used to: initialize the parameters of the neural network model; set the strategy constraints of the optimization process, wherein the strategy constraint function is defined in terms of the number of constraints, the weight of each constraint and the calculation formula of each constraint; and iterate the training operation until a preset maximum number of iterations is reached. The training operation includes: performing feature extraction on the input data with the neural network model, wherein the objective function of the feature extraction is defined in terms of the loss function of the input data, the sparsity constraint term, the hyperparameter that adjusts sparsity and the hyperparameter that sets the influence of the strategy constraints; and evaluating the validity of the feature-extracted data and adjusting the strategy constraints and the parameters of the neural network model based on the evaluation result, wherein the evaluation function is defined in terms of the number of evaluation metrics and the calculation formula of each evaluation metric.

In some preferred embodiments of the present invention, the data processing module 330 is used to adjust the parameters of the neural network model based on the dynamic feature-aware network adjustment mechanism; wherein the adjustment function is defined in terms of the current parameters of the neural network model, the gradient of the evaluation function with respect to the current parameters, the network structure adjustment function, the learning rate and the dynamic adjustment factor; the gradient of the evaluation function with respect to the current parameters is defined in terms of the extracted feature values, the target feature values and the total number of features; and the network structure adjustment function is defined in terms of the hyperparameter controlling the magnitude of the network adjustment, the hyperparameter controlling the sensitivity of the network and the optimization direction for the generator.

In some preferred embodiments of the present invention, the data processing module 330 is used to: initialize the parameters of the feature dimensionality reduction model; and iteratively perform the data dimensionality reduction operation until a preset maximum number of iterations is reached. The data dimensionality reduction operation includes: propagating the original data input to the feature dimensionality reduction model forward through the encoder into the second quantum latent space to obtain qubit data; reconstructing the qubit data back into the original space with the decoder to obtain reconstructed data; and updating the parameters of the encoder and the parameters of the decoder with an asynchronous strategy based on the reconstructed data and the original data; wherein the state of the qubits is adjusted based on a restrictive optimization strategy.

在本发明一些较佳的实施例中，数据分类模块340，用于将训练数据分为训练集和验证集；基于训练集训练分类器，基于验证集验证训练后的分类器，直至训练集对应的损失函数达到预设的阈值，将训练完成的分类器确定为最终分类模型；其中，在训练过程中基于自适应学习率调整机制调整分类器的学习率。In some preferred embodiments of the present invention, the data classification module 340 is used to divide the training data into a training set and a validation set; train the classifier based on the training set, and validate the trained classifier based on the validation set until the loss function corresponding to the training set reaches a preset threshold, and determine the trained classifier as the final classification model; wherein, during the training process, the learning rate of the classifier is adjusted based on an adaptive learning rate adjustment mechanism.

所属领域的技术人员可以清楚地了解到，为描述的方便和简洁，上述描述的基于人工智能的土壤污染度检测装置的具体工作过程，可以参考前述的基于人工智能的土壤污染度检测方法的实施例中的对应过程，在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and brevity of description, the specific working process of the soil contamination detection device based on artificial intelligence described above can refer to the corresponding process in the embodiment of the aforementioned soil contamination detection method based on artificial intelligence, and will not be repeated here.

Embodiment 4

An embodiment of the present invention further provides an electronic device for running the artificial intelligence-based soil contamination detection method. Referring to FIG. 4, a schematic structural diagram of an electronic device provided by an embodiment of the present invention, the electronic device includes a memory 400 and a processor 401, wherein the memory 400 is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor 401 to implement the above artificial intelligence-based soil contamination detection method.

Furthermore, the electronic device shown in FIG. 4 further includes a bus 402 and a communication interface 403; the processor 401, the communication interface 403 and the memory 400 are connected via the bus 402.

The memory 400 may include a high-speed random access memory (RAM), and may also include a non-volatile memory, such as at least one disk storage. The communication connection between the system network element and at least one other network element is realized through at least one communication interface 403 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, etc. The bus 402 may be an ISA bus, a PCI bus, an EISA bus, etc. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one bidirectional arrow is shown in FIG. 4, but this does not mean that there is only one bus or one type of bus.

The processor 401 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 401 or by instructions in the form of software. The processor 401 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 400; the processor 401 reads the information in the memory 400 and completes the steps of the method of the foregoing embodiments in combination with its hardware.

An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions. When called and executed by a processor, the computer-executable instructions cause the processor to implement the above artificial intelligence-based soil contamination detection method. For the specific implementation, refer to the method embodiments, which are not repeated here.

The computer program product of the artificial intelligence-based soil contamination detection method, device and electronic device provided in the embodiments of the present invention includes a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the method in the foregoing method embodiments. For the specific implementation, refer to the method embodiments, which are not repeated here.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the system and/or device described above may refer to the corresponding process in the foregoing method embodiments, and is not repeated here.

In addition, in the description of the embodiments of the present invention, unless otherwise clearly specified and limited, the terms "installed", "coupled" and "connected" should be understood in a broad sense. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, indirect through an intermediate medium, or an internal communication between two components. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A soil contamination detection method based on artificial intelligence, characterized by comprising:

Acquiring sample data of sample soil, and labeling the sample data; wherein the labeled category represents the degree of contamination of the sample soil;

Performing data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data;

Processing the expanded data to obtain training data; wherein the data processing includes: feature extraction processing and feature dimension reduction processing;

Inputting the training data into a classifier to obtain a final classification model; wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment;

After acquiring target data corresponding to target soil, performing the data processing on the target data to obtain the target data after feature reduction;

Inputting the target data after feature reduction into the final classification model to obtain the pollution degree corresponding to the target soil.

2. The soil contamination detection method based on artificial intelligence according to claim 1, characterized in that the step of training the generative adversarial network based on variational quantum coding comprises:

Constructing a first quantum latent space through variational quantum coding;

The generator of the generative adversarial network generates data based on features collected in the first quantum latent space;

The discriminator of the generative adversarial network distinguishes the difference between the data generated by the generator and the real data;

The parameters of the generator and the parameters of the discriminator are iteratively optimized through adversarial training until the diversity of the data generated by the generator meets the preset diversity conditions.

3. The soil contamination detection method based on artificial intelligence according to claim 2, characterized in that, in the adversarial training, data generation is constrained by an incremental loss function; wherein the incremental loss function is determined by the loss function of the generator, a first adjustment coefficient of the penalty term in the iterative process, and a second adjustment coefficient of the penalty term in the iterative process.

4. The soil contamination detection method based on artificial intelligence according to claim 2, characterized in that the diversity of the data generated by the generator is evaluated based on the generated data, the mean of the generated data, the number of generated data and the dimension of the generated data.

5. The soil contamination detection method based on artificial intelligence according to claim 1, characterized in that the step of processing the expanded data to obtain training data comprises:

Extracting features from the expanded data through a neural network model optimized based on a submodular function to obtain feature-extracted data; wherein the optimization process of the submodular function is constrained based on a strategy;

Inputting the feature-extracted data into a feature dimension reduction model to obtain dimension-reduced data, and determining the dimension-reduced data as the training data; wherein the feature dimension reduction model is determined based on an autoencoder neural network algorithm with latent quantum coding.

6. The soil contamination detection method based on artificial intelligence according to claim 5, characterized in that the step of training the neural network model based on submodular function optimization comprises:

Initializing parameters of the neural network model;

Setting strategy constraints in the optimization process; wherein the strategy constraint function of the strategy constraints is determined by the number of constraints, the weight corresponding to each constraint, and the calculation formula of each constraint;

Iteratively performing a training operation until a preset maximum number of iterations is reached; the training operation includes: extracting features from the input data based on the neural network model, wherein the objective function of the feature extraction is determined by the loss function on the input data, a sparsity constraint term, a hyperparameter for adjusting the sparsity, and a hyperparameter for adjusting the influence of the strategy constraints; and evaluating the validity of the feature-extracted data, and adjusting the strategy constraint conditions and the parameters of the neural network model based on the evaluation results, wherein the evaluation function is determined by the number of evaluation indices and the calculation formula of each evaluation index.

7. The soil contamination detection method based on artificial intelligence according to claim 6, characterized in that the step of adjusting the parameters of the neural network model based on the evaluation results comprises:

Adjusting the parameters of the neural network model based on a dynamic feature-aware network adjustment mechanism; wherein the adjustment function is determined by the current parameters of the neural network model, the gradient of the evaluation function with respect to the current parameters, a network structure adjustment function, the learning rate and a dynamic adjustment factor; the gradient of the evaluation function with respect to the current parameters is determined by the extracted feature values, the target feature values and the total number of features; the network structure adjustment function is determined by a hyperparameter for controlling the network adjustment amplitude, a hyperparameter for controlling the network sensitivity, and the optimization direction for the generator.

8. The soil contamination detection method based on artificial intelligence according to claim 5, characterized in that the step of training the feature dimension reduction model comprises:

Initializing parameters of the feature dimensionality reduction model;

Iteratively performing a data dimension reduction operation until a preset maximum number of iterations is reached; the data dimension reduction operation includes: forward-propagating the original data input into the feature dimension reduction model through an encoder to a second quantum latent space to obtain quantum bit data; mapping the quantum bit data back to the original space through a decoder to obtain reconstructed data; and updating the parameters of the encoder and the parameters of the decoder using an asynchronous strategy based on the reconstructed data and the original data; wherein the state of the quantum bits is adjusted based on a restrictive optimization strategy.

9. The soil contamination detection method based on artificial intelligence according to claim 1, characterized in that the step of inputting the training data into the classifier to obtain the final classification model comprises:

Dividing the training data into a training set and a validation set;

Training the classifier based on the training set, and validating the trained classifier based on the validation set until the loss function corresponding to the training set reaches a preset threshold, and determining the trained classifier as the final classification model; wherein, during the training process, the learning rate of the classifier is adjusted based on an adaptive learning rate adjustment mechanism.

10. A soil contamination detection device based on artificial intelligence, characterized by comprising:

A training data acquisition module, used to acquire sample data of sample soil and to label the sample data; wherein the labeled category represents the degree of contamination of the sample soil;

A data expansion module, used to perform data expansion on the labeled sample data through a generative adversarial network based on variational quantum coding to obtain expanded data;

A data processing module, used to process the expanded data to obtain training data; wherein the data processing includes: feature extraction processing and feature dimension reduction processing;

A data classification module, used for inputting the training data into a classifier to obtain a final classification model; wherein the classifier is a probabilistic neural network classifier based on adaptive learning rate adjustment;

A target data acquisition module, used for acquiring target data corresponding to the target soil, and then performing the data processing on the target data to obtain target data after feature reduction;

A data evaluation module, used to input the target data after feature reduction into the final classification model to obtain the pollution degree corresponding to the target soil.