CN117522066A - Combined optimization method and system based on peak shaving power supply equipment combination prediction

Info

Publication number: CN117522066A
Application number: CN202311580471.XA
Authority: CN
Inventors: 荆岫岩; 郝峰; 黄坤; 金一川; 王子琪; 孙建业; 方冰; 范强; 许永鹏; 严英杰; 刘亚东; 江秀臣
Original assignee: State Grid Xinyuan Group Co Ltd; Shaoxing Power Supply Co of State Grid Zhejiang Electric Power Co Ltd; Shanghai Jiao Tong University; State Grid Corp of China SGCC
Current assignee: State Grid Xinyuan Group Co Ltd; Shaoxing Power Supply Co of State Grid Zhejiang Electric Power Co Ltd; Shanghai Jiao Tong University; State Grid Corp of China SGCC
Priority date: 2023-11-24
Filing date: 2023-11-24
Publication date: 2024-02-06

Abstract

The invention discloses a joint optimization method and system based on combined prediction for peak-shaving power supply equipment, and relates to the technical field of hydropower unit condition monitoring. According to the different characteristics of hydropower unit parameters, an ARIMA model, a random forest model, and an LSTM neural network model are established to predict the parameters, and their outputs are combined by weighted averaging to obtain the predicted parameters. The predicted parameters are then input into a support vector machine (SVM) model and an XGBoost model for fault classification and early warning of the hydropower unit, and a weighted evidence theory method is used to fuse the two early-warning results into the final fault classification and early-warning result. By computing with and fusing multiple models, the method overcomes the insufficient classification and early-warning accuracy and low reliability of a single model, and achieves accurate and reliable parameter prediction and status early warning for peak-shaving power supply equipment, especially hydropower equipment, providing strong support for the safe and stable operation of power equipment.

Description

A joint optimization method and system based on peak-shaving power equipment combination prediction

Technical Field

The invention belongs to the technical field of hydropower unit condition monitoring, and specifically relates to a joint optimization method and system based on combined prediction for peak-shaving power supply equipment.

Background

Hydropower units are among the key equipment of a power system, and their operating status determines whether the grid can supply power reliably. Analyzing them with reliable condition monitoring technology, predicting their future operating status, obtaining fault information in time, stopping accidents from spreading, and avoiding major power accidents therefore ensures the normal operation of hydropower units and is of great significance for peak-shaving power supply in the grid.

However, traditional hydropower unit diagnosis methods rely on theoretical knowledge of the equipment and on human experience; they generalize poorly, are inefficient, and cannot effectively predict the status of hydropower units or issue timely warnings. Condition monitoring methods based on data mining fall into three main categories. Statistical analysis methods are limited by statistical analysis theory, reach high accuracy only on specific data sets, and generalize poorly. Signal processing methods characterize the system state by extracting time-frequency domain feature parameters, but the time-frequency transformation is prone to information loss. Many equipment monitoring methods are based on machine learning, but a single prediction model generalizes poorly in monitoring and classification, and its predictions lack reliability and credibility. Consequently, neither the manual methods nor the data-mining-based methods above are well suited to accurate and reliable prediction of the operating status of hydropower units across a wide range of applications.

Therefore, how to provide a multi-model combined method for predicting the operating status of hydropower units, so as to improve the reliability and credibility of the prediction, is a technical problem that those skilled in the art urgently need to solve.

Summary of the Invention

In view of this, the present invention provides a joint optimization method and system based on combined prediction for peak-shaving power supply equipment, which uses data mining to realize parameter prediction and status early warning for peak-shaving power supply equipment, especially hydropower units, thereby improving the reliability and credibility of hydropower unit operating status prediction.

To achieve the above object, the present invention provides the following technical solutions:

The invention discloses a joint optimization method based on combined prediction for peak-shaving power supply equipment, with the following specific steps:

Obtain the internal working parameters, external environmental parameters, and grid load parameters of the hydropower unit, and construct a data table;

Determine an initial data set from the data table;

According to the linear, nonlinear, and complex characteristics of the data in the initial data set, construct an ARIMA model, a random forest model, and an LSTM neural network model for parameter prediction; calculate combination weights and take the weighted average of the predicted parameters to obtain the final output, which serves as the prediction data set;

Define multiple different types of hydropower unit faults as a fault space and, using the initial data set, build a support vector machine (SVM) model and an XGBoost model, optimizing the parameters of both models with an optimization algorithm; classify and issue early warnings for hydropower unit faults according to the prediction data set and the fault space, obtaining an initial early-warning result from each model; fuse the initial early-warning results and output the final status early-warning result.

Further, the internal working parameters include: the inlet water pressure, vibration, and swing coefficient data of the turbine, the AC impedance and magnetic pole coefficient data of the generator rotor, and the oil temperature and gas concentration data of the transformer; the external environmental parameters include: air temperature, ground temperature, relative humidity, and average wind speed data; the grid load parameters include: operating voltage, active power, and reactive power. The data table is then constructed according to the acquisition time of each parameter.

Further, determining the initial data set includes: first, determining the proportion of missing values in the data table relative to the total sample size; when the proportion is greater than a threshold, imputation is used to fill in the missing values in the data set, and when it is less than the threshold, the missing values are deleted directly; next, outliers are processed; then, the data are standardized with the Z-score method; finally, features are extracted with the out-of-bag (OOB) data feature permutation method to obtain the initial data set.

Further, constructing the ARIMA model, the random forest model, and the LSTM neural network model for parameter prediction includes:

Select the parameters with linear characteristics in the initial data set, construct and train an ARIMA model, and use the trained ARIMA model for parameter prediction; select the parameters with nonlinear characteristics in the initial data set, construct and train a random forest model, and use the trained random forest model for parameter prediction; select the parameters with complex characteristics in the initial data set, construct and train an LSTM neural network model, and use the trained LSTM neural network model for parameter prediction;

Calculating the combination weights and taking the weighted average of the predicted parameters to obtain the final output includes:

The weights of the ARIMA model, the random forest model, and the LSTM neural network model are assigned with a goodness-of-fit algorithm. First, the coefficient of determination p of the goodness-of-fit algorithm is calculated; p normally takes values in [0, 1] and is computed as:

[Equation image not reproduced in the source text.]

where $\hat{z}_i$ is the predicted value of the i-th parameter, $\bar{z}$ is the mean of the true values, $z_i$ is the true value of the i-th parameter, and n is the total number of predicted parameters;

Then, the coefficient of determination p is transformed with the tangent function, which gives the optimized formula for the combination weights:

[Equation image not reproduced in the source text.]

where $w_j$ is the combination weight of the j-th sub-model, $z_j$ is the value of p for the j-th sub-model, and $h(z_j)$ is the value of p for the j-th sub-model after transformation with the tangent function;

According to the weights $w_j$, the weighted average of the parameters predicted by the ARIMA model, the random forest model, and the LSTM neural network model is calculated to obtain the final output, which serves as the prediction data set.
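
The goodness-of-fit weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two formulas are not reproduced in this text, so the sketch takes p to be the standard coefficient of determination (1 minus the ratio of residual to total sum of squares), uses a plain `math.tan` as a hypothetical form of the tangent transform h, and normalizes the transformed values so the weights sum to 1. It also assumes the three sub-models forecast the same parameter; all names and sample values are made up.

```python
import math

def r_squared(actual, predicted):
    """Coefficient of determination p, assumed here to be 1 - SSE/SST."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst

def combination_weights(p_values, h=math.tan):
    """Transform each p with h (hypothetical tangent form) and normalize.

    p is clipped into [0, 1], the normal range stated in the text.
    """
    transformed = [h(min(max(p, 0.0), 1.0)) for p in p_values]
    total = sum(transformed)
    return [t / total for t in transformed]

def combine(predictions, weights):
    """Weighted average of the sub-model forecasts, one time step at a time."""
    return [sum(w * y for w, y in zip(weights, step)) for step in zip(*predictions)]

# Made-up validation data for one parameter and three sub-model forecasts.
actual = [1.0, 2.0, 3.0, 4.0]
forecasts = {
    "arima": [1.1, 2.1, 2.9, 4.2],
    "random_forest": [0.8, 2.4, 3.3, 3.6],
    "lstm": [1.0, 1.9, 3.1, 4.0],
}
p_values = [r_squared(actual, f) for f in forecasts.values()]
weights = combination_weights(p_values)
combined = combine(list(forecasts.values()), weights)
```

Because the tangent is convex on [0, 1], sub-models with a higher p receive a disproportionately larger weight, which matches the stated aim of amplifying differences in prediction quality.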

Further, the ARIMA model combines an autoregressive model and a moving average model with a differencing operation, and its initialization coefficients include the number of autoregressive terms and the number of moving average terms; the random forest model uses Bootstrap sampling to construct each base decision tree; the LSTM neural network model is optimized with the Adam stochastic gradient descent algorithm, and its loss function is the cross-entropy function.

Further, the fault space Θ{F1, F2, …, F10} serves as the fault identification framework of the hydropower unit;

The SVM model is constructed by first using grid search to roughly select the penalty parameter c and the kernel function parameter g, and then using K-fold cross-validation to select the optimal c and g as the optimal parameters of the SVM model;

The XGBoost model is constructed by using an adaptive particle swarm optimization algorithm to iteratively update the number of base learners, the learning rate, and the maximum tree depth of the XGBoost model, thereby determining the optimal parameters;

The initial early-warning results are obtained as follows: the optimized SVM model and XGBoost model take the prediction data set, i.e. the final output of the parameter prediction model, as input and classify and issue early warnings for hydropower unit faults, yielding two initial early-warning results;

The final status early-warning result is the hydropower unit status early-warning result obtained by fusing the two initial early-warning results with the weighted evidence method.

Further, the K-fold cross-validation method is specifically: divide the data in the initial data set into k subsets, of which k-1 subsets serve as training subsets and 1 subset serves as the test subset; select a pair of values of c and g, train the SVM model, and calculate the mean squared error (MSE) of the results on the test subset; calculate the MSE for multiple different pairs of c and g, and select the optimal c and g according to the minimum-MSE principle.
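
The grid search with K-fold cross-validation can be illustrated as follows. This is a minimal sketch, not the patented implementation: to stay self-contained it replaces the SVM with a hypothetical toy model (`fit_toy_model`, an RBF kernel smoother in which g plays the role of the kernel parameter and 1/c acts as a damping term), and it assumes the usual practice of rotating the held-out subset over all k folds and averaging the MSE. The grids and the data are made up.

```python
import math

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k disjoint, nearly equal folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]

def fit_toy_model(xs, ys, c, g):
    """Hypothetical stand-in for SVM training: an RBF kernel smoother.

    g sets the kernel width; the 1/c term damps predictions, loosely
    mimicking the regularizing effect of a small penalty parameter.
    """
    def predict(x):
        weights = [math.exp(-g * (x - xi) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(weights, ys)) / (sum(weights) + 1.0 / c)
    return predict

def cv_mse(xs, ys, c, g, k=5):
    """MSE of the pair (c, g), averaged over the k held-out folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit_toy_model(train_x, train_y, c, g)
        errors = [(predict(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

def grid_search(xs, ys, c_grid, g_grid, k=5):
    """Return the (c, g) pair with the smallest cross-validated MSE."""
    candidates = [(c, g) for c in c_grid for g in g_grid]
    return min(candidates, key=lambda cg: cv_mse(xs, ys, cg[0], cg[1], k))

# Made-up one-dimensional data.
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) for x in xs]
best_c, best_g = grid_search(xs, ys, c_grid=[0.1, 1, 10, 100], g_grid=[0.1, 1, 10])
```

In practice the coarse grid would typically be followed by a finer grid around the best pair, in line with the "rough selection, then refinement" order described in the text.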

Further, the adaptive particle swarm optimization algorithm is specifically:

Determine the inertia weight w of the adaptive particle swarm optimization algorithm, with the formula:

[Equation image not reproduced in the source text.]

where $w_{\max}$ and $w_{\min}$ are the maximum and minimum values of the inertia weight, k is the current iteration number, and $k_{\max}$ is the maximum number of iterations;

According to the degree of evolution and the degree of aggregation of the particle swarm, the inertia weight of the adaptive particle swarm optimization algorithm is updated dynamically. The improved adaptive inertia weight update formula is:

$$w^{*} = w_{in} - (w_{in} - w_{\min}) \cdot evol + (w_{\max} - w_{in}) \cdot aggr$$

where $w^{*}$ is the adaptive inertia weight, $w_{in}$ is the initial inertia weight, and evol and aggr are the degree of evolution and the degree of aggregation, respectively.
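
The two inertia-weight rules can be sketched as below. The adaptive update follows the formula stated in the text. The first rule is an assumption: its formula is not reproduced here, so the sketch uses the common linearly decreasing schedule, which is consistent with the listed symbols. The default bounds 0.9 and 0.4 are hypothetical, and evol and aggr are simply passed in as numbers in [0, 1], since their definitions are not given in this section.

```python
def linear_inertia(k, k_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight (assumed form of the first formula)."""
    return w_max - (w_max - w_min) * k / k_max

def adaptive_inertia(w_in, evol, aggr, w_max=0.9, w_min=0.4):
    """Adaptive inertia weight w* = w_in - (w_in - w_min)*evol + (w_max - w_in)*aggr."""
    return w_in - (w_in - w_min) * evol + (w_max - w_in) * aggr

# Halfway through a 100-iteration run, with made-up swarm statistics.
w_in = linear_inertia(k=50, k_max=100)
w_star = adaptive_inertia(w_in, evol=0.2, aggr=0.5)
```

With these rules, a strongly evolving swarm (evol near 1) pulls the weight toward its minimum to favor local refinement, while a tightly aggregated swarm (aggr near 1) pushes it toward its maximum to restore exploration.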

Further, the weighted evidence method is specifically:

For each type of hydropower unit fault in the two initial early-warning results, define two bodies of evidence to be combined, E1 and E2, with corresponding basic probability assignment functions m1 and m2 and corresponding focal elements Ap and Bq; the common focal element of Ap and Bq is C, and the conflict coefficient between m1 and m2 is H. The fusion formula is then:

[Equation image not reproduced in the source text.]

where the term whose symbol is not reproduced here denotes the average degree of support of the evidence sources for the focal element.

With the fusion formula, the early-warning results for the different types of hydropower unit faults are fused one by one.
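
As a rough illustration of evidence fusion with a conflict coefficient and an average-support term, the sketch below uses one known weighted variant of Dempster-Shafer combination: m(C) = Σ m1(Ap)·m2(Bq) over Ap ∩ Bq = C, plus H·q(C), where H is the total conflicting mass and q(C) is the mean of m1(C) and m2(C). This rule is an assumption chosen because it matches the quantities named in the text; the fusion formula itself is not reproduced here and may differ. The sketch further assumes singleton fault hypotheses and input assignments that each sum to 1; the sample masses are made up.

```python
def fuse(m1, m2):
    """Fuse two basic probability assignments over singleton fault hypotheses.

    Assumed rule: m(C) = m1(C) * m2(C) + H * q(C), with H the conflicting
    mass and q(C) the average support of the two sources for C.
    """
    faults = sorted(set(m1) | set(m2))
    agreement = {c: m1.get(c, 0.0) * m2.get(c, 0.0) for c in faults}
    conflict = 1.0 - sum(agreement.values())  # H, valid when m1 and m2 each sum to 1
    return {
        c: agreement[c] + conflict * (m1.get(c, 0.0) + m2.get(c, 0.0)) / 2
        for c in faults
    }

# Made-up initial early-warning results from the two classifiers.
svm_bpa = {"F1": 0.7, "F2": 0.2, "F3": 0.1}
xgb_bpa = {"F1": 0.6, "F2": 0.1, "F3": 0.3}
fused = fuse(svm_bpa, xgb_bpa)
final_fault = max(fused, key=fused.get)
```

Redistributing the conflicting mass in proportion to average support keeps the fused assignment normalized without the division used in the classical Dempster rule, which can behave poorly when the two sources disagree strongly.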

Further, the method also includes evaluating the status early-warning performance for the hydropower unit by calculating averaged accuracy metrics, namely: average accuracy, precision, recall, and the F1-score (the harmonic mean of precision and recall).
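
These metrics can be computed directly from the true and predicted fault labels. The sketch below is illustrative: the text does not say how the per-class values are averaged, so macro-averaging over fault classes is an assumption, and the labels are made up.

```python
def classification_report(y_true, y_pred):
    """Return accuracy, per-class (precision, recall, F1), and macro averages."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = (precision, recall, f1)
    # Macro average: unweighted mean of each metric over the fault classes.
    macro = tuple(sum(v[i] for v in per_class.values()) / len(labels) for i in range(3))
    return accuracy, per_class, macro

y_true = ["F1", "F1", "F2", "F2", "F3", "F3"]
y_pred = ["F1", "F2", "F2", "F2", "F3", "F1"]
accuracy, per_class, macro = classification_report(y_true, y_pred)
```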

The invention also discloses a joint optimization system based on combined prediction for peak-shaving power supply equipment, including:

Data acquisition module: obtains the internal working parameters, external environmental parameters, and grid load parameters of the hydropower unit and constructs a data table;

Data preprocessing module: determines the initial data set from the data table;

Parameter prediction module: according to the linear, nonlinear, and complex characteristics of the data in the initial data set, constructs an ARIMA model, a random forest model, and an LSTM neural network model for parameter prediction; calculates combination weights and takes the weighted average of the predicted parameters to obtain the final output, which serves as the prediction data set;

Status early-warning module: defines multiple different types of hydropower unit faults as a fault space and, using the initial data set, builds an SVM model and an XGBoost model, optimizing the parameters of both models with an optimization algorithm; classifies and issues early warnings for hydropower unit faults according to the prediction data set and the fault space, obtaining an initial early-warning result from each model; fuses the initial early-warning results and outputs the final status early-warning result.

Further, the parameter prediction module includes:

ARIMA model unit: selects the parameters with linear characteristics in the initial data set, constructs and trains an ARIMA model, and uses the trained ARIMA model for parameter prediction;

Random forest model unit: selects the parameters with nonlinear characteristics in the initial data set, constructs and trains a random forest model, and uses the trained random forest model for parameter prediction;

LSTM neural network model unit: selects the parameters with complex characteristics in the initial data set, constructs and trains an LSTM neural network model, and uses the trained LSTM neural network model for parameter prediction;

Combination weight unit: assigns the weights of the ARIMA model, the random forest model, and the LSTM neural network model with the goodness-of-fit algorithm and, according to the weights, calculates the weighted average of the parameters predicted by the three models to obtain the final output, which serves as the prediction data set.

Further, the status early-warning module includes:

SVM model unit: uses grid search to roughly select the penalty parameter c and the kernel function parameter g, then uses K-fold cross-validation to select the optimal c and g as the optimal parameters of the SVM model, and, with the prediction data set as input, classifies and issues early warnings for hydropower unit faults;

XGBoost model unit: uses the adaptive particle swarm optimization algorithm to iteratively update the number of base learners, the learning rate, and the maximum tree depth of the XGBoost model, determines the optimal parameters, and, with the prediction data set as input, classifies and issues early warnings for hydropower unit faults;

Early-warning result fusion unit: fuses the initial early-warning results obtained by the SVM model unit and the XGBoost model unit with the weighted evidence method to obtain the final status early-warning result for the hydropower unit.

As the above technical solutions show, compared with the prior art, the present invention provides a joint optimization method and system based on combined prediction for peak-shaving power supply equipment. The historical parameter data of the hydropower unit are preprocessed and their features extracted; the parameters are grouped by linear, nonlinear, and complex characteristics; and a different parameter prediction model is established for each group, with the predictions combined by weighting, so that future values of the relevant hydropower unit parameters can be predicted more accurately. The predicted parameters are input into the SVM model and the XGBoost model for hydropower unit fault early warning, and the two early-warning results are fused following weighted evidence theory, which further improves the accuracy and credibility of the early warning. Fusing multiple models overcomes the insufficient reliability of a single prediction model and the low accuracy and credibility of its classification warnings. Through data mining, the invention realizes parameter prediction and status early warning for peak-shaving power supply equipment, especially hydropower units, and provides strong support for the safe and stable operation of hydropower units and the power system.

Brief Description of the Drawings

To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.

Figure 1 is a schematic diagram of the overall flow of the present invention.

Figure 2 is a schematic diagram of the status early-warning flow of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

An embodiment of the present invention discloses a joint optimization method based on combined prediction for peak-shaving power supply equipment. The overall flow is shown in Figure 1, and the specific steps are as follows:

Obtain the internal working parameters, external environmental parameters, and grid load parameters of the hydropower unit, and construct a data table.

Preprocess the data in the data table and extract features to obtain the initial data set.

Construct the parameter prediction model: according to the linear, nonlinear, and complex characteristics of the data in the initial data set, construct an ARIMA model, a random forest model, and an LSTM neural network model for parameter prediction; calculate combination weights and take the weighted average of the predicted parameters to obtain the final output, which serves as the prediction data set.

Construct the status early-warning model: define multiple different types of hydropower unit faults as a fault space and, using the initial data set, build an SVM model and an XGBoost model, optimizing the parameters of both models with an optimization algorithm; classify and issue early warnings for hydropower unit faults according to the prediction data set and the fault space, obtaining an initial early-warning result from each model; fuse the initial early-warning results of the two models and output the final status early-warning result.

In one embodiment of the present invention, the internal working parameters obtained for the hydropower unit include: the inlet water pressure, vibration, and swing coefficient data of the turbine, the AC impedance and magnetic pole coefficient data of the generator rotor, and the oil temperature and gas concentration data of the transformer; the external environmental parameters obtained include: air temperature, ground temperature, relative humidity, and average wind speed data; the grid load parameters obtained include: operating voltage, active power, and reactive power. The data table is then constructed according to the acquisition time of each parameter; see Table 1.

Table 1. Data table

In one embodiment of the present invention, determining the initial data set further includes:

First, determine the proportion of missing values in the data table relative to the total sample size. When the proportion is greater than a threshold, the statistical variant of imputation is used, i.e. each missing value in the data set is filled with the mean of the corresponding data of the same category; when it is less than the threshold, the missing values are deleted directly. When many samples have missing values, the usable sample size is small, and imputation preserves the number of samples and of parameters in the data set and thus the accuracy of the trained model. When few samples have missing values, the samples are themselves somewhat redundant, and direct deletion avoids the extra computation that imputation would incur.
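
The threshold rule for missing values can be sketched as follows. Several details are assumptions made for illustration only: the threshold value (0.05 by default), measuring the missing proportion as the share of rows with at least one missing entry, treating a proportion equal to the threshold as the deletion case, and reading "mean of the corresponding data of the same category" as the mean of the same parameter column. The field names are hypothetical.

```python
def is_complete(row):
    """True when no field of the row is missing (None)."""
    return all(value is not None for value in row.values())

def handle_missing(rows, threshold=0.05):
    """Delete incomplete rows when few are affected, otherwise impute column means."""
    missing_ratio = sum(not is_complete(row) for row in rows) / len(rows)
    if missing_ratio <= threshold:
        # Few gaps: the data are redundant enough to drop incomplete rows.
        return [dict(row) for row in rows if is_complete(row)]
    # Many gaps: keep every row and fill each gap with that column's mean.
    # Assumes every column has at least one observed value.
    means = {}
    for key in rows[0]:
        observed = [row[key] for row in rows if row[key] is not None]
        means[key] = sum(observed) / len(observed)
    return [{k: (means[k] if v is None else v) for k, v in row.items()} for row in rows]

rows = [
    {"oil_temp": 50.0, "vibration": 0.2},
    {"oil_temp": None, "vibration": 0.4},
    {"oil_temp": 54.0, "vibration": 0.3},
]
imputed = handle_missing(rows, threshold=0.05)  # one row in three is incomplete: impute
dropped = handle_missing(rows, threshold=0.5)   # below the threshold: delete the row
```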

Next, outliers are processed: for samples without a hydropower unit fault, outliers are deleted directly; for samples with a hydropower unit fault, outliers are retained.

Then, the data are standardized with the Z-score method, whose transformation function is:

$$x^{*} = \frac{x - \mu}{\sigma}$$

where μ is the mean of all sample data, σ is the standard deviation of all sample data, $x^{*}$ is the standardized value, and x is the value before processing.
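
A minimal sketch of Z-score standardization for one parameter column is given below. It assumes that "standard deviation of all sample data" means the population standard deviation (`statistics.pstdev`); the sample values are made up.

```python
import statistics

def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(x - mu) / sigma for x in values]

oil_temps = [48.0, 50.0, 52.0, 54.0, 56.0]
standardized = z_score(oil_temps)
```

Standardizing each parameter separately puts quantities with different units, such as oil temperature and active power, on a comparable scale before they are fed to the models.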

Finally, features are extracted with the out-of-bag (OOB) data feature permutation method to obtain the initial data set.

In one embodiment of the present invention, constructing the ARIMA model, the random forest model, and the LSTM neural network model for parameter prediction includes:

Select the parameters with linear characteristics in the initial data set, construct and train an ARIMA model, and use the trained ARIMA model for parameter prediction; for example, the output power of a turbine is proportional to the inlet water pressure and usually behaves linearly within the normal operating range. Select the parameters with nonlinear characteristics in the initial data set, construct and train a random forest model, and use the trained random forest model for parameter prediction; for example, the gas concentration in the oil tank usually behaves nonlinearly, because it is affected by several factors such as temperature, pressure, and oil quality. Select the parameters with complex characteristics in the initial data set, construct and train an LSTM neural network model, and use the trained LSTM neural network model for parameter prediction; for example, the swing coefficient depends on the structure and operating conditions of equipment such as the turbine, is difficult to describe with a linear or nonlinear model, and therefore has complex characteristics. A single prediction model tends to focus on either the linear or the nonlinear characteristics of the parameters and cannot identify all the relationships; combining several models into a combined status prediction model achieves better results than a single model.

The ARIMA model combines an autoregressive model and a moving average model with a differencing operation. Its initialization coefficients include the number of autoregressive terms, i.e. the order of the autoregressive model, and the number of moving average terms.

The random forest model averages the predictions of multiple CART regression trees before output. In the present invention, the number of regression trees is between 100 and 200, Bootstrap sampling is used to construct each base decision tree, and the minimum number of samples is set to 1% of the data size.
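
The Bootstrap sampling and averaging steps of the random forest can be sketched without any tree implementation. This is only an outline under assumptions: the sample count, the fixed seed, and the helper names are hypothetical, and the rows left out of each bag are returned because out-of-bag data are what a permutation-based feature importance would be evaluated on.

```python
import random

def bootstrap_sample(n_samples, rng):
    """Draw n_samples row indices with replacement; also return the out-of-bag rows."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

def forest_predict(tree_predictions):
    """Random forest regression output: the mean of the individual tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_samples = 1000        # hypothetical data size
n_trees = 100           # lower end of the 100 to 200 range in the text
min_samples = max(1, int(0.01 * n_samples))  # 1% of the data size
bags = [bootstrap_sample(n_samples, rng) for _ in range(n_trees)]
```

Each tree would be trained on the rows in its `in_bag` list, and the forest prediction for a new sample would be `forest_predict` applied to the outputs of all trees.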

LSTM神经网络模型的输入维数为9，输出维数为1，隐层神经元个数为15，使用随机梯度下降Adam算法优化，损失函数为交叉熵函数。本发明采用深度学习库TensorFlow构建LSTM预测模型，后端采用基于C++开发的TensorFlow-GPU开源框架。将数据按照一定比例进行划分，其中4000组特征数据作为训练集样本，120组数据作为测试集样本。LSTM模型构建时对比不同层数下的预测效果，选取合适的隐藏层层数，确定模型相关参数，其中神经元个数为15，步长为10，学习率为0.01，Dropout的比例为0.5。The input dimension of the LSTM neural network model is 9, the output dimension is 1, and the number of hidden layer neurons is 15. It is optimized using the stochastic gradient descent Adam algorithm, and the loss function is the cross-entropy function. This invention uses the deep learning library TensorFlow to build an LSTM prediction model, and the backend uses the TensorFlow-GPU open source framework developed based on C++. The data is divided according to a certain proportion, with 4000 sets of feature data as training set samples and 120 sets of data as test set samples. When constructing the LSTM model, compare the prediction effects under different numbers of layers, select the appropriate number of hidden layers, and determine the relevant parameters of the model. The number of neurons is 15, the step size is 10, the learning rate is 0.01, and the Dropout ratio is 0.5.

In the coefficient of determination $p$, $\hat{z}_i$ is the predicted value of the i-th parameter, $\bar{z}$ is the mean of the true values, $z_i$ is the true value of the i-th parameter, and $n$ is the total number of predicted parameters.

Then, the tangent function is used to transform the coefficient of determination p, which amplifies the differences in prediction performance, and the optimized combination weight formula is finally obtained:

where $w_j$ is the combination weight of the j-th sub-model, $z_j$ is the p value of the j-th sub-model, and $h(z_j)$ is the p value of the j-th sub-model after the tangent transformation.
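The goodness-of-fit weighting described above can be sketched as follows. The formulas themselves are not reproduced in this text, so the sketch rests on stated assumptions: the coefficient of determination takes its standard form, p = 1 - SS_res / SS_tot; the transformation is h(z) = tan(z); and the weights are the transformed values normalized to sum to 1. The function names are illustrative only.

```python
import math


def r_squared(actual, predicted):
    """Standard coefficient of determination (assumed form): 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((z - z_hat) ** 2 for z, z_hat in zip(actual, predicted))
    ss_tot = sum((z - mean_actual) ** 2 for z in actual)
    return 1.0 - ss_res / ss_tot


def combination_weights(p_values, h=math.tan):
    """Transform each sub-model's p value with h and normalize to sum to 1.

    ASSUMPTION: h(z) = tan(z) and w_j = h(z_j) / sum_k h(z_k); the source only
    states that a tangent function is applied to amplify differences.
    """
    transformed = [h(p) for p in p_values]
    total = sum(transformed)
    return [t / total for t in transformed]


def combine(predictions, weights):
    """Point-by-point weighted average of the sub-model prediction series."""
    n_points = len(predictions[0])
    return [
        sum(w * series[i] for w, series in zip(weights, predictions))
        for i in range(n_points)
    ]
```

Because tan is convex and increasing on [0, 1], a sub-model with a slightly higher p receives a disproportionately larger weight than it would under plain normalization of p, which is the "difference amplification" the text refers to.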

As shown in Figure 2, one embodiment of the present application further includes:

Multiple different types of hydropower unit faults are defined as a space Θ = {F1, F2, …, F10}, which serves as the identification framework.

To construct the support vector machine (SVM) model, a grid search is first used to roughly select the penalty parameter c and the kernel function parameter g; K-fold cross-validation is then used to select the optimal c and g values as the optimal parameters of the SVM model;

To construct the XGBoost model, an adaptive particle swarm optimization algorithm continuously updates the number of base learners, the learning rate, and the maximum tree depth of the XGBoost model to determine the optimal parameters;

The initial early-warning results are obtained as follows: the optimized SVM model and XGBoost model take the prediction data set as input and classify the faults of the hydropower unit, yielding two initial early-warning results;

The final status early-warning result is obtained as follows: the weighted evidence method is used to fuse the information of the two initial early-warning results, yielding the final status early-warning result for the hydropower unit.

The K-fold cross-validation method is specifically:

The K-fold cross-validation method selects the optimal c and g values as follows: the data in the initial data set are divided into k subsets, of which k-1 serve as training subsets and 1 serves as the test subset; a pair of c and g values is chosen, the SVM model is trained, and the mean square error (MSE) on the test subset is computed; the MSE is computed for multiple different pairs of c and g values, and the pair with the smallest MSE is selected. In the present invention K = 4, that is, 4-fold cross-validation: the hydropower unit data samples are first divided into 4 subsets, the first 3 are used as the training set and the last 1 is used for testing, and the procedure is then repeated, each repetition yielding an MSE. Training is carried out 10 times, the results are collated and analyzed after each run, and the 10 MSE values are averaged. Finally, the c and g values with the smallest MSE are selected as the optimal parameters for the SVM model.
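The selection loop described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: `fold_mse` is a hypothetical callback standing in for "train an SVM with (c, g) on the training subsets and return the MSE on the test subset", and the folds are assigned by interleaving sample indices, which is an assumption.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each fold is the test subset once.

    ASSUMPTION: samples are assigned to folds by interleaving (i, i+k, i+2k, ...).
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def select_c_g(c_grid, g_grid, n_samples, fold_mse, k=4):
    """Return the (c, g) pair with the smallest MSE averaged over the k folds.

    fold_mse(c, g, train_idx, test_idx) is a hypothetical callback that trains
    a model with the given parameters and returns the test-subset MSE.
    """
    best_pair, best_mse = None, float("inf")
    for c, g in product(c_grid, g_grid):
        scores = [
            fold_mse(c, g, train_idx, test_idx)
            for train_idx, test_idx in k_fold_indices(n_samples, k)
        ]
        mean_mse = sum(scores) / len(scores)
        if mean_mse < best_mse:
            best_pair, best_mse = (c, g), mean_mse
    return best_pair, best_mse
```

The default `k=4` mirrors the 4-fold setting stated in the text; the coarse grids passed as `c_grid` and `g_grid` correspond to the preceding grid-search step.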

The adaptive particle swarm optimization algorithm is specifically:

where $w_{max}$ and $w_{min}$ are the maximum and minimum values of the inertia weight, $k$ is the current iteration number, and $k_{max}$ is the maximum number of iterations.
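The inertia weight formula itself is not reproduced in this text. The quantities it names (a maximum and minimum weight, the current iteration, and the maximum iteration count) match the widely used linearly decreasing schedule, so the sketch below assumes that form. The default values 0.9 and 0.4 are common illustrative choices, not values taken from the source.

```python
def linear_inertia_weight(k, k_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight.

    ASSUMED form: w = w_max - (w_max - w_min) * k / k_max.
    The defaults for w_max and w_min are hypothetical.
    """
    if not 0 <= k <= k_max:
        raise ValueError("k must lie in [0, k_max]")
    return w_max - (w_max - w_min) * k / k_max
```

Under this schedule the swarm starts with a large weight that favors global exploration and ends with a small weight that favors local refinement.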

The inertia weight ω is the key parameter that balances the algorithm's local and global search. A larger ω gives stronger global search capability but can easily overshoot the optimal solution, while a smaller ω gives stronger local search capability but can easily become trapped in a local optimum. The inertia weight of the adaptive particle swarm optimization algorithm is therefore updated dynamically according to the degree of evolution and the degree of aggregation of the swarm, which optimizes the movement of the particles in the search space. The improved adaptive inertia weight update formula, as given in claim 6, is:

$$w^{*} = w_{in} - (w_{in} - w_{min}) \cdot evol + (w_{max} - w_{in}) \cdot aggr$$

where $w^{*}$ is the adaptive inertia weight, $w_{in}$ is the initial inertia weight, and $evol$ and $aggr$ are the degree of evolution and the degree of aggregation respectively.
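The adaptive update can be written directly from the expression stated in claim 6, w* = w_in - (w_in - w_min) * evol + (w_max - w_in) * aggr. The sketch below only evaluates that expression; the function name and the range check are illustrative additions.

```python
def adaptive_inertia_weight(evol, aggr, w_in, w_min, w_max):
    """Evaluate w* = w_in - (w_in - w_min) * evol + (w_max - w_in) * aggr.

    evol and aggr are the degree of evolution and degree of aggregation,
    both expected in [0, 1] as stated in the text.
    """
    if not (0.0 <= evol <= 1.0 and 0.0 <= aggr <= 1.0):
        raise ValueError("evol and aggr must lie in [0, 1]")
    return w_in - (w_in - w_min) * evol + (w_max - w_in) * aggr
```

The boundary cases show the design intent: with evol = 1 and aggr = 0 the weight drops to w_min, tightening the local search, while with evol = 0 and aggr = 1 it rises to w_max, pushing a tightly clustered swarm back toward global exploration.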

The degree of evolution is given by:

where $evol \in [0, 1]$ represents the degree of evolution of the swarm; $e$ is the natural constant; $c^{k}$ and $c^{k-1}$ are the fitness changes produced by the k-th and (k-1)-th iterations respectively. When the current change is larger than the previous one, the global optimum still has considerable room for improvement and evol takes a smaller value, with evol defined to take its minimum value 0 when $c^{k-1} = 0$; conversely, evol takes its maximum value 1. In the special case where $c^{k-1} = 0$ and $c^{k} = 0$, the position of the global optimum has not changed over several consecutive iterations; evol is then defined as 1, and the degree of evolution of the swarm is at its maximum.

The degree of aggregation is given by:

where $aggr \in [0, 1]$ represents the degree of aggregation of the swarm; $N_1, N_2, N_3, …, N_n$ are the numbers of particles contained in each subspace, $N$ is the total number of particles, and the base of the logarithm equals the number of subspaces $n$. When the particles are distributed evenly across all subspaces, that is, $N_1 = N_2 = N_3 = … = N_n$, aggr takes its minimum value 0 and the swarm is least aggregated; when all particles lie in the same subspace, that is, some $N_i = N$, aggr takes its maximum value 1 and the swarm is most aggregated.
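The aggregation formula is not reproduced in this text. One expression that satisfies every property stated above (a logarithm with base n, value 0 for an even distribution, value 1 when all particles share one subspace) is the entropy-style form aggr = 1 + Σ (N_i / N) · log_n(N_i / N). The sketch below assumes that form; it is an illustration consistent with the description, not necessarily the patent's exact formula.

```python
import math


def aggregation_degree(counts):
    """Entropy-style aggregation degree in [0, 1].

    ASSUMED form: aggr = 1 + sum_i (N_i / N) * log_n(N_i / N), where counts
    holds the particle count N_i of each of the n subspaces. Empty subspaces
    contribute 0, following the usual convention 0 * log(0) = 0.
    """
    n = len(counts)
    total = sum(counts)
    if n < 2 or total <= 0:
        raise ValueError("need at least 2 subspaces and at least 1 particle")
    entropy_term = sum(
        (c / total) * math.log(c / total, n) for c in counts if c > 0
    )
    return 1.0 + entropy_term
```

For example, `aggregation_degree([5, 5, 5, 5])` evaluates the evenly distributed case and `aggregation_degree([20, 0, 0, 0])` the fully concentrated case, matching the two extremes described in the text.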

The weighted evidence method is specifically:

The fusion formula uses the average degree of support of the evidence sources for a focal element; when this average equals 0, the output of one of the diagnostic models does not belong to the hydropower unit fault space.
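The fusion formula is not reproduced in this text. The quantities it names (two basic probability assignments m1 and m2, a conflict coefficient H, and an average support for each focal element) match a common conflict-redistribution variant of evidence combination, in which the conflicting mass is reassigned in proportion to average support. The sketch below assumes that variant and assumes singleton focal elements (one per fault type); it should be read as an illustration, not as the patent's exact formula.

```python
def fuse_weighted_evidence(m1, m2):
    """Fuse two basic probability assignments over singleton fault labels.

    ASSUMED form: m(C) = m1(C) * m2(C) + H * q(C), where
      H    = 1 - sum_C m1(C) * m2(C)   (conflict; assumes each input sums to 1)
      q(C) = (m1(C) + m2(C)) / 2       (average support for C)
    m1 and m2 are dicts mapping a fault label to its probability mass.
    """
    labels = sorted(set(m1) | set(m2))
    agreement = {c: m1.get(c, 0.0) * m2.get(c, 0.0) for c in labels}
    conflict = 1.0 - sum(agreement.values())
    avg_support = {c: (m1.get(c, 0.0) + m2.get(c, 0.0)) / 2.0 for c in labels}
    return {c: agreement[c] + conflict * avg_support[c] for c in labels}


# Hypothetical outputs of the SVM and XGBoost classifiers for two fault types.
svm_result = {"F1": 0.7, "F2": 0.3}
xgb_result = {"F1": 0.6, "F2": 0.4}
fused = fuse_weighted_evidence(svm_result, xgb_result)
```

Under these assumptions the fused masses still sum to 1, and highly conflicting evidence is not discarded by normalization but shared out according to how strongly each source supports each fault.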

One embodiment of the present application further includes evaluating the status early-warning performance for the hydropower unit by calculating average precision indicators, which include average accuracy, precision, recall, and the harmonic mean F1-score, where:

Average accuracy = (true positives + true negatives) / total number of samples;

Precision is the proportion of samples correctly predicted as positive among all samples predicted as positive; it measures how reliable the model's positive predictions are. It is calculated as:

Precision = true positives / (true positives + false positives);

Recall is the proportion of samples correctly predicted as positive among all samples that are actually positive; it measures how many of the positive samples the model identifies. It is calculated as:

Recall = true positives / (true positives + false negatives);

The F1-score is the harmonic mean of precision and recall; it takes both into account and balances the two. It is calculated as:

F1-score = 2 * (precision * recall) / (precision + recall).
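The four indicators above follow directly from the confusion-matrix counts for one fault class. The sketch below computes them from those counts; the function name and the zero-division guards are illustrative additions. How the per-class values are averaged over the fault types is not specified in the text.

```python
def warning_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives for one fault class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no positive predictions or samples).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

A call such as `warning_metrics(tp=6, fp=2, fn=4, tn=8)` returns all four values in one dictionary, which makes it straightforward to compare the SVM, XGBoost, and fused results side by side.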

One embodiment of the present invention discloses a joint optimization system based on peak-shaving power equipment combination prediction, including:

Data preprocessing module: determines the initial data set according to the data table;

Status early-warning module: defines multiple different types of hydropower unit faults as a fault space and, combined with the initial data set, builds a support vector machine (SVM) model and an XGBoost model respectively, using an optimization algorithm to optimize the parameters of the two models; classifies the faults of the hydropower unit and issues early warnings according to the prediction data set and the fault space, obtaining the respective initial early-warning results; and fuses the information of the initial early-warning results to output the final status early-warning result.

In one embodiment of the present invention, the parameter prediction module includes:

In one embodiment of the present invention, the status early-warning module includes:

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be cross-referenced. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for the relevant details, refer to the description of the method.

The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A joint optimization method based on peak-shaving power equipment combination prediction, which is characterized by the following specific steps:

Obtain the internal working parameters, external environmental parameters and power grid load parameters of the hydropower unit and construct a data table;

According to the data table, determine the initial data set;

According to the linear characteristics, nonlinear characteristics and complex characteristics of the data in the initial data set, construct the ARIMA model, random forest model, and LSTM neural network model for parameter prediction; calculate the combination weight, perform a weighted average of the prediction parameters, and obtain the final output result as the prediction data set;

Define multiple different types of hydropower unit faults as a fault space and, combined with the initial data set, build a support vector machine (SVM) model and an XGBoost model respectively, and use an optimization algorithm to optimize the parameters of the two models; according to the prediction data set and the fault space, classify the faults of the hydropower unit and issue early warnings to obtain the respective initial early-warning results; fuse the information of the initial early-warning results and output the final status early-warning result.

2. A joint optimization method based on peak-shaving power equipment combination prediction according to claim 1, characterized in that the internal working parameters include: water turbine inlet pressure, vibration, swing coefficient data, generator rotor AC impedance, magnetic pole coefficient data, and oil temperature and gas concentration data in the transformer; the environmental parameters include: air temperature, ground temperature, relative humidity, and average wind speed data; the power grid load parameters include: operating voltage, active power, and reactive power;

Build a data table based on the acquisition time of each parameter.

3. A joint optimization method based on peak-shaving power equipment combination prediction according to claim 1, characterized in that the determination of the initial data set includes:

First, determine the proportion of missing values in the data table relative to the total sample size: when the proportion is greater than the threshold, use interpolation to fill in the missing values in the data set; when it is less than the threshold, use the deletion method to delete the missing values directly; then perform outlier processing; then use the Z-score method for data standardization; finally, use the out-of-bag data feature replacement method for feature extraction to obtain the initial data set.

4. A joint optimization method based on peak-shaving power equipment combination prediction according to claim 1, characterized in that the construction of an ARIMA model, a random forest model, and an LSTM neural network model for parameter prediction includes:

Select parameters with linear characteristics in the initial data set, build an ARIMA model and train it, and use the trained ARIMA model to predict parameters; select parameters with nonlinear characteristics in the initial data set, build a random forest model and train it, and use the trained random forest model to predict parameters; select parameters with complex characteristics in the initial data set, build an LSTM neural network model and train it, and use the trained LSTM neural network model to predict parameters;

Calculate the combination weight, perform a weighted average of the prediction parameters, and obtain the final output result, including:

Use the goodness-of-fit algorithm to allocate the weight values of the ARIMA model, random forest model, and LSTM neural network model. First, calculate the coefficient of determination p of the goodness-of-fit algorithm; the normal value range of p is [0, 1], and its calculation formula is:

where $\hat{z}_i$ is the predicted value of the i-th parameter, $\bar{z}$ is the mean of the true values, $z_i$ is the true value of the i-th parameter, and $n$ is the total number of predicted parameters;

Then, use the tangent function to transform the coefficient of determination p, and finally obtain the optimized combination weight calculation formula:

where $w_j$ is the combination weight of the j-th sub-model, $z_j$ is the p value of the j-th sub-model, and $h(z_j)$ is the p value of the j-th sub-model after the tangent transformation;

According to the weight values $w_j$, calculate the weighted average of the prediction parameters of the ARIMA model, random forest model, and LSTM neural network model to obtain the final output result as the prediction data set.

5. A joint optimization method based on peak-shaving power equipment combination prediction according to claim 4, characterized in that,

The ARIMA model is composed of an autoregressive model and a moving average model while introducing a difference operation, and its initialization coefficient includes the number of autoregressive terms and the number of moving average terms;

The random forest model selects Bootstrap sampling to build each basic decision tree;

The LSTM neural network model is optimized using the stochastic gradient descent Adam algorithm, and the loss function is the cross-entropy function.

6. A joint optimization method based on peak-shaving power equipment combination prediction according to claim 1, characterized in that,

The fault space is Θ = {F1, F2, …, F10}, which serves as the hydropower unit fault identification framework;

To construct the support vector machine (SVM) model, the grid search method is used to roughly select the penalty parameter c and the kernel function parameter g; the K-fold cross-validation method is then used to select the optimal c and g values as the optimal parameters of the SVM model;

In the construction of the XGBoost model, the adaptive particle swarm optimization algorithm is used to continuously update the number of base learners, learning rate, and maximum tree depth of the XGBoost model to determine the optimization parameters;

The initial warning results include: using the optimized support vector machine SVM model and XGBoost model, using the prediction data set as input, to classify and warn the faults of the hydropower unit, and obtain the initial warning results respectively;

The final status early-warning result is: the hydropower unit status early-warning result obtained by fusing the information of the two initial early-warning results through the weighted evidence method;

The K-fold cross-validation method is specifically:

Divide the data in the initial data set into k subsets, of which k-1 subsets are used as training subsets and 1 subset is used as the test subset; select a pair of c and g values to train the SVM model and calculate the mean square error (MSE) of the test-subset results; calculate the MSE for multiple different pairs of c and g values, and select the optimal c and g values according to the minimum-MSE principle;

The adaptive particle swarm optimization algorithm is specifically:

Determine the inertia weight w of the adaptive particle swarm optimization algorithm, and its formula is:

where $w_{max}$ and $w_{min}$ are the maximum and minimum values of the inertia weight respectively, $k$ is the current iteration number, and $k_{max}$ is the maximum number of iterations;

According to the degree of evolution and the degree of aggregation of the particle swarm, the inertia weight of the adaptive particle swarm optimization algorithm is dynamically updated. The improved adaptive inertia weight update formula is:

$$w^{*} = w_{in} - (w_{in} - w_{min}) \cdot evol + (w_{max} - w_{in}) \cdot aggr$$

where $w^{*}$ is the adaptive inertia weight, $w_{in}$ is the initial inertia weight, and $evol$ and $aggr$ are the degree of evolution and the degree of aggregation respectively;

The weighted evidence method is specifically:

Each hydropower unit fault in the two initial early-warning results is defined to have two bodies of evidence to be combined, E1 and E2, with corresponding basic probability assignment functions m1 and m2 and corresponding focal elements Ap and Bq respectively; the common focal element of Ap and Bq is C, and the conflict coefficient between m1 and m2 is H. The fusion formula is:

where the fusion formula uses the average degree of support of the evidence sources for the focal element;

Through the fusion formula, the early warning results of different types of hydropower unit failures are obtained through fusion.

7. A joint optimization method based on peak-shaving power equipment combination prediction according to claim 1, characterized in that it also includes evaluating the status early-warning performance for hydropower units by calculating average precision indicators, and the average precision indicators include: average accuracy, precision, recall, and the harmonic mean F1-score.

8. A joint optimization system based on peak-shaving power equipment combination prediction, which is characterized by including:

Data acquisition module: Obtain the internal working parameters, external environmental parameters and power grid load parameters of the hydropower unit, and construct a data table;

Data preprocessing module: determine the initial data set according to the data table;

Parameter prediction module: Based on the linear characteristics, nonlinear characteristics, and complex characteristics of the data in the initial data set, construct an ARIMA model, a random forest model, and an LSTM neural network model for parameter prediction; calculate the combination weights, perform a weighted average of the prediction parameters, and obtain the final output result as the prediction data set;

Status early-warning module: Define multiple different types of hydropower unit faults as a fault space and, combined with the initial data set, build a support vector machine (SVM) model and an XGBoost model respectively, and use an optimization algorithm to optimize the parameters of the two models; according to the prediction data set and the fault space, classify the faults of the hydropower unit and issue early warnings to obtain the respective initial early-warning results; fuse the information of the initial early-warning results and output the final status early-warning result.

9. A joint optimization system based on peak-shaving power equipment combination prediction according to claim 8, characterized in that the parameter prediction module includes:

ARIMA model unit: Select parameters with linear characteristics in the initial data set, build an ARIMA model and train it, and use the trained ARIMA model to predict parameters;

Random forest model unit: select parameters with non-linear characteristics in the initial data set, build a random forest model and train it, and use the trained random forest model to predict parameters;

LSTM neural network model unit: Select parameters with complex characteristics in the initial data set, build and train an LSTM neural network model, and use the trained LSTM neural network model to predict parameters;

Combined weight unit: Use the goodness-of-fit algorithm to allocate the weight values of the ARIMA model, random forest model, and LSTM neural network model, and calculate the weighted average of the prediction parameters of the ARIMA model, random forest model, and LSTM neural network model based on the weight values. The final output result is used as a prediction data set.

10. A joint optimization system based on peak-shaving power equipment combination prediction according to claim 8, characterized in that the status warning module includes:

SVM model unit: Use the grid search method to roughly select the penalty parameter c and the kernel function parameter g; then use the K-fold cross-validation method to select the optimal c and g values as the optimal parameters of the support vector machine (SVM) model, and use the prediction data set as input to classify hydropower unit faults and issue early warnings;

XGBoost model unit: Use the adaptive particle swarm optimization algorithm to continuously update the number of base learners, the learning rate, and the maximum tree depth of the XGBoost model, determine the optimal parameters, and use the prediction data set as input to classify hydropower unit faults and issue early warnings;

Early-warning result fusion unit: Through the weighted evidence method, fuse the information of the initial early-warning results obtained by the SVM model unit and the XGBoost model unit to obtain the final status early-warning result for the hydropower unit.