
CN115497574A - A method and system for predicting compressive strength of HPC based on model fusion - Google Patents


Publication number
CN115497574A
Authority
CN
China
Prior art keywords
model
compressive strength
hpc
prediction
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211078389.2A
Other languages
Chinese (zh)
Other versions
CN115497574B (en)
Inventor
张永涛
田唯
肖垚
王永威
朱浩
李焜耀
杨华东
郑建新
王紫超
刘志昂
陈圆
薛现凯
李�浩
代百华
周浩
孙南昌
杨切
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCCC Second Harbor Engineering Co
CCCC Highway Long Bridge Construction National Engineering Research Center Co Ltd
Original Assignee
CCCC Second Harbor Engineering Co
CCCC Highway Long Bridge Construction National Engineering Research Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCCC Second Harbor Engineering Co, CCCC Highway Long Bridge Construction National Engineering Research Center Co Ltd filed Critical CCCC Second Harbor Engineering Co
Priority to CN202211078389.2A priority Critical patent/CN115497574B/en
Publication of CN115497574A publication Critical patent/CN115497574A/en
Application granted granted Critical
Publication of CN115497574B publication Critical patent/CN115497574B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/10 Analysis or design of chemical reactions, syntheses or processes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medical Informatics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an HPC compressive strength prediction method and system based on model fusion. The method comprises: collecting parameter data for high-performance concrete; exploratory data analysis and data cleaning; outlier handling and data transformation of the concrete data; feature engineering on the concrete data; constructing concrete compressive strength prediction models; optimizing the parameters of the prediction models; fusing the prediction models; and performing SHAP-based interpretability analysis on the fused prediction model. The method and system overcome the drawbacks of traditional neural network models, which are difficult to train and require large amounts of data. The combined average of the results of several models is also more reliable than the prediction of a single model. In addition, the method has a short test cycle, high accuracy, and low test cost, and is more feasible in engineering practice than the traditional empirical formula method and test method.

Description

An HPC compressive strength prediction method and system based on model fusion

Technical field

The invention belongs to the technical field of large-scale caisson construction, and in particular relates to an HPC compressive strength prediction method and system based on model fusion.

Background

HPC (high-performance concrete) is widely used in the construction of long-span bridges because of its high strength and high durability. Compressive strength is an important indicator of concrete quality and largely reflects the safety performance of a building structure. An accurate method for predicting the compressive strength of high-performance concrete is therefore of great significance for the precise control of construction projects and the scientific evaluation of engineering projects.

The methods currently used for compressive strength prediction fall into three main groups: experience-based formula methods, test methods, and methods based on statistical machine learning.

The experience-based formula method builds on manual experience: a complex mathematical model is fitted to the various parameters of the concrete to obtain a compressive strength calculation model. This type of method depends heavily on manual experience, its iterative calculation process is complex, and its fitting accuracy is very limited. It is not suitable for calculating the compressive strength of materials such as high-performance concrete, which combine many types of constituent materials in highly complex mix proportions.

The test method monitors the structure of the high-performance concrete material before and after forming with various test instruments in order to obtain the compressive strength. This type of method has high test accuracy and its results are highly credible. However, its test cycle is long, its test cost is high, and testing in a complex on-site construction environment is very dangerous, so it is used mostly for compressive strength testing in laboratory environments.

The method based on statistical machine learning takes machine learning as its theoretical basis and establishes a compressive strength prediction model directly in a data-driven way, without requiring many prior assumptions. It has low cost, a short test cycle, and high accuracy, and therefore has very high research and application value.

To date, several studies have used machine learning to predict the compressive strength of concrete, for example methods based on the AdaBoost algorithm, on random forests and intelligent algorithms, on BP or RBF neural networks, on support vector machines (SVM), on linear regression (LR), and on deep learning (DL). These methods still have shortcomings. All of them predict the compressive strength from the output of a single model, which limits their reliability. Methods based on artificial neural networks or deep learning depend heavily on large amounts of experimental data and are not suitable for scenarios such as civil engineering, where risk is high and data collection is difficult; such models are also difficult to train and easily fall into local optima or overfit. Methods based on SVM or LR are highly susceptible to outliers, so that their predictions can differ considerably from the actual results, and the LR method struggles to fit the complex nonlinear relationships among the components of HPC. In addition, the existing prediction methods are black-box models as a whole: they lack interpretability, the specific influence of each data sample on the compressive strength cannot be determined, and they therefore offer little specific guidance for actual engineering projects.

Summary of the invention

Current HPC (high-performance concrete) compressive strength prediction methods depend strongly on manual experience, require large amounts of data, involve a complicated model training process, give predictions of limited accuracy, and produce results that lack interpretability. To address these problems, the invention proposes a method for accurately predicting the compressive strength of ordinary concrete and, in particular, of HPC.

A model-fusion-based HPC compressive strength prediction method that achieves the first object of the invention comprises the following steps:

S1. Train a first prediction model and a second prediction model separately on collected historical parameter data of HPC, obtaining a trained first prediction model and a trained second prediction model for predicting the compressive strength of HPC; HPC denotes high-performance concrete;

S2. Perform a composite operation on the trained first and second prediction models to obtain a fusion model for HPC compressive strength prediction; the fusion model outputs the final predicted HPC compressive strength.

In a further technical solution, step S3 follows step S2:

S3. Use a first algorithm to interpret and analyze the fusion model for HPC compressive strength prediction, obtaining the contribution of the content of each HPC component parameter to the HPC compressive strength output by the fusion model. The contribution value measures whether the content of each HPC component increases the predicted compressive strength of the HPC, and is used to guide actual concrete mix design.

In a further technical solution, the first algorithm is the SHAP interpretability algorithm. The contribution value is expressed as the Shapley value, abbreviated Ψ, and is defined as follows:

$$\Psi_j(val) = \sum_{S \subseteq \{x_1,\ldots,x_p\} \setminus \{x_j\}} \frac{|S|!\,(p-|S|-1)!}{p!}\,\bigl(val_x(S \cup \{x_j\}) - val_x(S)\bigr)$$

where:

S is a subset of the features input to the model being explained, i.e. a set of concrete parameters input to the fusion model for HPC compressive strength prediction;

$x_j$ is the j-th feature variable of the sample being explained, i.e. the j-th concrete parameter;

p is the total number of features, i.e. the total number of concrete parameters;

$val_x(S)$ is the model's prediction for sample x when S is used as the input feature set, i.e. the compressive strength output by the model; x is the sample, its elements are denoted $x_i$, and $x_i$ is the value of the i-th feature variable;

The SHAP value expresses the importance of the j-th feature to the model output, i.e. its marginal contribution, and the Shapley value is the mean of these marginal contributions. The interpretation result for the fusion model for HPC compressive strength prediction is defined as follows:

$$g(z') = \Psi_0 + \sum_{j=1}^{M} \Psi_j z'_j$$

where:

g is the compressive strength prediction model being explained, i.e. the fusion model for HPC compressive strength prediction;

$z' \in \{0,1\}^M$ is the coalition vector indicating whether each feature $z_j$ ($j \in [1, M]$) is present, where $z_j$ is the j-th concrete parameter input to the fused HPC compressive strength prediction model; $z'$ marks whether each of $z_1, \ldots, z_M$ is present in the parameter set input to the fusion model;

M is the number of coalition features, i.e. the number of concrete parameters input to the fusion model for HPC compressive strength prediction;

$\Psi_j$ is the Shapley value attributed to feature j, i.e. the contribution of the j-th parameter to the prediction of the fusion model, which is its contribution to the compressive strength;

$\Psi_0$ is the average prediction of the fusion model for HPC compressive strength prediction, i.e. the mean of the predicted compressive strengths.

The Shapley value measures the contribution of a feature to the overall prediction. When $\Psi_j > 0$, the feature raises the predicted compressive strength, i.e. it has the effect of increasing the compressive strength of the HPC.

The SHAP global feature importance of each feature is the average of the absolute values of its Shapley values over all samples, that is,

$$I_j = \frac{1}{N}\sum_{i=1}^{N}\left|\Psi_j^{(i)}\right|$$

where $\Psi_j^{(i)}$ is the Shapley value of feature j for the i-th sample and N is the number of samples.
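The Shapley value definitions above can be illustrated with a small brute-force sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent: the three-feature linear strength model, its coefficients, the baseline values, and the choice to replace features outside S by a baseline value when evaluating the value function. Exact enumeration is only practical for a handful of features; the SHAP library uses faster estimators.

```python
from itertools import combinations
from math import factorial


def shapley_values(val, p):
    """Exact Shapley values for features 0..p-1.

    val(S) takes a frozenset of feature indices and returns the model
    prediction when only the features in S are known.
    """
    psi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Weight |S|! (p - |S| - 1)! / p! for subsets of this size.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (val(s | {j}) - val(s))
        psi.append(total)
    return psi


def global_importance(psi_per_sample):
    """Mean absolute Shapley value of each feature across samples."""
    n = len(psi_per_sample)
    p = len(psi_per_sample[0])
    return [sum(abs(row[j]) for row in psi_per_sample) / n for j in range(p)]


# Hypothetical toy model (not from the patent):
# strength = 20 + 0.05 * cement - 0.1 * water + 2 * admixture
COEF = [0.05, -0.1, 2.0]
BASELINE = [300.0, 180.0, 5.0]  # assumed mean value of each feature


def make_val(x):
    """Value function for sample x: absent features take their baseline."""
    def val(s):
        return 20.0 + sum(
            c * (x[k] if k in s else BASELINE[k]) for k, c in enumerate(COEF)
        )
    return val


samples = [[400.0, 160.0, 8.0], [250.0, 200.0, 3.0]]
psi_all = [shapley_values(make_val(x), 3) for x in samples]
psi_0 = make_val(samples[0])(frozenset())           # prediction with no features known
full_0 = make_val(samples[0])(frozenset(range(3)))  # prediction with all features known
importance = global_importance(psi_all)
```

For the first sample the three contributions add up to the difference between the full prediction and the baseline prediction, which is the additive form of the explanation model g. A positive entry of `psi_all` corresponds to a feature that raises the predicted strength for that sample.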

In a further technical solution, in step S2 the fusion model for HPC compressive strength prediction is obtained by combining the first prediction model and the second prediction model with a weighted average.

In a still further technical solution, the weighted-average combination of the first and second prediction models is determined by:

$$\min_{w_1, w_2}\ \mathrm{Loss} \quad \text{s.t.} \quad w_1 + w_2 = 1,\ w_1 \ge 0,\ w_2 \ge 0$$

where:

$w_1$ is the weight of the first prediction model;

$w_2$ is the weight of the second prediction model;

Loss is the loss function of the fusion model H(x) for HPC compressive strength prediction, calculated as follows:

[Formula image BDA0003831971760000052: Loss as a function of N, $y_i$, and $\hat{y}_i$]

where:

N is the sample size, i.e. the total number of concrete sample records collected;

$y_i$ is the actual compressive strength of the concrete for the i-th sample;

$\hat{y}_i$ is the compressive strength predicted for the i-th sample by the fusion model H(x);

H(x) is expressed as follows:

$$H(x) = w_1 h_1(x) + w_2 h_2(x)$$

where:

H(x) is the final predicted HPC compressive strength output by the fusion model;

$w_1$, $w_2$ are the weights of the first and second prediction models, respectively;

$h_1$, $h_2$ are the concrete compressive strengths predicted by the first and second prediction models, respectively.
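The constrained weight search above can be sketched as follows. The sketch assumes that Loss is the mean squared error, which this text does not confirm, and all numbers are made up for illustration. Under that assumption, substituting $w_2 = 1 - w_1$ turns the problem into a one-dimensional quadratic in $w_1$, so the constrained optimum is the unconstrained minimizer clipped to [0, 1].

```python
def fuse_weights(h1, h2, y):
    """Return (w1, w2) minimizing the MSE of w1*h1 + w2*h2 against y,
    subject to w1 + w2 = 1, w1 >= 0, w2 >= 0 (MSE is an assumed loss)."""
    num = sum((yi - b) * (a - b) for a, b, yi in zip(h1, h2, y))
    den = sum((a - b) ** 2 for a, b in zip(h1, h2))
    if den == 0.0:
        # Both models agree on every sample, so any split gives the same loss.
        return 0.5, 0.5
    w1 = min(1.0, max(0.0, num / den))  # clip to the feasible interval
    return w1, 1.0 - w1


def fused_prediction(w1, w2, h1, h2):
    """H(x) = w1 * h1(x) + w2 * h2(x), evaluated per sample."""
    return [w1 * a + w2 * b for a, b in zip(h1, h2)]


def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


# Hypothetical validation-set predictions in MPa (illustrative only).
h1 = [40.0, 52.0, 31.0, 60.0]   # first model, e.g. AdaBoost
h2 = [44.0, 50.0, 35.0, 58.0]   # second model, e.g. CatBoost
y = [43.0, 50.5, 34.0, 58.5]    # measured compressive strength

w1, w2 = fuse_weights(h1, h2, y)
fused = fused_prediction(w1, w2, h1, h2)
```

With these made-up numbers the second model receives the larger weight and the fused error is no worse than that of either single model. In practice the weights would be fitted on held-out data rather than on the training set.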

In a further technical solution, the first prediction model predicts the compressive strength of HPC based on the AdaBoost algorithm.

In a further technical solution, the second prediction model predicts the compressive strength of HPC based on the CatBoost algorithm.

In a further technical solution, when the first and second prediction models are fused with the weighted average method, the weight of the CatBoost-based second prediction model is greater than the weight of the AdaBoost-based first prediction model.

In a further technical solution, before step S2 the hyperparameters of the first prediction model and the second prediction model are tuned with a Bayesian optimization method, and the tuned models are cross-validated. The hyperparameters include the number of trees and the depth; tuning yields the tuned first and second prediction models for predicting the compressive strength of HPC.

Both the AdaBoost model and the CatBoost model are tree-based ensemble models. The base model of each is a decision tree, and many base models (that is, many trees) together make up the ensemble; both are built on the Boosting ensemble learning framework. The number of trees of the AdaBoost and CatBoost models is the number of decision trees, and the depth of each model is its number of layers;

In a further technical solution, step S1 also includes deriving new feature parameters from the collected historical parameter data of HPC by feature construction. These expand the data set and make the predicted HPC compressive strength more accurate. Feature construction here means combining engineering experience with mathematical operations on different concrete parameters to obtain ratios between them.
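A minimal sketch of ratio feature construction follows. The water-cement ratio and water-binder ratio are the examples this document names among its beneficial effects; the field names, the sample values, and the definition of the binder as cement plus fly ash plus slag are assumptions made for illustration.

```python
def add_ratio_features(sample):
    """Return a copy of one mix-design record with ratio features added.

    Contents are expected in kg/m^3. The binder is assumed here to be
    cement + fly ash + slag; a project may define it differently.
    """
    out = dict(sample)
    binder = sample["cement"] + sample["fly_ash"] + sample["slag"]
    out["water_cement_ratio"] = sample["water"] / sample["cement"]
    out["water_binder_ratio"] = sample["water"] / binder
    return out


# Hypothetical record (illustrative values only).
mix = {"cement": 400.0, "fly_ash": 60.0, "slag": 40.0, "water": 160.0}
features = add_ratio_features(mix)
```

The original columns are kept alongside the new ratios, so the models can use both the absolute contents and their proportions.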

A model-fusion-based HPC compressive strength prediction system that achieves the second object of the invention comprises a model training module and a model composite operation module;

The model training module trains the first model and the second model separately on the collected historical parameter data of HPC, obtaining the trained first prediction model and second prediction model for predicting the compressive strength of HPC;

The model composite operation module performs a composite operation on the HPC compressive strengths output by the trained first and second prediction models to obtain the fusion model for HPC compressive strength prediction; the fusion model outputs the final predicted HPC compressive strength.

Further, the model composite operation module uses the weighted average method to combine the HPC compressive strengths output by the first and second prediction models.

Further, the system includes a parameter tuning module, which tunes the hyperparameters of the trained first and second prediction models to obtain the parameter-optimized first and second prediction models, and cross-validates the tuned compressive strength prediction models; the cross-validation method includes five-fold cross-validation;

Further, the system includes a model interpretation and analysis module, which uses the first algorithm to interpret and analyze the fusion model for HPC compressive strength prediction and obtain the influence of the content of each HPC component parameter on the HPC compressive strength.

Further, the system includes an outlier processing module, which detects outliers in the collected historical concrete parameter data; the outlier detection method includes an algorithm that combines K-Means++ clustering with an isolation forest.

The steps of the K-Means++ algorithm are as follows:

a) Initialize an empty set M to store the initial cluster centers;

b) Randomly select the first cluster center $\mu^{(j)}$ from the initial samples and assign it to the set M;

c) For each sample $x^{(i)}$ that does not belong to M, find its minimum squared distance $d(x^{(i)}, M)^2$ to the initial cluster centers in M;

d) Randomly select the next centroid $\mu^{(p)}$ according to the weighted probability distribution

$$\frac{d(\mu^{(p)}, M)^2}{\sum_i d(x^{(i)}, M)^2}$$

e) Repeat steps c and d until K cluster centers have been selected;

f) Starting from the set M, continue with the classic K-Means algorithm;

g) Select the K-Means model with the best performance according to the SSE, i.e. the residual sum of squares, to obtain the best cluster centers;

Using the K cluster centers obtained above, outliers in the data set are detected with the isolation forest algorithm.
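The seeding stage of the steps above (a to e) can be sketched in plain Python. This covers only the choice of the initial centers, not the subsequent K-Means iterations, the SSE-based model selection, or the isolation forest. The function names and sample points are illustrative, and the sketch assumes the data contain at least K distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans_pp_init(samples, k, seed=0):
    """Choose k initial cluster centers with K-Means++ seeding.

    Assumes samples contains at least k distinct points; otherwise all
    weights can become zero and random.choices raises ValueError.
    """
    rng = random.Random(seed)
    centers = [rng.choice(samples)]  # steps a and b: first center at random
    while len(centers) < k:
        # Step c: minimum squared distance from each sample to the chosen centers.
        d2 = [min(sq_dist(x, c) for c in centers) for x in samples]
        # Step d: draw the next center with probability proportional to d2.
        # Points already chosen have d2 == 0 and cannot be drawn again.
        centers.append(rng.choices(samples, weights=d2, k=1)[0])
    return centers


# Hypothetical 2-D points forming three well-separated groups.
points = [
    (0.0, 0.0), (0.1, 0.2),
    (10.0, 10.0), (10.2, 9.9),
    (20.0, 0.0), (19.8, 0.3),
]
centers = kmeans_pp_init(points, 3, seed=1)
```

Because distant points are more likely to be drawn, the initial centers tend to spread across the data, which usually reduces the number of K-Means iterations needed afterwards.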

The isolation forest algorithm is an outlier detection algorithm based on partitioning and ensemble learning. If it is applied directly, without prior cluster analysis, it involves a large amount of computation, a long running time, and a partitioning process that depends too heavily on manual choices. Clustering the data first with the K-Means++ algorithm greatly improves the detection efficiency of the isolation forest algorithm.

Beneficial effects:

(1) The concrete compressive strength prediction models are built with ensemble learning, and the ensemble models are then fused once more. This group-decision style of modeling not only improves prediction accuracy but also overcomes the drawbacks of traditional neural network models, which are difficult to train and require large amounts of data. In addition, the combined average of the results of several models is more reliable than the prediction of a single model;

(2) The HPC compressive strength prediction method based on statistical machine learning has a short test cycle, high accuracy, and low test cost, and is more feasible in engineering practice than the traditional empirical formula method and test method;

(3) The invention combines model fusion with the SHAP interpretability algorithm for concrete compressive strength prediction, overcoming the unpredictability and black-box nature of the traditional modeling process. The SHAP interpretability analysis also supports the development of concrete engineering by giving a more precise understanding of the specific influence of each component on the compressive strength;

(4) Outliers in the concrete parameter data are handled by combining K-Means++ clustering with an isolation forest, which overcomes the high computational complexity and the reliance on manual data partitioning that arise when the isolation forest method is used directly;

(5) The invention expands the data set by feature engineering, constructing new ratio features such as the water-cement ratio and the water-binder ratio, which helps to avoid model overfitting.

Description of drawings

Fig. 1 is a flowchart of the modeling process for the HPC compressive strength prediction model of the invention;

Fig. 2 is the first fitting-effect plot of the HPC compressive strength prediction model of the invention;

Fig. 3 is the second fitting-effect plot of the HPC compressive strength prediction model of the invention;

Fig. 4 is a schematic diagram of the model fusion process for the HPC compressive strength prediction model of the invention;

Fig. 5 is the first schematic diagram of a single-sample interpretation result of the SHAP model interpretability algorithm of the invention;

Fig. 6 is the second schematic diagram of a single-sample interpretation result of the SHAP model interpretability algorithm of the invention;

Fig. 7 is a schematic diagram of the interpretation results of the SHAP model interpretability algorithm of the invention over the entire data set.

Detailed description

The following specific embodiments explain the technical solutions of the claims of the invention so that those skilled in the art can understand the claims. The protection scope of the invention is not limited to the specific implementation structures below. Implementations made by those skilled in the art that include the technical solutions of the claims but differ from the following specific embodiments also fall within the protection scope of the invention.

As shown in Fig. 1, this embodiment includes the following steps:

步骤1、采集高性能混凝土相关参数数据在混凝土工厂、搅拌站等现场采集混凝土相关参数数据,所述混凝土相关参数数据包括但不限于水泥含量、粉煤灰含量、矿渣含量、减水剂含量、粗/细骨料含量、水含量、养护期、温度、坍落度,构成样本集

Figure BDA0003831971760000113
其中水泥含量、粉煤灰含量、矿渣含量、减水剂含量、粗/细骨料含量、水含量的单位均为(kg/m3),即每立方米混凝土中对应成分的质量;养护期的单位是day(天数)、温度单位是℃(摄氏度)、坍落度单位是mm(毫米),均可在配置混凝土时称重或测量得到;Step 1. Collect high-performance concrete related parameter data. Collect concrete related parameter data at concrete factories, mixing stations, etc., and the concrete related parameter data includes but not limited to cement content, fly ash content, slag content, water reducer content, Coarse/fine aggregate content, water content, curing period, temperature, slump, constitute a sample set
Figure BDA0003831971760000113
The units of cement content, fly ash content, slag content, superplasticizer content, coarse/fine aggregate content, and water content are (kg/m 3 ), that is, the mass of the corresponding components in each cubic meter of concrete; curing period The unit of temperature is day (number of days), the unit of temperature is ℃ (degree Celsius), and the unit of slump is mm (millimeter), which can be obtained by weighing or measuring when configuring concrete;

At the same time, the actual compressive strength corresponding to each sample is taken as the target variable

Figure BDA0003831971760000111

thereby constructing the experimental data set

Figure BDA0003831971760000112

Once the data set has been constructed, it is stored on a local disk or in a relational database;

Step 2. Exploratory data analysis and data cleaning

Visualization and statistical analysis are used for an initial exploration of the data set. Guided by the visual analysis, missing values, duplicate values, and outliers in the collected HPC parameter data are handled, and the data distribution is examined. Missing values are treated as follows: for a feature variable with few missing values (within 10%) whose extreme values do not differ greatly, missing values are filled with the mean; for a feature variable with few missing values whose extreme values differ greatly, they are filled with the median; for a feature variable whose missing ratio is roughly 10%-50%, they are filled by prediction with a decision-tree-based algorithm; a feature variable whose missing ratio exceeds 50% is dropped. Duplicate records in the data set are deleted;
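The missing-value rules of Step 2 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation: the function names are hypothetical, the decision of whether the extreme values "differ greatly" is assumed to be made by the analyst and passed in as a flag, and the decision-tree-based fill is only named, not implemented.

```python
from statistics import mean, median


def choose_fill_strategy(missing_ratio, extremes_differ_greatly):
    """Map a feature's missing ratio to one of the strategies named in Step 2."""
    if missing_ratio > 0.5:
        return "drop"    # more than 50% missing: remove the feature
    if missing_ratio > 0.1:
        return "tree"    # roughly 10%-50%: decision-tree-based prediction (not shown)
    # few missing values: median if the extremes differ greatly, else mean
    return "median" if extremes_differ_greatly else "mean"


def fill_missing(values, strategy):
    """Fill None entries of one column with the mean or median of observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError("only 'mean' and 'median' are implemented in this sketch")
    return [fill if v is None else v for v in values]


def drop_duplicates(rows):
    """Remove repeated records while keeping the first occurrence of each."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

The median branch matters when a column contains a few extreme values, since the mean would be pulled toward them.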

Step 3. Outlier handling and data transformation for the concrete data

Step 3.1. Data transformation

Feature variables whose extreme values differ greatly are normalized. In this embodiment the normalization uses the Robust Scaler method, which is robust to outliers. Its steps are as follows:

a) Compute the quantiles of the data to be processed (quantile levels as given in formula image BDA0003831971760000121), remove the median (the 1/2 quantile) from the data, and store the corresponding quantiles;

b) Compute the interquartile range (IQR), defined as the difference between the 3/4 quantile and the 1/4 quantile;

c) Scale the feature variables by the IQR so that they share a common scale;

Based on the data visualization results, a logarithmic transformation is applied to data sets or feature variables that do not satisfy the inductive bias of the current algorithm: 1 is added to each value of the feature variable and the logarithm is then taken. This brings the distribution of each feature variable closer to a normal distribution and prevents skewness in the data from adversely affecting the model's predictions;
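Steps a) to c) and the log transform can be illustrated with a short standard-library sketch. It assumes the usual robust-scaling form (subtract the median, divide by the IQR) and the "inclusive" quartile method of `statistics.quantiles`; the function names are illustrative and not taken from the patent.

```python
import math
from statistics import median, quantiles


def robust_scale(values):
    """Centre a column on its median and scale it by its interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")  # 1/4, 1/2, 3/4 quantiles
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; the column cannot be scaled this way")
    med = median(values)
    return [(v - med) / iqr for v in values]


def log1p_transform(values):
    """Add 1 to each value and take the natural logarithm, as described above."""
    return [math.log1p(v) for v in values]


# A single extreme value does not change the scale of the other points:
print(robust_scale([1, 2, 3, 4, 5]))    # [-1.0, -0.5, 0.0, 0.5, 1.0]
print(robust_scale([1, 2, 3, 4, 100]))  # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

Because the median and IQR depend only on the middle of the distribution, the first four points scale identically in both calls, which is the robustness property the text relies on.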

Step 3.2. Outlier handling

Outliers in the data set are handled with a combination of box plots and clustering plus isolation forest. The box plot is used to compare the data before and after outlier handling and to confirm the benefit of the treatment. This embodiment uses outlier detection based on the K-Means++ clustering algorithm plus isolation forest; by placing the initial centroids far away from one another, K-Means++ produces more consistent results than conventional K-Means.
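The remark about K-Means++ initialization can be made concrete with a small sketch of the seeding step only. It is an illustration under stated assumptions: the function names are hypothetical, the data must contain at least k distinct points, and neither the subsequent K-Means iterations nor the isolation forest stage is shown, since the text does not specify how the two are combined.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centroids with K-Means++ seeding.

    The first centroid is drawn uniformly; each later one is drawn with
    probability proportional to its squared distance from the nearest
    centroid chosen so far, which spreads the centroids apart.
    """
    rng = random.Random(seed)
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids
```

A point that is already a centroid has weight zero and cannot be drawn again, so the k centroids are always distinct points of the data set.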

Step 4. Feature engineering for the concrete data

Because the raw data set collected in Step 1 contains only the numerical content of each component of the HPC, with no explicit mix-proportion relationships, the data set is further expanded through feature construction, i.e. mathematical combinations of the numerical features of different components guided by engineering experience. For example, the ratio of the original feature "water content" to "cement content" gives the water-cement ratio, and the ratio of "water content" to the binder content (cement content + slag content + fly ash content, "binder" for short) gives the water-binder ratio. The features newly constructed through feature engineering are shown in Table 1 below:
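The two ratio features named in the text can be computed as in the sketch below. The dictionary keys are hypothetical column names chosen for illustration; the other constructed features of Table 1 are not reproduced in the text and are therefore not shown.

```python
def add_ratio_features(sample):
    """Return a copy of one sample (contents in kg/m^3) with two ratio features added."""
    binder = sample["cement"] + sample["slag"] + sample["fly_ash"]
    enriched = dict(sample)
    enriched["water_cement_ratio"] = sample["water"] / sample["cement"]
    enriched["water_binder_ratio"] = sample["water"] / binder
    return enriched


mix = {"cement": 300.0, "slag": 100.0, "fly_ash": 100.0, "water": 150.0}
print(add_ratio_features(mix)["water_cement_ratio"])  # 0.5
```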

Figure BDA0003831971760000131

Table 1

Step 5. Construction and evaluation of the concrete compressive strength prediction model

The data set obtained after the preprocessing and feature engineering above is split into a training set and a test set at a ratio of 8:2. Within the Boosting framework, a first prediction model based on the AdaBoost algorithm and a second prediction model based on the CatBoost algorithm are built. The performance of the first and second prediction models is then evaluated with regression evaluation metrics in combination with 5-fold cross-validation. The evaluation metrics used in this embodiment are defined as follows:

Figure BDA0003831971760000132

Figure BDA0003831971760000133

Figure BDA0003831971760000134

Figure BDA0003831971760000135

Figure BDA0003831971760000136

where:

N is the total number of concrete sample records collected;

i is the sample index, i.e. the ordinal number of the sample;

$\hat{y}_i$ is the compressive strength predicted by the model for the i-th sample, i.e. the predicted value;

$y_i$ is the actual compressive strength corresponding to the i-th sample, i.e. the observed value;
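The evaluation protocol of Step 5 (8:2 split, 5-fold cross-validation, error metric) can be sketched with the standard library alone. The five metric formulas are only available as images, so the sketch shows RMSE, the one metric the fusion step later names explicitly; the function names, the shuffling seed, and the way folds are formed are assumptions made for illustration.

```python
import math
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them 8:2 into train and test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted strengths."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)
```

In use, each base model would be fitted on the `train` indices of every fold and scored with `rmse` on the matching validation indices, with the held-out 20% kept for the final comparison.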

Step 6. Hyperparameter tuning of the concrete compressive strength prediction model

Based on the model performance evaluation above, Bayesian optimization is used to tune the hyperparameters of the first and second prediction models and further improve their performance. The key hyperparameters tuned include the number of trees in the base model and the depth of the trees in the ensemble model. The Bayesian optimization process takes the model error as the objective function and searches over parameter combinations for the one with the smallest error.

Step 7. Fusion of the first prediction model and the second prediction model

FIG. 2 shows the fit of the CatBoost-based second prediction model on the test set, and FIG. 3 shows the fit of the AdaBoost-based first prediction model on the test set. The figures show that the CatBoost-based second prediction model predicts compressive strength markedly better than the AdaBoost-based first prediction model;

To improve the reliability of the predictions, the outputs of multiple models are integrated by group decision-making, i.e. fused at the model decision level, which improves prediction accuracy. As shown in FIG. 4, this embodiment fuses the CatBoost model and the AdaBoost model with the weighted average method and assigns the CatBoost model the larger weight during fusion.

The weighted-average model fusion proceeds as follows:

The AdaBoost-based first prediction model and the CatBoost-based second prediction model serve as the base models h(x). The fusion model for HPC compressive strength prediction is denoted H(x) and expressed as:

$$H(x) = \sum_{i=1}^{T} w_i \, h_i(x)$$

where:

$w_i$ is the weight of the i-th base model; in this embodiment $w_i \ge 0$ and $\sum_{i=1}^{T} w_i = 1$;

$h_i(x)$ is the HPC compressive strength predicted by the i-th base model;

T is the number of base models to be fused; the base models in this embodiment are AdaBoost and CatBoost, so T = 2;

H(x) is the final predicted compressive strength of the concrete.
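As a small illustration of the weighted average, the sketch below fuses the predictions of T base models for one sample and checks the two weight constraints stated above. The function name and the numerical values are illustrative only.

```python
def fuse(predictions, weights):
    """Weighted average of base-model predictions for one sample.

    predictions: h_i(x) for each base model; weights: the matching w_i,
    which must be non-negative and sum to 1.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per base model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * h for w, h in zip(weights, predictions))


# Two base models predicting 40 MPa and 50 MPa, with the larger weight on the second:
fused = fuse([40.0, 50.0], [0.3, 0.7])  # approximately 47 MPa
```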

The fusion weights $w_i$ of the base models are determined as follows:

Using the RMSE, the loss function of the fusion model is defined as:

$$Loss = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$

where:

N is the sample size, i.e. the total number of concrete sample records collected;

$y_i$ is the actual compressive strength of the concrete for the i-th sample;

$\hat{y}_i$ is the compressive strength predicted for the i-th sample by the fusion model H(x) for HPC compressive strength prediction;

Therefore, the final optimization objective of the fusion model is defined as:

$$\text{minimize}\ Loss \quad \text{s.t.}\ w_1 + w_2 = 1 \ \text{and}\ w_1 \ge 0,\ w_2 \ge 0 \qquad (8)$$

where:

Loss is the loss function of the ensemble model H(x);

s.t. abbreviates "subject to" and introduces the constraints;

$w_1$ and $w_2$ are the fusion weights of the base models CatBoost and AdaBoost;

minimize denotes minimization;

Solving the constrained minimization problem in Eq. (8) yields the fusion weights w of the base models.
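The text does not say which solver is used for Eq. (8). For two base models the problem has a simple closed form, sketched below as one possible approach: substituting $w_2 = 1 - w_1$ turns the squared error into a one-dimensional convex quadratic in $w_1$, whose unconstrained minimizer is then clipped to [0, 1]. Minimizing the squared error is equivalent to minimizing the RMSE because the square root is monotonic. The function name and argument order are assumptions.

```python
def solve_fusion_weights(y, h1, h2):
    """Return (w1, w2) minimizing the RMSE of w1*h1 + w2*h2 against y,
    subject to w1 + w2 = 1 and w1, w2 >= 0.

    y: observed strengths; h1, h2: predictions of the two base models.
    """
    d = [a - b for a, b in zip(h1, h2)]       # h1 - h2 per sample
    denom = sum(v * v for v in d)
    if denom == 0:                            # identical predictions: any weights work
        return 0.5, 0.5
    # Fused prediction is h2 + w1 * d, so least squares gives:
    w1 = sum(di * (yi - bi) for di, yi, bi in zip(d, y, h2)) / denom
    w1 = min(1.0, max(0.0, w1))               # clip to the feasible interval
    return w1, 1.0 - w1
```

Clipping is sufficient here because the objective is convex in a single variable; with more than two base models a general constrained optimizer would be needed instead.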

Step 8. SHAP-based interpretability analysis of the concrete compressive strength prediction model

The model construction and evaluation, hyperparameter tuning, and model fusion described above yield a concrete compressive strength prediction model with good predictive ability. This embodiment further applies the SHAP model-interpretability algorithm to explain the model's predictions, in order to better understand how each feature value influences the compressive strength predicted by the model. This influence is expressed by the Shapley value, abbreviated Ψ and defined as follows:

$$\Psi_j = \sum_{S \subseteq \{x_1,\dots,x_p\} \setminus \{x_j\}} \frac{|S|!\,(p-|S|-1)!}{p!}\left(val_x(S \cup \{x_j\}) - val_x(S)\right)$$

where:

S is a subset of the features input to the model being explained, i.e. a set of concrete parameters input to the fusion model for HPC compressive strength prediction;

$x_j$ is the j-th feature variable of the sample being explained, i.e. the j-th concrete parameter;

p is the total number of features, i.e. the total number of concrete parameters;

$val_x(S)$ is the prediction of the model being explained for sample x when S is the set of input features, i.e. the compressive strength output by that model; x is the sample, its elements are denoted $x_i$, and $x_i$ is the value of the i-th feature variable;
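The Shapley value definition can be checked on a toy model by enumerating every feature subset, as in the sketch below. This is an exact but exponential-time illustration, not the SHAP library's algorithm. It assumes that a feature absent from S is replaced by a fixed baseline value, which is one common way to evaluate $val_x(S)$ and is not specified in the text; the function and argument names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of sample x for a callable model.

    model: function taking a list of p feature values and returning a number.
    baseline: values used for features that are absent from the subset S.
    """
    p = len(x)

    def val(subset):
        # Prediction when only the features in `subset` take their real values.
        return model([x[i] if i in subset else baseline[i] for i in range(p)])

    psi = []
    for j in range(p):
        others = [i for i in range(p) if i != j]
        total = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for combo in combinations(others, size):
                s = set(combo)
                total += weight * (val(s | {j}) - val(s))  # marginal contribution of j
        psi.append(total)
    return psi
```

Under this baseline convention the values sum to `model(x) - model(baseline)`, so the contributions account for the whole gap between the sample's prediction and the reference prediction.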

The SHAP value is the degree of importance of the j-th feature to the output of the model being explained, i.e. its marginal contribution, and the Shapley value is the mean of these marginal contributions. The explanation of the model being explained is defined as follows:

$$g(z') = \Psi_0 + \sum_{j=1}^{M} \Psi_j \, z'_j$$

where:

g is the model being explained, which in this embodiment is the fusion model H(x) for HPC compressive strength prediction obtained by fusing CatBoost and AdaBoost;

$z' \in \{0,1\}^M$ is the coalition vector indicating whether each feature $z_j$ ($j \in [1, M]$) is present; in this embodiment $z_j$ is the j-th concrete parameter input to the fusion model for HPC compressive strength prediction, and z' marks whether each of $z_1$ to $z_M$ is present in the parameter set input to the fusion model;

M is the number of coalition features, i.e. the number of input parameters of the model g being explained;

$\Psi_j$ is the feature-attribution Shapley value of feature j, i.e. the contribution of the j-th parameter to the prediction of the model g being explained, which in this embodiment is its contribution to the compressive strength;

$\Psi_0$ is the average prediction of the model g being explained, which in this embodiment is the mean of the compressive strength predictions output by the fusion model for HPC compressive strength prediction.

The Shapley value measures the contribution of a feature to the overall prediction. When $\Psi_j > 0$, the feature raises the predicted value, i.e. it has the effect of increasing the compressive strength.

The SHAP global feature importance of a feature is the average, over the samples, of the absolute values of its Shapley values, i.e.

Figure BDA0003831971760000173

Examples of the charts produced by the SHAP-based interpretability analysis in this embodiment are shown in FIGS. 5 to 7;
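Since the global-importance formula is only available as an image, the sketch below follows the verbal description: for each feature, average the absolute Shapley values over all explained samples. The function name and the matrix layout (one row per sample, one column per feature) are assumptions.

```python
def global_importance(shap_matrix):
    """Mean absolute Shapley value per feature.

    shap_matrix: list of rows, one per sample; row[j] is the Shapley value
    of feature j for that sample.
    """
    n = len(shap_matrix)
    p = len(shap_matrix[0])
    return [sum(abs(row[j]) for row in shap_matrix) / n for j in range(p)]


def rank_features(names, shap_matrix):
    """Feature names sorted from most to least important."""
    scores = global_importance(shap_matrix)
    return [name for _, name in sorted(zip(scores, names), reverse=True)]
```

Taking absolute values before averaging keeps a feature that pushes some predictions up and others down from cancelling to zero.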

FIG. 5 shows the local SHAP explanation for the first sample, and FIG. 6 shows the local SHAP explanation for the tenth sample. In the figures, the model's average prediction is 35.25 MPa; the predicted compressive strength is 76.17 MPa for the first sample and 38.73 MPa for the tenth sample. Based on the model predictions, the actual observations, and the value of each parameter, guidance can be provided for concrete mix-proportion design.

FIG. 7 is the SHAP global explanation summary plot. From top to bottom, the Y axis lists the features that affect compressive strength in decreasing order of contribution: curing period, water-cement ratio, and cement content are the three factors with the most significant influence on HPC compressive strength, followed by water, water-binder ratio, and so on. The X axis is the average impact of each factor on the output of the compressive strength prediction model. In the current experimental results, an increase in the curing period raises the compressive strength by 8 MPa on average.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and does not limit the implementation of the embodiments of the present application in any way.

The embodiments of the present application also provide an embodiment of the system, comprising a model training module and a model composite operation module;

The model training module trains a first model and a second model on collected historical HPC parameter data, to obtain a trained first prediction model and a trained second prediction model for predicting HPC compressive strength;

The model composite operation module performs a composite operation on the HPC compressive strengths output by the trained first and second prediction models to obtain a fusion model for HPC compressive strength prediction, and the fusion model outputs the final predicted HPC compressive strength.

In the model composite operation module, the weighted average method is used to combine the HPC compressive strengths output by the first prediction model and the second prediction model.

Another embodiment further includes a model interpretation and analysis module, which uses a first algorithm to interpret and analyze the fusion model for HPC compressive strength prediction and obtain the influence of the component content of each HPC parameter on HPC compressive strength.

Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (10)

1. A method for predicting HPC compressive strength based on model fusion, characterized by comprising the following steps:

S1. training a first prediction model and a second prediction model respectively on collected historical HPC parameter data, to obtain a trained first prediction model and a trained second prediction model for predicting HPC compressive strength;

S2. performing a composite operation on the trained first prediction model and second prediction model to obtain a fusion model for HPC compressive strength prediction, the fusion model outputting the final predicted HPC compressive strength.

2. The method for predicting HPC compressive strength based on model fusion according to claim 1, characterized in that step S2 is followed by step S3:

S3. using a first algorithm to interpret and analyze the fusion model for HPC compressive strength prediction, to obtain the contribution of the component content of each HPC parameter to the HPC compressive strength output by the fusion model.

3. The method for predicting HPC compressive strength based on model fusion according to claim 2, characterized in that the first algorithm is based on the SHAP interpretability algorithm.

4. The method for predicting HPC compressive strength based on model fusion according to claim 1, characterized in that, in step S2, obtaining the fusion model for HPC compressive strength prediction comprises performing the composite operation on the first prediction model and the second prediction model using a weighted average method.

5. The method for predicting HPC compressive strength based on model fusion according to claim 4, characterized in that the first prediction model predicts HPC compressive strength based on the AdaBoost algorithm, and the second prediction model predicts HPC compressive strength based on the CatBoost algorithm.

6. The method for predicting HPC compressive strength based on model fusion according to claim 5, characterized in that, when the weighted average method is used to perform the composite operation on the first prediction model and the second prediction model, the weight of the CatBoost-based second prediction model is greater than the weight of the AdaBoost-based first prediction model.

7. The method for predicting HPC compressive strength based on model fusion according to claim 6, characterized in that the weight of each prediction model in the weighted average method is calculated by solving the constrained minimization problem below, which yields the weights of the first prediction model and the second prediction model:

$$\text{minimize}\ Loss \quad \text{s.t.}\ w_1 + w_2 = 1 \ \text{and}\ w_1 \ge 0,\ w_2 \ge 0$$

where:

$w_1$ and $w_2$ are the weights of the first prediction model and the second prediction model, respectively;

Loss is the loss function of H(x) below;

$$H(x) = w_1 h_1(x) + w_2 h_2(x)$$

where:

H(x) is the final predicted HPC compressive strength output by the fusion model for HPC compressive strength prediction;

$w_1$, $w_2$ are the weights of the first prediction model and the second prediction model, respectively;

$h_1(x)$, $h_2(x)$ are the HPC compressive strengths predicted by the first prediction model and the second prediction model, respectively.

8. A system for predicting HPC compressive strength based on model fusion, using the method according to claim 1, characterized by comprising a model training module and a model composite operation module;

the model training module trains a first model and a second model on collected historical HPC parameter data, to obtain a trained first prediction model and a trained second prediction model for predicting HPC compressive strength;

the model composite operation module performs a composite operation on the trained first prediction model and second prediction model to obtain a fusion model for HPC compressive strength prediction, the fusion model outputting the final predicted HPC compressive strength.

9. The system for predicting HPC compressive strength based on model fusion according to claim 8, characterized by further comprising a model interpretation and analysis module, which uses a first algorithm to interpret and analyze the fusion model for HPC compressive strength prediction, to obtain the contribution of the component content of each HPC parameter to the HPC compressive strength output by the fusion model.

10. The system for predicting HPC compressive strength based on model fusion according to claim 8, characterized in that, in the model composite operation module, the weighted average method is used to combine the HPC compressive strengths output by the first prediction model and the second prediction model.
CN202211078389.2A 2022-09-05 2022-09-05 A HPC compressive strength prediction method and system based on model fusion Active CN115497574B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211078389.2A CN115497574B (en) 2022-09-05 2022-09-05 A HPC compressive strength prediction method and system based on model fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211078389.2A CN115497574B (en) 2022-09-05 2022-09-05 A HPC compressive strength prediction method and system based on model fusion

Publications (2)

Publication Number Publication Date
CN115497574A true CN115497574A (en) 2022-12-20
CN115497574B CN115497574B (en) 2025-05-27

Family

ID=84468246

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211078389.2A Active CN115497574B (en) 2022-09-05 2022-09-05 A HPC compressive strength prediction method and system based on model fusion

Country Status (1)

Country Link
CN (1) CN115497574B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100774301B1 (en) * 2007-02-13 2007-11-08 군산대학교산학협력단 Prediction Method of Concrete Compressive Strength
CN107133446A (en) * 2017-03-24 2017-09-05 广东工业大学 A kind of method for predicting super high-early concrete compression strength
CN109923264A (en) * 2016-09-14 2019-06-21 爱马特伦系统有限责任公司 Method for strengthening cementitious construction by high-speed extrusion printing and apparatus using the same
CN110163430A (en) * 2019-05-10 2019-08-23 东南大学 Concrete material Prediction of compressive strength method based on AdaBoost algorithm
CN112069567A (en) * 2020-08-07 2020-12-11 湖北交投十巫高速公路有限公司 Method for predicting compressive strength of concrete based on random forest and intelligent algorithm
CN114611775A (en) * 2022-03-03 2022-06-10 中国计量大学 A construction method of concrete 28-day compressive strength classification prediction model

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116092297A (en) * 2023-04-07 2023-05-09 南京航空航天大学 Edge calculation method and system for low-permeability distributed differential signal control
CN116092297B (en) * 2023-04-07 2023-06-27 南京航空航天大学 Edge calculation method and system for low-permeability distributed differential signal control
CN117420011A (en) * 2023-12-18 2024-01-19 南京建正建设工程质量检测有限责任公司 Concrete brick multipoint compressive strength detection system
CN117420011B (en) * 2023-12-18 2024-03-15 南京建正建设工程质量检测有限责任公司 Concrete brick multipoint compressive strength detection system
CN117763701A (en) * 2024-02-22 2024-03-26 四川省交通勘察设计研究院有限公司 method for predicting strength of steel-concrete connection transition surface of steel arch bridge and related products
CN117763701B (en) * 2024-02-22 2024-05-07 四川省交通勘察设计研究院有限公司 Method for predicting strength of steel-concrete connection transition surface of steel arch bridge and related products
CN119827318A (en) * 2025-01-02 2025-04-15 中电建路桥集团有限公司 Reinforced concrete strength detection method in marine environment

Also Published As

Publication number Publication date
CN115497574B (en) 2025-05-27

Similar Documents

Publication Publication Date Title
CN115497574A (en) A method and system for predicting compressive strength of HPC based on model fusion
CN104991051B (en) Method for predicting concrete strength based on hybrid model
CN114969953B (en) Optimal Design Method and Equipment for Underpassing Shield Tunnel Based on CatBoost-NSGA-Ⅲ
CN116448419A (en) Zero-sample bearing fault diagnosis method based on high-dimensional parameter multi-objective efficient optimization of deep model
CN119049595B (en) Density decision method for dense-medium separation guided by deep learning and physical model
CN116090696A (en) Landslide geological disaster risk classification prediction method suitable for mountain railway line
Bagheri et al. Formulation of mix design for 3D printing of geopolymers: A machine learning approach
CN110222387A (en) Multivariate drilling time series prediction method based on a hybrid leaky-integral CRJ network
CN104881707A (en) Sintering energy consumption prediction method based on integrated model
Ziolkowski Computational complexity and its influence on predictive capabilities of machine learning models for concrete mix design
Trinh et al. Enhancing Compressive strength prediction of Roller Compacted concrete using Machine learning techniques
CN114169594A (en) Gas concentration prediction method based on LSTM-LightGBM variable weight combined model
CN117935988A (en) Method for predicting compressive strength of recycled coarse aggregate concrete based on support vector regression
CN113919729A (en) Regional three-generation space influence and cooperation level evaluation method and system
Zhang et al. Developments and Applications of Neutrosophic Theory in Civil Engineering Fields: A Review.
CN102621953A (en) Automatic online quality monitoring and prediction model updating method for rubber hardness
Musleh et al. Comparative analysis of machine learning techniques for concrete compressive strength prediction
Gu et al. Ensemble learning soft sensor method of endpoint carbon content and temperature of BOF based on GCN embedding supervised ensemble clustering
CN110289098B (en) Risk prediction method based on clinical examination and medication intervention data
Dong et al. Application of Fully Connected Neural Network‐Based PyTorch in Concrete Compressive Strength Prediction
Li et al. Enhanced Prediction and Evaluation of Hydraulic Concrete Compressive Strength Using Multiple Soft Computing and Metaheuristic Optimization Algorithms
Turkey et al. Concrete compressive strength prediction using machine learning algorithms
CN107679020A (en) Method for directly predicting the first and final sea ice dates
CN115374570A (en) Multi-source weighted training set construction method for deformation prediction of engineering tunnel crossing
Sun et al. A soft-sensing model for predicting cement-specific surface area based on inception-residual-quasi-recurrent neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant