CN114492627A - A prediction method of shale brittleness index based on improved KNN algorithm - Google Patents
- Publication number
- CN114492627A (application number CN202210084514.4A)
- Authority
- CN
- China
- Prior art keywords
- training
- brittleness
- prediction
- brittleness index
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01V—GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
- G01V11/00—Prospecting or detecting by methods combining techniques covered by two or more of main groups G01V1/00 - G01V9/00
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01V—GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
- G01V20/00—Geomodelling in general
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- General Life Sciences & Earth Sciences (AREA)
- Geophysics (AREA)
- Geophysics And Detection Of Objects (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for predicting the shale brittleness index based on an improved KNN algorithm. The method comprises: determining a training well and test wells; performing a correlation analysis between each logging parameter in the training well and the brittleness index, and selecting the three logging parameters with the largest absolute correlation coefficients as independent variables, with the brittleness index as the dependent variable; constructing training samples and a training database from the training-well data; iteratively optimizing the training database with the KNN algorithm to obtain an optimal training database; using the data in the optimal training database as training data and the corresponding data from the test wells and the training well as test data, and obtaining the optimal K value by cross-validation; and building a KNN model from the optimal training database and the optimal K value to obtain the prediction model. The invention improves the accuracy and stability of model prediction.
Description
Technical Field
The invention relates to a method for predicting the shale brittleness index, and in particular to a shale brittleness index prediction method based on an improved KNN algorithm.
Background Art
To characterize the brittleness of shale reservoirs, researchers have built on existing rock brittleness evaluation methods and on exploration and development experience to propose many methods for evaluating the shale brittleness index. However, scholars in different fields, working toward different evaluation purposes, have proposed different definitions and formulas, and the industry still has no unified understanding of how shale brittleness should be defined or evaluated. This remains both a focus and a difficulty of shale research.
Morley et al. (1944) defined brittleness as the absence of plasticity. Ramsay (1967) held that brittleness means the destruction of cohesion within the rock, while Obert et al. (1967) proposed that brittleness is the property of a material to fracture at, or slightly above, the yield stress. Evans et al. (1990) defined deformation of less than 1% as brittle, more than 5% as ductile, and anything in between as brittle-ductile transition. George (1995) defined rock brittleness as the ability of rock to deform continuously without permanent deformation when a stress exceeding that required to generate microcracks is applied. Goktan and Gunes (2005) defined brittleness as the tendency to fracture without significant deformation under low stress. Holt (2011) held that brittleness has no specific definition: a considerable number of index parameters can be obtained from laboratory and oilfield data, and the most appropriate definition should be found by working closely with field conditions. Li Qinghui et al. (2012), summarizing the macroscopic and microscopic failure characteristics of shale, held that brittleness is a composite property of a material: the ability to develop internal non-uniform stress under its own natural heterogeneity and specific external loading conditions, leading to local failure and then to the formation of multi-dimensional fracture surfaces. Tarasov and Potvin (2013) considered brittleness to be the capacity of rock for self-sustaining macroscopic failure in the post-peak failure process, driven by elastic energy accumulated in the material under its natural heterogeneity and specific external loading conditions. Wang Yu et al. (2014) held that brittleness is a composite property of rock material controlled by both internal and external factors: the internal factor is the heterogeneity of the rock material, mainly the mineral grains, texture and structure of the rock, and the external factor is the ability of the rock, under specific loading conditions, to develop internal non-uniform stress that leads to local failure and then to multi-dimensional fracture surfaces. In rock mechanics, brittleness is the property of a body to rupture after very little deformation under load. In geology, a material is brittle if it shows little or no plastic deformation before fracture or failure. Although brittleness has no agreed definition, the current consensus is that a rock is highly brittle if it shows the following characteristics at failure: ① failure occurs at low strain; ② failure is dominated by fracturing; ③ the rock is composed of fine grains; ④ a high ratio of compressive to tensile strength; ⑤ high rebound energy; ⑥ a large internal friction angle; ⑦ fully developed cracks in hardness tests.
Evaluating the shale brittleness index from logging data is convenient, economical and practical. The two most widely used approaches are the rock-mechanics elastic parameter method and the brittle mineral method based on rock mineralogy. The brittle mineral method has practical advantages: mineral interpretations from logs, once calibrated against laboratory tests, yield a brittleness profile for the whole well section. However, considering mineral composition alone tends to ignore the influence of diagenesis. In addition, the brittle mineral components of a rock may differ under different geological conditions, and because each mineral has a different chemical composition, its petrophysical brittleness behavior also differs. The method therefore has limitations.
The rock-mechanics elastic parameter method can predict the brittleness index from logging and seismic data, but it lacks theoretical support, cannot reflect the influence of confining pressure or the microscopic characteristics of fracturing, and has difficulty characterizing the difference in brittleness between deep and shallow shale. Moreover, different calculation or inversion methods may give different elastic parameters, so the elastic parameter method lacks generality and compatibility.
Summary of the Invention
The object of the invention is to provide a shale brittleness index prediction method based on an improved KNN algorithm that solves the above problems and substantially improves the prediction accuracy of the KNN algorithm.
To achieve this object, the invention adopts the following technical solution: a shale brittleness index prediction method based on an improved KNN algorithm, comprising the following steps:
(1) Raw logging data are available for several wells. The raw logging data comprise logging parameters and brittleness indices at n different depths in each well; the logging parameters comprise neutron porosity CNL, shear-wave transit time DTS, potassium content K, acoustic transit time AC, natural gamma ray GR and organic matter content TOC. One well is selected as the training well and the rest serve as test wells;
(2) A correlation analysis is performed between each logging parameter in the training well and the brittleness index, and the results are compared. The three logging parameters with the largest absolute correlation coefficients are selected as independent variables, labeled independent variable A, independent variable B and independent variable C; the brittleness index is the dependent variable;
(3) Construct the training samples and the training database;
The three independent variables at the same depth in the training well form a training sample x_i = {x_Ai, x_Bi, x_Ci}, where i denotes the i-th depth, i = 1 to n, and x_Ai, x_Bi and x_Ci are the values of independent variables A, B and C at the i-th depth. The brittleness index at that depth, y_i, is the label of the training sample. All training samples together form the training database;
(4) Iteratively optimize the training database with the KNN algorithm to obtain the optimal training database; this comprises steps (41)-(48);
(41) Arrange all training samples into a matrix X = {X_A, X_B, X_C}, where X_A, X_B and X_C are the column vectors formed by all x_Ai, x_Bi and x_Ci respectively, and arrange all brittleness indices by depth into a column vector Y. Preset an iteration matrix X_t = {X_At, X_Bt, X_Ct} and a number of iterations M, with t = 1 to M; when t = 1, X_At = X_A, X_Bt = X_B and X_Ct = X_C;
(42) Normalize the elements of matrix X to obtain the normalized matrix X′;
(43) Normalize the elements of matrix X_t to obtain the normalized matrix X_t′;
(44) Compute the predicted values for matrix X_t′, comprising (a1)-(a5);
(a1) Compute the Euclidean distance between the first row of X_t′ and each row of X′, giving a set of Euclidean distances;
(a2) Sort all Euclidean distances in ascending order with the bubble sort algorithm to obtain a distance sequence;
(a3) Look up the labels of the training samples corresponding to the first K Euclidean distances in the sequence, giving K labels, and take the mean of the K labels as the predicted brittleness index for the first row of X_t′;
(a4) Apply (a1)-(a3) to every row of X_t′ to obtain the predicted brittleness index for each row;
(a5) Arrange all predicted brittleness indices by depth into a column vector Y_t, which is the predicted value for matrix X_t′;
(45) Compute the mean squared error MSE between Y_t and Y, the allowable absolute error E, and the prediction rate, rate;
(46) Preset an absolute error, retain the training samples in matrix X whose absolute error is ≤ E, and iteratively update X_t;
(47) Repeat steps (43)-(46) until the number of iterations is reached;
(48) Compare the MSE and prediction rate computed in each iteration, and select the iteration matrix of the iteration with a small MSE and a high prediction rate as the optimal training database;
(5) In the KNN algorithm, use the data in the optimal training database as training data and the corresponding data from the test wells and the training well as test data, and obtain the optimal K value by cross-validation;
(6) Using the KNN algorithm, build a KNN model from the optimal training database and the optimal K value to serve as the shale brittleness prediction model;
(7) Select a well to be predicted, take from its raw logging data the logging data corresponding to the optimal training database, input them into the shale brittleness prediction model, and output the predicted values.
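Steps (a1) to (a3) amount to K-nearest-neighbor regression for a single query row. The following is a minimal Python sketch of that idea, not the patent's implementation: the function name `knn_predict` and the toy values are illustrative assumptions, and Python's built-in `sorted` stands in for the bubble sort named in step (a2), which produces the same ascending order.

```python
import math

def knn_predict(train_rows, train_labels, query, k):
    """Predict a brittleness index as the mean label of the k nearest rows."""
    if not 1 <= k <= len(train_rows):
        raise ValueError("k must be between 1 and the number of training rows")
    # (a1) Euclidean distance from the query row to every training row.
    distances = [math.dist(row, query) for row in train_rows]
    # (a2) Sample indices in ascending order of distance.
    order = sorted(range(len(train_rows)), key=lambda i: distances[i])
    # (a3) Mean of the labels of the k nearest samples.
    nearest = order[:k]
    return sum(train_labels[i] for i in nearest) / k

# Toy, already-normalized samples (three logging parameters per row).
rows = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.1, 0.0, 0.0), (0.9, 1.0, 1.0)]
labels = [40.0, 60.0, 42.0, 58.0]
print(knn_predict(rows, labels, (0.0, 0.0, 0.0), k=2))
```

Step (a4) then corresponds to calling the function once per row of the query matrix.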
Preferably, in step (1), the training well is selected by comparing the amount of brittleness index data, the brittleness distribution and the proportion of high brittleness indices for each well, and choosing the well with a large amount of brittleness index data, a uniform brittleness distribution and a large proportion of high brittleness indices.
Preferably, in step (2), the outliers comprise zero values, negative values and abnormally large values.
Preferably, in step (31), normalization is performed with the following formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where x is an element of the vector, x′ is the element after normalization, and x_max and x_min are the maximum and minimum values in that vector.
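The definitions of x, x′, x_max and x_min are consistent with ordinary min-max scaling, so the sketch below assumes that form. The function names and the handling of a constant column are assumptions made for illustration and do not come from the patent.

```python
def min_max_normalize(column):
    """Scale a list of numbers to [0, 1]: x' = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; mapping it to 0.0 is a choice
        # made for this sketch only.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def normalize_matrix(rows):
    """Normalize each column (one logging parameter) independently."""
    columns = [min_max_normalize(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

# Toy values: two parameters measured at three depths.
print(normalize_matrix([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]))
```

Normalizing column by column keeps parameters with large numeric ranges from dominating the Euclidean distance.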
Preferably, in step (45), the mean squared error MSE between Y_t and Y, the allowable absolute error E and the prediction rate, rate, are computed with the following formulas:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_{ti}\right)^2 \quad (1)$$
$$E = \left(BI_{\max} - BI_{\min}\right) \times 100\% \quad (2)$$
$$\mathrm{rate} = \frac{\mathrm{time}}{M} \times 100\% \quad (3)$$
In formula (1), y_i is the brittleness index at the i-th depth of the training well from step (3), and y_ti is the i-th element of the column vector Y_t;
In formula (2), BI_max and BI_min are the maximum and minimum values in Y_t for the current iteration;
In formula (3), M is the number of iterations and time is the number of accurate predictions. The criterion is: if MSE < E in the current iteration, the prediction in that iteration is considered accurate.
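The three quantities in step (45) can be sketched as below. Only formula (2) survives legibly in the source text; the sketch assumes that formula (1) is the standard mean squared error and that formula (3) is the share of iterations judged accurate (time divided by M), which is what the symbol definitions suggest. Function names and sample values are illustrative.

```python
def mean_squared_error(y_true, y_pred):
    """Assumed form of formula (1): mean of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def allowable_error(y_pred):
    """Formula (2) as printed: E = (BI_max - BI_min) x 100%, taken over Y_t."""
    return (max(y_pred) - min(y_pred)) * 1.0  # 100% written as the factor 1.0

def prediction_rate(mse_per_iteration, e_per_iteration):
    """Assumed form of formula (3): fraction of iterations with MSE < E."""
    hits = sum(1 for mse, e in zip(mse_per_iteration, e_per_iteration) if mse < e)
    return hits / len(mse_per_iteration)

y_true = [40.0, 55.0, 47.0]
y_pred = [41.0, 53.0, 47.0]
print(mean_squared_error(y_true, y_pred), allowable_error(y_pred))
print(prediction_rate([1.0, 20.0, 3.0], [15.0, 15.0, 15.0]))
```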
Compared with the prior art, the invention has the following advantages:
Correlation analysis identifies the logging parameters suited to the prediction model, and these form the training samples and the training database; the training samples are highly correlated with the brittleness index;
The KNN algorithm is improved: instead of taking the most frequent class among the K nearest points as the predicted class, as in conventional KNN, the mean of the K nearest points is taken as the predicted value. The resulting improved KNN algorithm is used to iteratively optimize the training database and obtain the optimal training database;
The optimal training database is then used as training data, the corresponding data from the test wells and the training well are used as test data, and the optimal K value is obtained by cross-validation;
Finally, the optimal training database and the optimal K value are fed into the KNN model for training, giving a KNN-based shale brittleness prediction model. The method improves the accuracy and stability of model prediction.
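The difference between conventional KNN and the modification described above lies only in how the K neighbor labels are combined. A short sketch, with hypothetical function names and toy labels, makes the contrast concrete:

```python
from collections import Counter

def majority_vote(neighbor_labels):
    """Conventional KNN classification: the most frequent class wins."""
    return Counter(neighbor_labels).most_common(1)[0][0]

def neighbor_mean(neighbor_labels):
    """The modification described above: average the K neighbor values."""
    return sum(neighbor_labels) / len(neighbor_labels)

print(majority_vote(["high", "low", "high"]))   # class label
print(neighbor_mean([40.0, 42.0, 47.0]))        # continuous brittleness index
```

Averaging lets the model return a continuous brittleness index rather than one of a fixed set of classes.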
Brief Description of the Drawings
Fig. 1 is a flowchart of the invention;
Fig. 2 shows the distribution of the original BI values of the training well with depth;
Fig. 3a is a cross-plot of neutron porosity CNL versus BI;
Fig. 3b is a cross-plot of shear-wave transit time DTS versus BI;
Fig. 3c is a cross-plot of potassium content K versus BI;
Fig. 4 shows the distribution with depth of the BI values predicted by feeding the optimal training database into the shale brittleness prediction model;
Fig. 5 is a bar chart of the mean squared error MSE against the number of iterations;
Fig. 6 is a bar chart of the prediction rate against the number of iterations;
Fig. 7 is a line chart of the mean squared error MSE against the K value;
Fig. 8 is a line chart of the prediction rate against the K value;
Fig. 9 shows the distribution of the original BI values of the test well with depth;
Fig. 10 shows the distribution of the predicted BI values of the test well with depth.
Detailed Description
The invention is further described below with reference to the drawings.
Example 1:
Referring to Figs. 1 to 3, a shale brittleness index prediction method based on an improved KNN algorithm comprises the following steps:
(1) Raw logging data are available for several wells. The raw logging data comprise logging parameters and brittleness indices at n different depths in each well; the logging parameters comprise neutron porosity CNL, shear-wave transit time DTS, potassium content K, acoustic transit time AC, natural gamma ray GR and organic matter content TOC. One well is selected as the training well and the rest serve as test wells;
(2) A correlation analysis is performed between each logging parameter in the training well and the brittleness index, and the results are compared. The three logging parameters with the largest absolute correlation coefficients are selected as independent variables, labeled independent variable A, independent variable B and independent variable C; the brittleness index is the dependent variable;
(3) Construct the training samples and the training database;
The three independent variables at the same depth in the training well form a training sample x_i = {x_Ai, x_Bi, x_Ci}, where i denotes the i-th depth, i = 1 to n, and x_Ai, x_Bi and x_Ci are the values of independent variables A, B and C at the i-th depth. The brittleness index at that depth, y_i, is the label of the training sample. All training samples together form the training database;
(4) Iteratively optimize the training database with the KNN algorithm to obtain the optimal training database; this comprises steps (41)-(48);
(41) Arrange all training samples into a matrix X = {X_A, X_B, X_C}, where X_A, X_B and X_C are the column vectors formed by all x_Ai, x_Bi and x_Ci respectively, and arrange all brittleness indices by depth into a column vector Y. Preset an iteration matrix X_t = {X_At, X_Bt, X_Ct} and a number of iterations M, with t = 1 to M; when t = 1, X_At = X_A, X_Bt = X_B and X_Ct = X_C;
(42) Normalize the elements of matrix X to obtain the normalized matrix X′;
(43) Normalize the elements of matrix X_t to obtain the normalized matrix X_t′;
(44) Compute the predicted values for matrix X_t′, comprising (a1)-(a5);
(a1) Compute the Euclidean distance between the first row of X_t′ and each row of X′, giving a set of Euclidean distances;
(a2) Sort all Euclidean distances in ascending order with the bubble sort algorithm to obtain a distance sequence;
(a3) Look up the labels of the training samples corresponding to the first K Euclidean distances in the sequence, giving K labels, and take the mean of the K labels as the predicted brittleness index for the first row of X_t′;
(a4) Apply (a1)-(a3) to every row of X_t′ to obtain the predicted brittleness index for each row;
(a5) Arrange all predicted brittleness indices by depth into a column vector Y_t, which is the predicted value for matrix X_t′;
(45) Compute the mean squared error MSE between Y_t and Y, the allowable absolute error E, and the prediction rate, rate;
(46) Preset an absolute error, retain the training samples in matrix X whose absolute error is ≤ E, and iteratively update X_t;
(47) Repeat steps (43)-(46) until the number of iterations is reached;
(48) Compare the MSE and prediction rate computed in each iteration, and select the iteration matrix of the iteration with a small MSE and a high prediction rate as the optimal training database;
(5) In the KNN algorithm, use the data in the optimal training database as training data and the corresponding data from the test wells and the training well as test data, and obtain the optimal K value by cross-validation;
(6) Using the KNN algorithm, build a KNN model from the optimal training database and the optimal K value to serve as the shale brittleness prediction model;
(7) Select a well to be predicted, take from its raw logging data the logging data corresponding to the optimal training database, input them into the shale brittleness prediction model, and output the predicted values.
In this example, the training well in step (1) is selected by comparing the amount of brittleness index data, the brittleness distribution and the proportion of high brittleness indices for each well, and choosing the well with a large amount of brittleness index data, a uniform brittleness distribution and a large proportion of high brittleness indices. Other selection methods may of course be used.
The outliers mentioned in step (2) comprise zero values, negative values and abnormally large values.
In step (31), normalization is performed with the following formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where x is an element of the vector, x′ is the element after normalization, and x_max and x_min are the maximum and minimum values in that vector.
In step (45), the mean squared error MSE between Y_t and Y, the allowable absolute error E and the prediction rate, rate, are computed with the following formulas:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_{ti}\right)^2 \quad (1)$$
$$E = \left(BI_{\max} - BI_{\min}\right) \times 100\% \quad (2)$$
$$\mathrm{rate} = \frac{\mathrm{time}}{M} \times 100\% \quad (3)$$
In formula (1), y_i is the brittleness index at the i-th depth of the training well from step (3), and y_ti is the i-th element of the column vector Y_t;
In formula (2), BI_max and BI_min are the maximum and minimum values in Y_t for the current iteration;
In formula (3), M is the number of iterations and time is the number of accurate predictions. The criterion is: if MSE < E in the current iteration, the prediction in that iteration is considered accurate.
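The iterative refinement of steps (41) to (48) can be sketched as a loop that predicts every sample, drops the samples whose absolute error exceeds a tolerance, and remembers the database from the iteration with the lowest MSE. This is one possible reading, under stated assumptions: the sketch uses the current database X_t as both the reference set and the query set in every iteration (a literal reading of step (a1) keeps the original X as the reference), the parameter `tolerance` stands in for the preset absolute error, and the best iteration is chosen by MSE alone. All names and values are hypothetical.

```python
import math

def normalize(rows):
    """Min-max scale each column of a row-major matrix to [0, 1]."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # assumption: a constant column maps to 0.0
        scaled_cols.append([(v - lo) / span for v in col])
    return [list(r) for r in zip(*scaled_cols)]

def knn_mean(ref_rows, ref_labels, query, k):
    """Mean label of the k reference rows nearest to the query."""
    order = sorted(range(len(ref_rows)),
                   key=lambda i: math.dist(ref_rows[i], query))
    nearest = order[:k]
    return sum(ref_labels[i] for i in nearest) / len(nearest)

def refine_database(X, Y, k, iterations, tolerance):
    """Return one (rows, labels, mse) snapshot per iteration."""
    rows, labels = list(X), list(Y)              # X_t for t = 1, step (41)
    history = []
    for _ in range(iterations):                  # step (47)
        scaled = normalize(rows)                 # step (43)
        preds = [knn_mean(scaled, labels, q, k) for q in scaled]  # step (44)
        errors = [abs(p - y) for p, y in zip(preds, labels)]
        mse = sum(e * e for e in errors) / len(errors)            # step (45)
        history.append((rows, labels, mse))
        keep = [i for i, e in enumerate(errors) if e <= tolerance]  # step (46)
        if not keep:
            break
        rows = [rows[i] for i in keep]
        labels = [labels[i] for i in keep]
    return history

def best_snapshot(history):
    """Step (48), simplified: pick the snapshot with the lowest MSE."""
    return min(history, key=lambda snap: snap[2])

# Toy data: the last sample is an outlier in both features and label.
X = [[0.0] * 3, [1.0] * 3, [2.5] * 3, [4.5] * 3, [20.0] * 3]
Y = [10.0, 12.0, 15.0, 19.0, 90.0]
rows, labels, mse = best_snapshot(refine_database(X, Y, k=2, iterations=3, tolerance=5.0))
print(len(rows), labels, mse)
```

In this toy run the outlier is dropped after the first pass, and the later snapshots have a much lower MSE than the first.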
Example 2:
Referring to Figs. 1 to 10, Example 1 is described in further detail below to better illustrate the solution of the invention.
For the selection of the training well in step (1), see Fig. 2, in which the abscissa is depth and the ordinate is the brittleness index BI. Fig. 2 shows that this well has many brittleness samples, a uniform brittleness distribution and a large proportion of high-brittleness samples, so its data are suitable as training data for the prediction model.
In step (2), the correlation analysis between each logging parameter in the training well and the brittleness index showed that three parameters, neutron porosity CNL, shear-wave transit time DTS and potassium content K, correlate strongly with the brittleness index and show a fairly consistent negative trend. All three can be measured by conventional logging, so the cost of prediction is low, and they were therefore chosen as the independent variables of the prediction model. The cross-plots of these three logging parameters against the brittleness index BI are shown in Figs. 3a to 3c; in all three plots the abscissa is the brittleness index BI, and the ordinates are, in order, neutron porosity CNL, shear-wave transit time DTS and potassium content K.
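The parameter selection described above can be sketched as ranking the logging curves by the absolute value of their correlation with BI. The patent does not name the correlation coefficient, so the sketch assumes Pearson's r; the function names and all numeric values are invented toy data, not measurements from the patent.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def top_correlated(logs, bi, count=3):
    """Rank logging parameters by |r| against BI and keep the top `count`."""
    scores = {name: pearson(values, bi) for name, values in logs.items()}
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return ranked[:count], scores

# Toy curves sampled at five depths (illustrative numbers only).
bi = [50.0, 45.0, 40.0, 35.0, 30.0]
logs = {
    "CNL": [10.0, 12.0, 14.0, 16.0, 18.0],
    "DTS": [100.0, 111.0, 119.0, 131.0, 139.0],
    "K":   [1.0, 1.3, 1.4, 1.9, 2.0],
    "AC":  [60.0, 62.0, 59.0, 61.0, 60.0],
    "GR":  [80.0, 60.0, 90.0, 70.0, 85.0],
    "TOC": [2.0, 1.0, 3.0, 2.0, 2.5],
}
selected, scores = top_correlated(logs, bi)
print(selected)
```

Ranking by absolute value matters here because the selected parameters correlate negatively with BI.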
Referring to Fig. 4, the optimal training database obtained by the invention was fed into the shale brittleness prediction model. The resulting distribution of predicted BI values with depth is shown in Fig. 4, where the abscissa is depth and the ordinate is the brittleness index BI. Compared with the original brittleness index in Fig. 2, the prediction has a mean absolute error MAE = 0.48, an error rate of 1.2% relative to the curve in Fig. 2, and a prediction accuracy of 98%. The prediction of the shale brittleness index within the well is excellent.
Referring to Figs. 5 and 6, to find the best number of optimization iterations, that is, the optimal training database, the iteration matrix constructed in step (41) was fed into the KNN algorithm with K = 4. The mean squared error MSE and the prediction accuracy rate of the predictions reflect the effect of each optimization iteration. The results are shown in the clustered bar charts of Figs. 5 and 6, where the abscissa is the number of optimization iterations and 0 means no optimization. With 2 or 3 optimization iterations, MSE reaches its minimum and rate reaches its maximum, indicating strong model stability and high prediction accuracy; the database at this point is the optimal database.
Referring to Figs. 7 and 8: to find the best value of K, the data in the optimal training database are input into the KNN algorithm, and the mean squared error (MSE) and the prediction accuracy (rate) of the predictions are used to assess the model at different values of K. The results are shown in the line charts of Figs. 7 and 8. For K between 5 and 8 the MSE reaches its minimum and the rate reaches its maximum; the model's stability and prediction accuracy are best in this range, so these are the optimal values of K.
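A K sweep of this kind can be sketched as below. Note that `k` here is the number of neighbours, not the potassium log. The code uses a plain unweighted KNN regressor scored by leave-one-out MSE; it does not reproduce the patent's improved KNN or its iterative database optimization, and the function names and data are hypothetical.

```python
import math

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training samples nearest to query (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k

def loo_mse(xs, ys, k):
    """Leave-one-out mean squared error of the KNN regressor for a given k."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (knn_predict(rest_x, rest_y, xs[i], k) - ys[i]) ** 2
    return total / len(xs)

def sweep_k(xs, ys, k_values):
    """Score every candidate k and return the one with the lowest MSE."""
    scores = {k: loo_mse(xs, ys, k) for k in k_values}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Made-up, already scaled feature vectors and BI targets, for illustration only.
xs = [(0.1 * i, 1.0 - 0.1 * i) for i in range(10)]
ys = [50.0 - 2.0 * i for i in range(10)]
best_k, scores = sweep_k(xs, ys, k_values=range(1, 6))
```

In practice the logs would be scaled to comparable ranges before computing distances, since CNL, DTS, and K are measured in different units.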
To verify the effect of the present invention, refer to Figs. 9 and 10. We select a target well (a well to be predicted) and extract from its raw logging data the logs that correspond to the optimal training database. The original BI-versus-depth profile for these logs is known and is shown in Fig. 9. Feeding the logging data into the shale brittleness prediction model of the present invention gives the predicted BI distribution with depth shown in Fig. 10. Compared with the original brittleness index of Fig. 9, the mean absolute error of the prediction is MAE = 1.78, the error rate relative to the curve of Fig. 9 is 4.5%, and the prediction accuracy is 85%. The inter-well prediction of the shale brittleness index is therefore good.
It is worth noting that the training wells, the test wells, and the target wells are all located in the same work area.
The above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210084514.4A CN114492627B (en) | 2022-01-25 | 2022-01-25 | Shale brittleness index prediction method based on improved KNN algorithm |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210084514.4A CN114492627B (en) | 2022-01-25 | 2022-01-25 | Shale brittleness index prediction method based on improved KNN algorithm |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114492627A | 2022-05-13 |
| CN114492627B | 2023-04-21 |
Family
ID=81475138
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210084514.4A Active CN114492627B (en) | 2022-01-25 | 2022-01-25 | Shale brittleness index prediction method based on improved KNN algorithm |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114492627B (en) |
Citations (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20060230006A1 (en) * | 2003-01-15 | 2006-10-12 | Massimo Buscema | System and method for optimization of a database for the training and testing of prediction algorithms |
| CN105221141A (en) * | 2014-06-23 | 2016-01-06 | 中国石油化工股份有限公司 | A kind of mud shale brittleness index Forecasting Methodology |
| CN106597544A (en) * | 2016-11-25 | 2017-04-26 | 中国石油天然气股份有限公司 | Method and device for predicting brittleness of compact oil and gas reservoir |
| CN108009705A (en) * | 2017-11-07 | 2018-05-08 | 中国石油大学(华东) | A kind of shale reservoir compressibility evaluation method based on support vector machines technology |
| CN108665109A (en) * | 2018-05-15 | 2018-10-16 | 中国地质大学(北京) | A kind of reservoir parameter log interpretation method based on recurrence committee machine |
| CN109919184A (en) * | 2019-01-28 | 2019-06-21 | 中国石油大学(北京) | An intelligent identification method and system for multi-well complex lithology based on logging data |
| KR20200013146A (en) * | 2018-07-17 | 2020-02-06 | 한국전력공사 | Method for optimizing predictive algorithm based empirical model |
| CN111027882A (en) * | 2019-12-18 | 2020-04-17 | 延安大学 | A method for evaluating brittleness index using conventional logging data based on high-order neural network |
| CN111694071A (en) * | 2020-06-17 | 2020-09-22 | 陕西延长石油(集团)有限责任公司 | Continental facies shale brittleness index evaluation method |
| CN112304754A (en) * | 2020-10-11 | 2021-02-02 | 中国石油天然气股份有限公司大港油田分公司 | Shale brittleness logging evaluation method considering diagenesis and pressure change |
| CN112578475A (en) * | 2020-11-23 | 2021-03-30 | 中海石油(中国)有限公司 | Compact reservoir dual-dessert identification method based on data mining |
| CN112987125A (en) * | 2021-02-22 | 2021-06-18 | 中国地质大学(北京) | Shale brittleness index prediction method based on logging data |
| CN113076700A (en) * | 2021-04-27 | 2021-07-06 | 昆明理工大学 | SVM-LDA rock burst machine learning prediction model method based on data analysis principle |
| CN113919219A (en) * | 2021-10-08 | 2022-01-11 | 西安石油大学 | Stratum evaluation method and system based on logging big data |
Non-Patent Citations (5)
| Title |
|---|
| DELIANG SUN et al.: "Investigating the Applications of Machine Learning Techniques to Predict the Rock Brittleness Index" * |
| HAI XU et al.: "Supervised Machine Learning Techniques to the Prediction of Tunnel Boring Machine Penetration Rate" * |
| 王伟明 et al.: "致密砂岩储层岩石脆性评价及相关因素分析" * |
| 王振: "页岩气储层岩石力学参数及脆性测井评价" * |
| 袁思乔 et al.: "利用多测井参数预测致密砂岩脆性指数方法研究" * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114492627B (en) | 2023-04-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112901137B (en) | Deep well drilling ROP prediction method based on deep neural network Sequential model | |
| CN109209321B (en) | Fracturing potential-based horizontal well fracturing design method and device to be fractured | |
| CN107784191A (en) | Anisotropic rock joint peak shear strength Forecasting Methodology based on neural network model | |
| CN107728231A (en) | One kind prediction nuclear magnetic resonance log T2 T2The method of distribution | |
| CN104047598A (en) | Heterogeneous paleo-karst carbonate reservoir productivity prediction method | |
| CN114943125A (en) | Intelligent inversion analysis method for tunnel surrounding rock parameters based on XGboost optimization algorithm | |
| CN112282742B (en) | Prediction method for shale oil high-quality reservoir | |
| CN119740619B (en) | Modeling method and prediction method of reservoir rock fracture network expansion prediction model based on neural network | |
| CN114358434A (en) | Prediction method of drilling machine ROP based on LSTM recurrent neural network model | |
| CN112746835A (en) | Optimized deep shale gas geology dessert logging comprehensive evaluation method | |
| CN1945279A (en) | Identifying method for underground engineering surrounding rock category | |
| CN114329874A (en) | Stratum collapse and burst pressure uncertainty quantitative characterization method | |
| CN120087236A (en) | A spatial inversion prediction method and system for in-situ key rock mechanical parameters of reservoirs | |
| CN119538763B (en) | Machine learning-based tight sandstone hydraulic fracture layer penetration prediction method | |
| CN112365054A (en) | Comprehensive grading prediction method for deep well roadway surrounding rock | |
| CN114755744B (en) | Total organic carbon logging interpretation method and system based on shale heterogeneity characteristics | |
| CN111984928A (en) | Method for calculating organic carbon content of shale oil reservoir by logging information | |
| CN118938317B (en) | A method and system for intelligent assessment of landslide risk in tunnel surrounding areas | |
| CN115271367A (en) | Plateau tunnel surrounding rock classification method, device, equipment and storage medium | |
| CN113189647B (en) | A method for predicting the brittleness index of transversely isotropic shale formations | |
| CN117686309B (en) | Rock mass property-based rock stratum maximum horizontal principal stress prediction method | |
| CN114492627B (en) | Shale brittleness index prediction method based on improved KNN algorithm | |
| CN115712827A (en) | Well logging parameter interpretation method under carbonate rock physical phase constraint based on deep learning | |
| CN118761968A (en) | A high-precision fracture identification method based on well logging images under lithology constraints | |
| CN109376375B (en) | A fracturing position design method and device for a horizontal well to be fractured |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |