
CN119474881B - A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model - Google Patents

A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model

Info

Publication number
CN119474881B
Authority
CN
China
Prior art keywords
rainfall
seasonal
model
machine learning
day
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510037939.3A
Other languages
Chinese (zh)
Other versions
CN119474881A (en)
Inventor
邹锐
容思亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yingteliwei Environmental Technology Co ltd
Original Assignee
Beijing Yingteliwei Environmental Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yingteliwei Environmental Technology Co ltd
Priority to CN202510037939.3A
Publication of CN119474881A
Application granted
Publication of CN119474881B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01W METEOROLOGY
    • G01W1/00 Meteorology
    • G01W1/10 Devices for predicting weather conditions
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01W METEOROLOGY
    • G01W1/00 Meteorology
    • G01W1/14 Rainfall or precipitation gauges
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Environmental & Geological Engineering (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Atmospheric Sciences (AREA)
  • Software Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Environmental Sciences (AREA)
  • Ecology (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Hydrology & Water Resources (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a basin rainfall-runoff prediction method, device and program product that integrate a mechanism model and a machine learning model. A seasonal Markov chain model is constructed with the seasonal rainfall state transition probabilities as conditional transition probabilities and is used to simulate the rainfall state of each day in a preset simulation period; the daily rainfall amount is then simulated according to the distribution that rainfall follows in each seasonal characteristic time period, generating a rainfall sequence for the preset simulation period. Finally, a machine learning model is trained on a data set consisting of the runoff prediction input features and the corresponding simulated runoff values for the preset simulation period, and the trained machine learning model is used to predict future runoff. The invention offers high prediction efficiency and high prediction accuracy.

Description

Basin rainfall-runoff prediction method, device and program product integrating a mechanism model and a machine learning model
Technical Field
The invention relates to basin rainfall-runoff prediction technology, and in particular to a basin rainfall-runoff prediction method, device and program product that integrate a mechanism model and a machine learning model.
Background
Basin rainfall-runoff prediction refers to predicting the runoff that corresponds to a rainfall event in a basin. Traditional basin rainfall-runoff prediction relies primarily on mechanism models, such as the semi-distributed hydrological models HSPF (Hydrological Simulation Program - FORTRAN) and LSPC (Loading Simulation Program in C++). These models are based on physical process simulation and require a large number of complex numerical calculations, so their computation times are long and they struggle to meet the requirements of real-time prediction and quick response. The demand for computing resources is even more pronounced in large-scale or long-time-series simulations. In addition, when rapid scenario deduction is needed, running a mechanism model on the spot takes too long to meet decision-making requirements.
With the widespread use of machine learning, some researchers have begun to apply machine learning models to basin rainfall-runoff prediction, for example in patent documents 202210642420, 202011375784.8 and 202310968062.0. Although machine learning models have advantages in pattern recognition and nonlinear mapping, they must be trained on a large amount of high-quality data. In the field of hydrology and water environment, however, measured data are often limited, and data on extreme events are especially scarce; this limits the ability of a machine learning model to learn complex hydrological processes, reduces its generalization capability and lowers its prediction accuracy. Moreover, machine learning models are often regarded as "black boxes" that cannot explain physical processes, which makes their predictions difficult for hydrologists and decision makers to trust and reduces their reliability in practical applications. In addition, the input features of existing machine learning models are incomplete, so the models cannot capture the influence of factors other than rainfall on runoff generation, which further reduces prediction accuracy.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a basin rainfall-runoff prediction method, device and program product that integrate a mechanism model and a machine learning model and that offer high prediction efficiency and high prediction accuracy.
In order to achieve the above object, the present invention provides the following technical solutions:
A basin rainfall-runoff prediction method integrating a mechanism model and a machine learning model comprises the following steps:
(1) Acquiring historical rainfall data of a basin and preprocessing the data;
(2) Dividing a year into a plurality of seasonal characteristic time periods, and calculating, from the historical rainfall data, the seasonal rainfall state transition probability of each seasonal characteristic time period and the distribution that rainfall follows in each seasonal characteristic time period;
(3) Constructing a seasonal Markov chain model with the seasonal rainfall state transition probabilities as conditional transition probabilities, and simulating the rainfall state of each day in a preset simulation period with the seasonal Markov chain model;
(4) Simulating the daily rainfall amount according to the distribution that rainfall follows in each seasonal characteristic time period and the daily rainfall state in the preset simulation period, and combining the daily rainfall state in the rainfall state sequence with the corresponding rainfall amount to form a seasonal simulated rainfall sequence;
(5) Carrying out basin rainfall-runoff simulation with a basin mechanism model according to the seasonal simulated rainfall sequence to obtain the simulated runoff value of each day in the simulation period;
(6) For each day in the preset simulation period, extracting the accumulated rainfall over the preceding several days and the seasonal feature, and fusing them with the rainfall of that day to form the input features for runoff prediction;
(7) Training a machine learning ensemble model on a data set consisting of the runoff prediction input features and the corresponding simulated runoff values in the preset simulation period, wherein the machine learning ensemble model is a weighted ensemble of a plurality of machine learning sub-models;
(8) Inputting the runoff prediction input features of the target basin into the trained machine learning ensemble model to predict future runoff.
Further, the step (1) specifically includes:
(1-1) acquiring historical rainfall data of a river basin;
(1-2) identifying whether an abnormal rainfall value exists in the historical rainfall data, and if so, deleting the abnormal rainfall value;
(1-3) detecting whether the historical rainfall data has a missing value, if so, interpolating the missing value;
(1-4) detecting whether the historical rainfall data use a unified time format and unit, and if not, unifying them.
Further, the step (2) specifically includes:
(2-1) determining the daily rainfall state from the historical rainfall data according to the following formula:
$$S_t = \begin{cases} 1, & P_t \ge P_{\text{threshold}} \\ 0, & P_t < P_{\text{threshold}} \end{cases}$$
where $S_t$ represents the rainfall state on day $t$ in the historical rainfall data, rainfall state 1 represents rainfall, rainfall state 0 represents no rainfall, $P_{\text{threshold}}$ represents the rainfall threshold, and $P_t$ represents the rainfall amount on day $t$ in the historical rainfall data;
(2-2) dividing a year into a plurality of seasonal characteristic time periods, and counting the number of rainfall state transitions in each seasonal characteristic time period of the basin:
$$N_{ij}^{(s)} = \sum_{t \in T_s} I_{ij}(t),$$
$$I_{ij}(t) = \begin{cases} 1, & S_t = i \text{ and } S_{t+1} = j \\ 0, & \text{otherwise} \end{cases}$$
where $I_{ij}(t)$ is the rainfall state transition indicator variable, indicating whether day $t$ transitions from rainfall state $i$ to rainfall state $j$ ($I_{ij}(t) = 1$ means yes, $I_{ij}(t) = 0$ means no), $N_{ij}^{(s)}$ represents the number of transitions from rainfall state $i$ to rainfall state $j$ in seasonal characteristic time period $s$, and $T_s$ represents seasonal characteristic time period $s$;
(2-3) calculating the seasonal rainfall state transition probability of each seasonal characteristic time period from the numbers of rainfall state transitions:
$$P_{ij}^{(s)} = \frac{N_{ij}^{(s)}}{N_i^{(s)}},$$
where $P_{ij}^{(s)}$ represents the probability of a transition from rainfall state $i$ to rainfall state $j$ in seasonal characteristic time period $s$, $N_{ij}^{(s)}$ represents the number of such transitions in that period, and $N_i^{(s)}$ represents the number of times rainfall state $i$ occurs in seasonal characteristic time period $s$;
(2-4) for each seasonal characteristic time period, collecting the rainfall amounts of all days whose rainfall state is rainfall, and fitting these amounts to obtain the distribution that rainfall follows in that seasonal characteristic time period, the distribution being specifically a Gamma distribution.
Further, the step (3) specifically includes:
(3-1) constructing a seasonal Markov chain model by taking a set {0,1} of all possible rainfall states as a state space and a set of all seasonal rainfall state transition probabilities as a conditional probability set;
(3-2) randomly setting the simulated rainfall state of the first day, $\hat{S}_1$, according to the rainfall occurrence frequency corresponding to the first day of the preset simulation period;
(3-3) based on the seasonal Markov chain model, calculating the rainfall state of each following day from the rainfall state of the previous day according to the corresponding seasonal transition probability, thereby simulating the simulated rainfall state sequence $\{\hat{S}_\tau\}_{\tau=1}^{N}$ of the preset simulation period, where $\hat{S}_\tau$ represents the simulated rainfall state on day $\tau$ of the sequence and $N$ represents the total number of days in the preset simulation period.
Further, the step (4) specifically includes:
(4-1) extracting from the simulated rainfall state sequence the dates whose rainfall state is rainfall, and simulating the corresponding rainfall amounts according to the distribution that rainfall follows in the corresponding seasonal characteristic time period;
(4-2) setting a rainfall amount of a date in which the rainfall state is no rainfall to 0;
(4-3) combining the simulated rainfall state and the corresponding rainfall amount every day in the preset simulation period as a seasonal simulated rainfall sequence.
Further, the step (5) specifically includes:
(5-1) acquiring meteorological data, elevation data, soil parameters, land utilization data and vegetation coverage data of a river basin;
(5-2) inputting the seasonal simulated rainfall sequence, meteorological data, elevation data, soil parameters, land utilization data and vegetation coverage data into the basin mechanism model, and simulating the rainfall-runoff process in the basin to obtain the daily simulated runoff values;
(5-3) unifying the format of the simulated runoff values and outputting them as a runoff time series.
Further, the step (6) specifically includes:
(6-1) calculating, for each day in the preset simulation period, the accumulated rainfall over the preceding several days from the daily rainfall amounts;
(6-2) determining the seasonal feature of each day in the preset simulation period according to the seasonal characteristic time period in which the day falls;
(6-3) standardizing the daily rainfall, the accumulated rainfall and the seasonal features;
(6-4) concatenating the standardized rainfall, accumulated rainfall and seasonal features to form the input features for runoff prediction.
Further, the step (7) specifically includes:
(7-1) constructing a plurality of machine learning sub-models based on different machine learning algorithms;
(7-2) forming a data set from the runoff prediction input features and the corresponding simulated runoff values in the preset simulation period, and dividing it into a first training set, a second training set, a first validation set, a second validation set, a first test set and a second test set;
(7-3) training each machine learning sub-model on the first training set, optimizing the hyperparameters of each machine learning sub-model on the first validation set, and testing the performance of the machine learning sub-models on the first test set;
(7-4) building a machine learning ensemble model, which is a weighted ensemble of all the machine learning sub-models;
(7-5) training the machine learning ensemble model on the second training set, optimizing the weights in the machine learning ensemble model on the second validation set to obtain an optimized machine learning ensemble model, and testing the performance of the optimized machine learning ensemble model on the second test set.
A basin rainfall-runoff prediction device integrating a mechanism model and a machine learning model, comprising:
a historical rainfall data preprocessing module, used for acquiring historical rainfall data of the basin and preprocessing the data;
a transition probability and distribution calculation module, used for calculating, from the historical rainfall data, the seasonal rainfall state transition probability of each seasonal characteristic time period and the distribution that rainfall follows in each seasonal characteristic time period;
a rainfall state generation module, used for constructing a seasonal Markov chain model with the seasonal rainfall state transition probabilities as conditional transition probabilities, and simulating the rainfall state of each day in a preset simulation period with the seasonal Markov chain model;
a rainfall sequence generation module, used for simulating the daily rainfall amount according to the distribution that rainfall follows in each seasonal characteristic time period and the daily rainfall state in the preset simulation period, and combining the daily rainfall state in the preset simulation period with the corresponding rainfall amount to form a seasonal simulated rainfall sequence;
a runoff simulation module, used for carrying out basin rainfall-runoff simulation with a basin mechanism model according to the seasonal simulated rainfall sequence to obtain the simulated runoff value of each day in the simulation period;
a runoff prediction input feature extraction module, used for extracting, for each day in the preset simulation period, the accumulated rainfall over the preceding several days and the seasonal feature, and fusing them with the rainfall of that day to form the input features for runoff prediction;
a machine learning model ensemble module, used for training and optimizing a machine learning ensemble model on a data set consisting of the runoff prediction input features and the corresponding simulated runoff values in the preset simulation period, wherein the machine learning ensemble model is a weighted ensemble of a plurality of machine learning sub-models;
a future runoff prediction module, used for inputting the runoff prediction input features of the target basin into the trained machine learning ensemble model to predict future runoff.
A computer program product comprising computer programs/instructions which, when executed by a processor, implement the above-described method.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention combines a Markov chain and a Gamma distribution model with seasonal characteristics to generate a long seasonal simulated rainfall sequence, and trains the machine learning model on data derived from that sequence. This breaks through the limit imposed by the length of the historical record, enriches the training samples, enhances the adaptability of the model to different rainfall scenarios, overcomes the poor generalization caused by insufficient samples in the prior art, and improves prediction accuracy;
2. The method inputs the generated virtual rainfall data into a basin mechanism model, performs large-scale hydrological process simulation, and obtains the corresponding daily runoff data. A large amount of high-precision simulated runoff data is thus obtained, which provides sufficient samples for training the machine learning model, makes up for the shortage of measured runoff data, and further improves prediction accuracy;
3. The model input features include antecedent accumulated rainfall and seasonal features, so that multiple factors influencing runoff are considered. The richer features improve the sensitivity of the model to the factors that influence runoff, enhance its predictive capability, overcome the low accuracy caused by single-feature inputs in the prior art, and further improve prediction accuracy;
4. The invention trains a plurality of machine learning sub-models and optimizes the weight of each sub-model, thereby constructing a machine learning ensemble model that integrates the sub-models. This exploits the strengths of the different models, reduces the bias and variance of any single model, improves the accuracy and robustness of prediction, and overcomes the limited performance of a single model in the prior art;
5. By fusing the physical mechanism of the mechanism model with the nonlinear mapping capability of the machine learning model, the method yields predictions that are not only highly accurate but also physically meaningful to a certain extent, which enhances the credibility of the model;
6. The invention applies the trained model to actual rainfall-runoff prediction, providing real-time prediction capability, supporting quick response by decision-making departments, improving the efficiency of flood control early warning and water resource scheduling, and addressing the poor timeliness of prediction in the prior art.
Drawings
FIG. 1 is a schematic flow chart of a basin rainfall-runoff prediction method integrating a mechanism model and a machine learning model;
FIG. 2 is a schematic structural diagram of a basin rainfall-runoff prediction device integrating a mechanism model and a machine learning model.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Embodiment one.
This embodiment of the invention provides a basin rainfall-runoff prediction method integrating a mechanism model and a machine learning model, which, as shown in FIG. 1, comprises the following steps:
S101, acquiring historical rainfall data of a basin and preprocessing the data.
In specific implementation, the method specifically comprises the following steps:
S101-1, acquiring historical rainfall data of the basin, where the data sources include weather station observations, radar estimates and satellite remote sensing data. The record should be long enough to cover a variety of climatic conditions and rainfall patterns, so as to provide a comprehensive data basis for subsequent analysis;
S101-2, identifying whether abnormal rainfall values exist in the historical rainfall data, and if so, deleting them to ensure the accuracy of the data;
S101-3, detecting whether the historical rainfall data contain missing values, and if so, interpolating them, for example with a similarity interpolation method in which, based on the similarity among stations, the missing values are filled from the data of adjacent or similar stations after linear regression analysis, so as to ensure the continuity of the time series;
S101-4, detecting whether the historical rainfall data use a unified time format and unit, and if not, unifying them so that data from different sources are comparable.
It should be noted that, in other embodiments, the preprocessing may be performed in other manners.
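The preprocessing steps S101-2 to S101-4 can be illustrated with a minimal sketch. It is not the patent's implementation: the function names `to_mm` and `preprocess_rainfall`, the plausibility bound `max_valid_mm`, and the unit table are assumptions made for illustration, and simple linear interpolation in time is swapped in for the station-similarity interpolation described above.

```python
from typing import List, Optional

# Hypothetical unit table (assumption): conversion factors to millimetres.
_TO_MM = {"mm": 1.0, "cm": 10.0, "inch": 25.4}


def to_mm(value: float, unit: str) -> float:
    """Convert a rainfall amount to millimetres (unit unification, S101-4)."""
    return value * _TO_MM[unit]


def preprocess_rainfall(values_mm: List[Optional[float]],
                        max_valid_mm: float = 500.0) -> List[float]:
    """Clean a daily rainfall series given in mm.

    1. Values that are negative or above max_valid_mm are treated as
       abnormal and deleted (S101-2). The bound is an assumed example.
    2. Missing values (None) are filled by linear interpolation between the
       nearest valid neighbours in time; gaps at the edges take the nearest
       valid value (a simplified stand-in for S101-3).
    """
    cleaned = [v if v is not None and 0.0 <= v <= max_valid_mm else None
               for v in values_mm]
    valid_idx = [i for i, v in enumerate(cleaned) if v is not None]
    if not valid_idx:
        raise ValueError("series contains no valid rainfall values")

    filled = []
    for i, v in enumerate(cleaned):
        if v is not None:
            filled.append(v)
            continue
        left = max((j for j in valid_idx if j < i), default=None)
        right = min((j for j in valid_idx if j > i), default=None)
        if left is None:
            filled.append(cleaned[right])
        elif right is None:
            filled.append(cleaned[left])
        else:
            w = (i - left) / (right - left)
            filled.append(cleaned[left] * (1 - w) + cleaned[right] * w)
    return filled
```

For example, `preprocess_rainfall([0.0, None, 4.0, -1.0, 8.0])` drops the negative value and fills both gaps from their neighbours.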
S102, dividing a year into a plurality of seasonal characteristic time periods, and calculating seasonal rainfall state transition probability of each seasonal characteristic time period and rainfall obeying distribution of each seasonal characteristic time period according to the historical rainfall data.
In specific implementation, the method specifically comprises the following steps:
S102-1, determining the rainfall state of each day from the historical rainfall data according to the following formula:
$$S_t = \begin{cases} 1, & P_t \ge P_{\text{threshold}} \\ 0, & P_t < P_{\text{threshold}} \end{cases}$$
where $S_t$ represents the rainfall state on day $t$ in the historical rainfall data, rainfall state 1 represents rainfall, rainfall state 0 represents no rainfall, $P_{\text{threshold}}$ represents the rainfall threshold, typically set to 0.1 mm, and $P_t$ represents the rainfall amount on day $t$ in the historical rainfall data.
S102-2, dividing a year into a plurality of seasonal characteristic time periods, and counting the number of rainfall state transitions in each seasonal characteristic time period of the basin:
$$N_{ij}^{(s)} = \sum_{t \in T_s} I_{ij}(t), \qquad I_{ij}(t) = \begin{cases} 1, & S_t = i \text{ and } S_{t+1} = j \\ 0, & \text{otherwise} \end{cases}$$
where $I_{ij}(t)$ is the rainfall state transition indicator variable, indicating whether day $t$ transitions from rainfall state $i$ to rainfall state $j$ ($I_{ij}(t) = 1$ means yes, $I_{ij}(t) = 0$ means no), $N_{ij}^{(s)}$ represents the number of transitions from rainfall state $i$ to rainfall state $j$ in seasonal characteristic time period $s$, and $T_s$ represents seasonal characteristic time period $s$;
It should be noted that the seasonal characteristic time periods may be divided in various ways. In this embodiment, for example, a year is divided into the four seasons and each season is taken as one seasonal characteristic time period, giving 4 seasonal characteristic time periods and 4 sets of seasonal rainfall state transition probabilities. In other embodiments, a year may be divided by month, with each month taken as one seasonal characteristic time period, giving 12 seasonal characteristic time periods and 12 sets of seasonal rainfall state transition probabilities.
It should also be noted that the rainfall states may be defined in various ways. This embodiment uses two states, rainfall and no rainfall; in other embodiments, the rainfall state may be divided into no rain, light rain, moderate rain, heavy rain and so on.
S102-3, calculating the seasonal rainfall state transition probability of each seasonal characteristic time period from the numbers of rainfall state transitions:
$$P_{ij}^{(s)} = \frac{N_{ij}^{(s)}}{N_i^{(s)}}$$
where $P_{ij}^{(s)}$ represents the probability of a transition from rainfall state $i$ to rainfall state $j$ in seasonal characteristic time period $s$, $N_{ij}^{(s)}$ represents the number of such transitions in that period, and $N_i^{(s)}$ represents the number of times rainfall state $i$ occurs in seasonal characteristic time period $s$.
S102-4, for each seasonal characteristic time period, collecting the rainfall amounts of all days whose rainfall state is rainfall, and fitting these amounts to obtain the distribution that rainfall follows in that seasonal characteristic time period, the distribution being specifically a Gamma distribution.
Fitting the distribution that rainfall follows in a seasonal characteristic time period means fitting the shape parameter α and the scale parameter β of the Gamma distribution; common methods include moment estimation and maximum likelihood estimation. The resulting probability density function of the Gamma distribution with shape parameter α and scale parameter β is:
$$f(x;\alpha,\beta) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \quad x > 0,$$
$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\,dt,$$
where $f(x;\alpha,\beta)$ represents the probability density function of the Gamma distribution, $x$ represents the rainfall amount (on a rainy day), the shape parameter α controls the form of the rainfall distribution, the scale parameter β controls the range and mean of the rainfall, and $\Gamma(\alpha)$ represents the Gamma function. The exponents in $x^{\alpha-1}$ and $\beta^{\alpha}$ come directly from the definition of the Gamma distribution and describe how rainfall amounts are distributed.
Note that a rainfall distribution must be fitted for each rainfall state that carries rainfall. In this embodiment only the distribution for the rainfall state needs to be fitted; in other embodiments, if the rainfall state is divided into light rain, moderate rain, heavy rain and so on, the distribution for each rainfall level must be fitted separately.
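Steps S102-1 to S102-4 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function names are hypothetical, each transition is attributed to the season of its starting day, a day counts as rainy when its amount is at least the threshold, and the Gamma parameters are estimated by the method of moments (α = mean²/variance, β = variance/mean), which is one of the two estimation methods the text mentions.

```python
from collections import defaultdict
from statistics import mean, pvariance


def rain_states(rain_mm, threshold=0.1):
    """S102-1: state 1 (rainfall) if the amount reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in rain_mm]


def transition_probabilities(states, seasons):
    """S102-2 and S102-3: per-season transition probabilities.

    states[t] is the rainfall state of day t and seasons[t] its seasonal
    characteristic time period. The transition from day t to day t+1 is
    counted in the season of day t (an assumption of this sketch).
    Returns {season: {(i, j): probability}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for t in range(len(states) - 1):
        counts[seasons[t]][(states[t], states[t + 1])] += 1

    probs = {}
    for season, c in counts.items():
        probs[season] = {}
        for i in (0, 1):
            n_i = c[(i, 0)] + c[(i, 1)]  # transitions that start in state i
            for j in (0, 1):
                probs[season][(i, j)] = c[(i, j)] / n_i if n_i else 0.0
    return probs


def fit_gamma_moments(wet_amounts):
    """S102-4: method-of-moments estimate of Gamma (shape alpha, scale beta)."""
    m = mean(wet_amounts)
    v = pvariance(wet_amounts)
    if m <= 0 or v <= 0:
        raise ValueError("need positive, non-constant rainfall amounts")
    return m * m / v, v / m
```

In use, the wet-day amounts of each season would be passed to `fit_gamma_moments` separately, yielding one (α, β) pair per seasonal characteristic time period.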
S103, constructing a seasonal Markov chain model by taking the seasonal rainfall state transition probability as the conditional transition probability, and simulating the rainfall state of each day in a preset simulation period according to the seasonal Markov chain model.
In specific implementation, the method comprises the following steps:
S103-1, taking a set {0,1} of all possible rainfall states as a state space, taking a set of all seasonal rainfall state transition probabilities as a conditional probability set, and constructing a seasonal Markov chain model;
S103-2, randomly setting the simulated rainfall state of the first day, $\hat{S}_1$, according to the rainfall occurrence frequency corresponding to the first day of the preset simulation period. The rainfall occurrence frequency can be obtained from all rainfall states in the corresponding seasonal characteristic time period by dividing the number of days with rainfall by the total number of days;
S103-3, based on the seasonal Markov chain model, calculating the rainfall state of each following day from the rainfall state of the previous day according to the corresponding seasonal transition probability, thereby simulating the simulated rainfall state sequence $\{\hat{S}_\tau\}_{\tau=1}^{N}$ of the preset simulation period, where $\hat{S}_\tau$ represents the simulated rainfall state on day $\tau$ of the sequence and $N$ represents the total number of days in the preset simulation period. For example, if the preset simulation period is 2 years, the total number of days is 2×365.
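The state simulation in S103-2 and S103-3 can be sketched as below. The names `simulate_states` and `quarter_season`, the dictionary layout of the transition probabilities, and the choice to look up the transition probability by the season of the previous day are all assumptions made for illustration; `quarter_season` is a deliberately crude calendar that is not aligned with real seasons.

```python
import random


def quarter_season(day):
    """Hypothetical calendar helper: splits a 365-day year into four blocks
    of about 92 days, labelled 0..3 (not aligned with real seasons)."""
    return (day % 365) // 92


def simulate_states(n_days, season_of_day, trans_prob, wet_freq, rng):
    """Simulate a daily wet (1) / dry (0) sequence with a seasonal
    two-state Markov chain.

    trans_prob[season][(i, j)] is the probability of moving from state i
    to state j; wet_freq[season] is the fraction of rainy days, used only
    to draw the state of the first day (S103-2).
    """
    if n_days <= 0:
        return []
    first_season = season_of_day(0)
    states = [1 if rng.random() < wet_freq[first_season] else 0]
    for day in range(1, n_days):
        prev = states[-1]
        # Assumption: the transition is governed by the season of the
        # previous day, i.e. the day the transition starts from.
        season = season_of_day(day - 1)
        p_wet = trans_prob[season][(prev, 1)]
        states.append(1 if rng.random() < p_wet else 0)
    return states
```

A two-year run would be `simulate_states(2 * 365, quarter_season, trans_prob, wet_freq, random.Random(42))`; passing a seeded `random.Random` keeps the simulated sequence reproducible.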
S104, simulating the daily rainfall amount according to the distribution that rainfall follows in each seasonal characteristic time period and the daily rainfall state in the preset simulation period, and combining the daily rainfall state in the rainfall state sequence with the corresponding rainfall amount to form a seasonal simulated rainfall sequence.
In specific implementation, the method comprises the following steps:
S104-1, extracting from the simulated rainfall state sequence the dates whose rainfall state is rainfall ($\hat{S}_\tau = 1$), and simulating the corresponding rainfall amounts according to the distribution that rainfall follows in the corresponding seasonal characteristic time period;
S104-2, setting the rainfall amount of the dates whose rainfall state is no rainfall ($\hat{S}_\tau = 0$) to 0;
S104-3, combining the simulated rainfall state and the corresponding rainfall amount every day in a preset simulation period to be used as a seasonal simulated rainfall sequence.
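Steps S104-1 to S104-3 can be sketched with the standard library's `random.gammavariate(alpha, beta)`, whose `beta` argument is a scale parameter and therefore matches the (α, β) parameterization used above. The function name `simulate_rainfall` and the layout of `gamma_params` are assumptions made for illustration.

```python
import random


def simulate_rainfall(states, seasons, gamma_params, rng):
    """Attach a rainfall amount (mm) to every simulated day.

    states[d] is the simulated rainfall state of day d, seasons[d] its
    seasonal characteristic time period, and gamma_params[season] the
    fitted (alpha, beta) pair, i.e. (shape, scale), for that period.
    Returns a list of (state, amount) pairs, the seasonal simulated
    rainfall sequence.
    """
    sequence = []
    for state, season in zip(states, seasons):
        if state == 1:
            alpha, beta = gamma_params[season]
            amount = rng.gammavariate(alpha, beta)  # S104-1: draw wet-day amount
        else:
            amount = 0.0                            # S104-2: dry day
        sequence.append((state, amount))            # S104-3: combine
    return sequence
```

Because the Gamma distribution has support on all positive values, a drawn amount can occasionally fall below the wet-day threshold; how to treat such draws is not specified in the text and is left open here.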
S105, carrying out basin rainfall-runoff simulation with a basin mechanism model according to the seasonal simulated rainfall sequence to obtain the simulated runoff value of each day in the simulation period.
In specific implementation, the method comprises the following steps:
S105-1, acquiring meteorological data, elevation data, soil parameters, land use data and vegetation cover data of the river basin, wherein the meteorological data include temperature, wind speed and the like; the soil parameters include infiltration rate, saturated hydraulic conductivity, field capacity and the like, and reflect the hydrological characteristics of the soil; and the vegetation cover data include vegetation type, canopy interception and the like, which affect the interception and evapotranspiration of rainfall;
S105-2, inputting the seasonal simulated rainfall sequence, meteorological data, elevation data, soil parameters, land use data and vegetation cover data into the river basin mechanism model, and simulating the rainfall-runoff process in the river basin, including rainfall interception, infiltration, evaporation, runoff and the like, to obtain the simulated daily runoff, wherein any suitable conventional mechanism model may be selected as the river basin mechanism model;
S105-3, unifying the format of the simulated runoff values and outputting them as a runoff time series.
Compared with measured data, the simulated rainfall sequence and the corresponding simulated runoff provide a large volume of data, which overcomes the insufficient generalization ability of a pure machine learning model caused by limited training data. Training the machine learning model on these data improves its ability to learn the complex hydrological process.
S106, extracting, for each day in the preset simulation period, the accumulated rainfall of the preceding several days and the seasonal feature, and fusing them with the rainfall of the current day into the input features for runoff prediction.
In specific implementation, the method comprises the following steps:
S106-1, calculating, from the daily rainfall in the preset simulation period, the accumulated rainfall over the days preceding each day, wherein several time windows may be used; for example, the accumulated rainfall of the preceding 1, 3 and 5 days may be used to reflect the influence of short-term rainfall, that of the preceding 7 and 15 days may be added to reflect the influence of medium-term rainfall, and that of the preceding 30 and 60 days may be added to reflect the influence of long-term rainfall;
S106-2, determining the seasonal feature of each day in the preset simulation period according to the seasonal characteristic time period to which it belongs; for example, if a date in the preset simulation period is November 11, its seasonal feature is November when the seasonal characteristic time periods are divided by month, and winter when they are divided by season;
S106-3, standardizing the daily rainfall, the accumulated rainfall and the seasonal features, for example by encoding and by scaling to a uniform range;
S106-4, concatenating the standardized rainfall, accumulated rainfall and seasonal features to form the input features for runoff prediction.
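The feature construction of S106-1 to S106-4 may be illustrated, under stated assumptions, by the sketch below. The function name `build_features`, the one-hot month encoding and the truncation of windows at the start of the series are assumptions of this sketch; numeric scaling (S106-3) is deliberately omitted for brevity.

```python
def build_features(rain, months, windows=(1, 3, 5, 7, 15, 30, 60)):
    """Build one feature row per day: same-day rainfall, accumulated
    rainfall over each preceding window, and a one-hot month code."""
    features = []
    for t in range(len(rain)):
        row = [rain[t]]
        for w in windows:
            start = max(0, t - w)            # truncate at the series start
            row.append(sum(rain[start:t]))   # rainfall of the w days before day t
        # Seasonal feature: one-hot encoding of the month (1..12).
        row.extend(1.0 if months[t] == m else 0.0 for m in range(1, 13))
        features.append(row)
    return features

# Small hypothetical example with two windows.
rain = [1.0, 2.0, 3.0, 4.0]
months = [1, 1, 1, 2]
features = build_features(rain, months, windows=(1, 3))
```

In this example the row for the fourth day holds the same-day rainfall 4.0, the preceding 1-day total 3.0 and the preceding 3-day total 6.0, followed by the month code for February.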
The input features for runoff prediction take into account multiple factors affecting runoff and capture the influence of season and antecedent rainfall on runoff, which improves the adaptability of the model to different rainfall processes and its prediction accuracy.
S107, training a machine learning model using, as a data set, the input features of all runoff predictions in the preset simulation period and the corresponding simulated runoff values.
In specific implementation, the method comprises the following steps:
S107-1, constructing a plurality of machine learning sub-models based on different machine learning algorithms;
wherein the machine learning sub-models are the following models:
convolutional neural network (CNN), which is adept at processing spatiotemporal data and capturing local features and temporal patterns;
XGBoost, a tree model based on gradient boosting with efficient feature processing capability;
LightGBM, an improved gradient boosting framework with fast training and excellent performance;
random forest, which improves the generalization ability of the model by integrating a plurality of decision trees.
The machine learning sub-models may also be other types of models.
S107-2, forming a data set from the input features of all runoff predictions in the preset simulation period and the corresponding simulated runoff values, and dividing the data set into a first training set, a second training set, a first verification set, a second verification set, a first test set and a second test set.
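One possible way to obtain the six subsets is sketched below. The text does not specify how the data set is divided, so the halving of the data, the 70/15/15 percentages and the function name `split_six` are all assumptions of this illustration.

```python
def split_six(samples, percents=(70, 15, 15)):
    """Split samples into first/second training, verification and test sets.

    The first half feeds the sub-models and the second half feeds the set
    model; both the halving and the percentages are hypothetical choices.
    """
    half = len(samples) // 2

    def three_way(part):
        n_train = len(part) * percents[0] // 100
        n_val = len(part) * percents[1] // 100
        return (part[:n_train],
                part[n_train:n_train + n_val],
                part[n_train + n_val:])

    train1, val1, test1 = three_way(samples[:half])
    train2, val2, test2 = three_way(samples[half:])
    return {"train1": train1, "val1": val1, "test1": test1,
            "train2": train2, "val2": val2, "test2": test2}

# Hypothetical example with 100 sample indices.
subsets = split_six(list(range(100)))
```

Keeping the subsets for the set model disjoint from those for the sub-models is what allows the combination weights to be learned on data the sub-models have not seen.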
S107-3, training each machine learning sub-model with the first training set, optimizing the hyperparameters of each machine learning sub-model with the first verification set, and testing the capability of the machine learning sub-models with the first test set.
During training, the mean square error (MSE) is used as the loss function to measure the difference between predicted and true values and to guide the updating of model parameters. Each machine learning sub-model learns from the input features (such as month information, rainfall of the following m days and accumulated rainfall of the preceding n days) and the corresponding daily runoff targets, and continuously adjusts its internal parameters so that the error relative to the true values gradually decreases.
After the preliminary training of each model is completed, the first verification set is used to fine-tune the hyperparameters (parameters that are not directly learnable) of each machine learning sub-model, so as to prevent overfitting and maximize the prediction capability of the model. The key points of hyperparameter tuning for each machine learning sub-model are as follows:
A. Convolutional Neural Network (CNN)
Convolution kernel size: determines the receptive field of the network for local features, for example 3×3 or 5×5. The configuration that best captures the rainfall-runoff temporal characteristics is selected by comparing the effect of different kernel sizes on the first verification set.
Number of convolution kernels (number of channels): reflects the richness of the extracted features. Too few limits the expressive capability of the network; too many easily causes overfitting and increases computational overhead.
Number of network layers: the more layers, the better the model captures complex features, but the harder the network is to train and the more prone it is to overfitting.
Learning rate and regularization coefficient: by adjusting the learning rate (for example 0.001 or 0.0005), L2 regularization and the like, a balance between error convergence speed and stability is sought on the first verification set.
B. XGBoost
Learning rate (eta): controls the contribution of each weak learner to the overall model; too large a value easily causes overfitting, and too small a value makes convergence slow.
Maximum depth (max_depth): determines the complexity of a single tree; too deep a tree easily overfits, and too shallow a tree lacks expressive capability.
Subsample ratio (subsample) and column sample ratio (colsample_bytree): randomly select samples or features to enhance the generalization ability of the model. Different combinations are tried on the first verification set.
Regularization parameters (lambda, alpha): limit the complexity of the trees by adjusting the penalty on highly complex tree structures, avoiding overfitting.
C. LightGBM
Learning rate (learning_rate): determines the step size of a single update, similar to eta in XGBoost, and affects convergence speed and the risk of overfitting.
Maximum depth (max_depth) and number of leaves (num_leaves): control the complexity and degree of refinement of the trees. If num_leaves is too large, the model easily captures noise and overfits; if it is too small, the expressive capability is insufficient.
Number of histogram bins (max_bin): LightGBM discretizes features with a histogram-based algorithm; the larger max_bin is, the finer the feature division, but the greater the computational cost.
Early stopping (early_stopping_rounds): an early stopping condition is set on the first verification set, and training stops when the verification error has not improved for the given number of rounds, preventing overfitting.
D. Random forest
Number of trees (n_estimators): more trees reduce variance but increase training and prediction overhead; too few trees impair stability.
Maximum number of features (max_features): the number of features considered at each node split; too large a value tends toward overfitting, and too small a value toward underfitting.
Maximum depth (max_depth): limits the depth of a single tree; too deep a tree memorizes details of the training set and loses generalization ability.
Minimum samples per leaf (min_samples_leaf): controls the minimum number of samples in a leaf node; larger values reduce the risk of overfitting and smooth the prediction results.
Finally, the generalization ability of each machine learning sub-model is tested on the first test set to evaluate its performance on unseen data. The final performance of each sub-model may be measured by calculating evaluation indexes on the first test set. If a sub-model performs on the first test set similarly to how it performs on the first verification set, the model has robust generalization performance.
The evaluation index specifically includes:
Mean square error (MSE): the mean of the squared errors between predicted and true values;
Root mean square error (RMSE): the square root of the MSE, which reflects the magnitude of the error more intuitively;
Mean absolute error (MAE): the mean of the absolute differences between predicted and true values;
Coefficient of determination (R²): reflects the degree to which the model explains the variation of the true values.
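For illustration only, the four evaluation indexes listed above may be computed as in the following sketch; the function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical values: one of three predictions is off by 1.
scores = evaluate([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

R² is undefined when all true values are identical (the total sum of squares is zero); this sketch does not guard against that case.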
S107-4, establishing a machine learning set model, wherein the machine learning set model is a weighted set of all machine learning sub-models.
The machine learning set model is specifically:
$$\min_{w} J(w) = \frac{1}{N}\sum_{i=1}^{N}\Big(Y_i - \sum_{k=1}^{K} w_k \hat{Y}_{i,k}\Big)^2,$$
$$\text{s.t.}\quad \sum_{k=1}^{K} w_k = 1,\quad w_k \ge 0,$$
where $J(w)$ is the optimization objective function, $w = (w_1, \dots, w_K)$ is the weight vector formed by the weights of all the machine learning sub-models, $w_k$ represents the weight of the k-th machine learning sub-model, $K$ is the number of machine learning sub-models, $Y_i$ represents the true value of the i-th sample, $\hat{Y}_{i,k}$ is the predicted value of the k-th machine learning sub-model on sample i, and $N$ is the number of samples.
S107-5, training the machine learning set model with the second training set, optimizing the weights in the machine learning set model with the second verification set to obtain an optimized machine learning set model, and testing the capability of the optimized machine learning set model with the second test set.
In the present invention, the machine learning set model is not a simple combination of the machine learning sub-models; it is trained and evaluated independently on the second training set, the second verification set and the second test set. The machine learning set model can therefore learn the optimal combination weights on independent data not used for sub-model training, further improving the overall prediction performance.
First, the machine learning set model is preliminarily trained on the second training set. On this set the machine learning sub-models are not retrained; instead, the combination of the prediction results of the already trained machine learning sub-models is preliminarily learned to obtain a preliminary weighted set model. Specifically, for each sample, each trained machine learning sub-model gives a predicted value, and these predicted values are paired with the true value to form the training features and target. The difference between the combined output and the true runoff is measured with a loss such as the mean square error, and the initial weight of each machine learning sub-model in the machine learning set model is updated iteratively.
Then the weights of the machine learning set model are optimized on the second verification set, and each machine learning sub-model is assigned an optimal weight so that the combined prediction reaches the minimum error on the second verification set. Specifically, with the goal of minimizing the combined error, the optimal weight vector may be solved by non-negative least squares (NNLS) or by a convex optimization method. The output of the machine learning set model is set as the weighted sum of the predicted values of all the machine learning sub-models, and the weights are adjusted on the second verification set until the prediction error of the combined result is minimal.
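A minimal sketch of solving the combination weights by non-negative least squares is given below, assuming NumPy and SciPy are available. The function names `fit_ensemble_weights` and `ensemble_predict` and the toy predictions are hypothetical; the sketch enforces only non-negativity of the weights and does not impose any further constraint on their sum.

```python
import numpy as np
from scipy.optimize import nnls

def fit_ensemble_weights(sub_preds, y_true):
    """Solve min ||A w - y||_2 subject to w >= 0.

    sub_preds: (N, K) predictions of K sub-models on N samples.
    y_true:    (N,) true runoff values.
    """
    A = np.asarray(sub_preds, dtype=float)
    y = np.asarray(y_true, dtype=float)
    weights, _residual_norm = nnls(A, y)
    return weights

def ensemble_predict(sub_preds, weights):
    """Weighted sum of the sub-model predictions."""
    return np.asarray(sub_preds, dtype=float) @ weights

# Toy example: the target is exactly 0.25 * model_1 + 0.75 * model_2.
sub_preds = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y_true = 0.25 * sub_preds[:, 0] + 0.75 * sub_preds[:, 1]
weights = fit_ensemble_weights(sub_preds, y_true)
combined = ensemble_predict(sub_preds, weights)
```

In this toy case the recovered weights are approximately 0.25 and 0.75, since the target was constructed from those values.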
Finally, a final generalization test is carried out on the machine learning set model with the second test set, to check its performance on completely independent, unseen data and to evaluate its adaptability to real scenarios. Specifically, the second test set may be input into each machine learning sub-model to obtain the predicted value of each machine learning sub-model, and these predicted values are then combined according to the tuned weights to generate the set model output, according to the following formula:
$$\hat{Y}_i = \sum_{k=1}^{K} w_k \hat{Y}_{i,k},$$
where $\hat{Y}_i$ is the set model output for sample i, $w_k$ is the tuned weight of the k-th machine learning sub-model, $\hat{Y}_{i,k}$ is the predicted value of the k-th machine learning sub-model on sample i, and $K$ is the number of machine learning sub-models.
The runoff predicted by the machine learning set model is compared with the corresponding runoff in the second test set to test the capability of the machine learning set model, wherein the test includes performance evaluation, error analysis, model robustness verification and performance analysis under different scenarios.
The performance evaluation specifically evaluates the prediction performance of the machine learning set model on the second test set and calculates each evaluation index.
Error analysis includes residual analysis and outlier detection. Residual analysis calculates the prediction errors, analyzes the distribution of the residuals, and checks for systematic bias or heteroscedasticity, providing a basis for model improvement. Outlier detection identifies samples with large prediction errors and analyzes possible causes, such as anomalies in the input features or limited applicability of the model under specific conditions.
Model robustness verification includes repeated experiments and cross-validation. In repeated experiments, the model is trained and evaluated repeatedly under different random seeds or data divisions, and the consistency of model performance is observed to verify the stability of the model. In cross-validation, K-fold cross-validation is adopted to evaluate the generalization ability of the model comprehensively and to reduce the influence of chance on the evaluation result.
Performance analysis in different scenarios includes evaluation by rainfall magnitude, seasonal analysis. According to rainfall level evaluation, a test set is classified according to rainfall intensity, and the prediction performance of the model under different rainfall situations is evaluated, so that the comprehensive adaptability of the model is ensured. Seasonal analysis is to evaluate model performance by season or month, analyze the effect of seasonal variation on the model, and ensure that the model is able to capture seasonal features.
And finally, judging whether the model meets the expected performance requirement according to the evaluation result, and providing a basis for the improvement and practical application of the model.
S108, inputting the input features for runoff prediction of the target river basin into the trained machine learning set model, and predicting the future runoff.
Wherein the input features for runoff prediction of the target river basin are obtained by:
Acquiring the future rainfall forecast: obtaining from the meteorological department, or manually setting, the forecast rainfall for each of the next m days;
Calculating the antecedent accumulated rainfall from real-time data;
Extracting the seasonal feature: extracting the month from the date and applying periodic encoding if necessary, so as to capture seasonality.
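The periodic encoding mentioned above may, for example, map the month onto a circle so that December and January are neighbours. The sine/cosine form and the function name `encode_month` are assumptions of this sketch; the text does not state which periodic encoding is used.

```python
import math

def encode_month(month):
    """Map a month (1..12) to a point on the unit circle."""
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)

def month_distance(a, b):
    """Euclidean distance between two encoded months."""
    sa, ca = encode_month(a)
    sb, cb = encode_month(b)
    return math.hypot(sa - sb, ca - cb)

# December is closer to January than June is, unlike with raw month numbers.
dec_to_jan = month_distance(12, 1)
jun_to_jan = month_distance(6, 1)
```

With raw month numbers, December (12) and January (1) would appear far apart; the circular encoding removes that artificial jump at the year boundary.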
The input features for runoff prediction are then processed with the same standardization as used in training, ensuring that the scale of the feature values is consistent with that used in training.
The trained machine learning set model is loaded, the input features for runoff prediction are input, and the predicted runoff for the next m days is obtained.
Embodiment two.
The embodiment of the invention provides a river basin rainfall runoff simulation device integrating a mechanism model and a machine learning model. The device can be realized in a software and/or hardware mode and can be configured in terminal equipment. As shown in fig. 2, the apparatus includes:
The historical rainfall data preprocessing module 201 is used for acquiring historical rainfall data of the river basin and preprocessing the historical rainfall data;
A transition probability and distribution calculation module 202, configured to calculate, according to the historical rainfall data, a seasonal rainfall state transition probability for each seasonal characteristic time period, and a distribution to which rainfall for each seasonal characteristic time period obeys;
the rainfall state generation module 203 is configured to construct a seasonal Markov chain model by using the seasonal rainfall state transition probability as a conditional transition probability, and simulate the rainfall state every day in a preset simulation period according to the seasonal Markov chain model;
The rainfall sequence generating module 204 is configured to simulate the rainfall of each day according to the rainfall obeying distribution of each seasonal characteristic time period and the rainfall state of each day in a preset simulation period, and combine the rainfall state of each day in the preset simulation period with the corresponding rainfall as a seasonal simulated rainfall sequence;
the runoff simulation module 205 is configured to perform river basin rainfall-runoff simulation with a river basin mechanism model according to the seasonal simulated rainfall sequence, so as to obtain the simulated daily runoff within the simulation period;
the runoff prediction input feature extraction module 206 is configured to extract, for each day in the preset simulation period, the accumulated rainfall of the preceding several days and the seasonal feature, and fuse them with the rainfall of the current day into the input features for runoff prediction;
The machine learning model integration module 207 is configured to train a machine learning set model using, as a data set, the input features of all runoff predictions in the preset simulation period and the corresponding simulated runoff values, wherein the machine learning set model is a weighted integration of a plurality of machine learning sub-models;
The future runoff prediction module 208 is configured to input the input features for runoff prediction of the target river basin into the trained machine learning set model, and predict the future runoff.
The historical rainfall data preprocessing module 201 specifically includes:
the data acquisition unit is used for acquiring historical rainfall data of the river basin;
The abnormal value processing unit is used for identifying whether abnormal rainfall values exist in the historical rainfall data, and if so, deleting the abnormal rainfall values;
the missing value processing unit is used for detecting whether missing values exist in the historical rainfall data, and if so, interpolation is carried out on the missing values;
and the detection unit is used for detecting whether the historical rainfall data is in a unified time format and unit, and if not, unifying the historical rainfall data.
The transition probability and distribution calculation module 202 specifically includes:
The rainfall state acquisition unit is used for determining the daily rainfall state from the historical rainfall data according to the following formula:
$$S_t = \begin{cases} 1, & P_t > P_{threshold} \\ 0, & \text{otherwise} \end{cases}$$
in the formula, $S_t$ represents the rainfall state of day t in the historical rainfall data, rainfall state 1 represents rainfall, rainfall state 0 represents no rainfall, $P_{threshold}$ represents the rainfall threshold, and $P_t$ represents the rainfall of day t in the historical rainfall data;
The state transition times calculating unit is used for dividing one year into a plurality of seasonal characteristic time periods and counting the number of rainfall state transitions in each seasonal characteristic time period of the river basin:
$$I_t^{ij} = \begin{cases} 1, & S_t = i \text{ and } S_{t+1} = j \\ 0, & \text{otherwise} \end{cases}$$
$$N_{ij}^{s} = \sum_{t \in T_s} I_t^{ij}$$
in the formula, $I_t^{ij}$ is the rainfall state transition indicator variable, indicating whether day t transitions from rainfall state i to rainfall state j, $I_t^{ij} = 1$ represents yes, $I_t^{ij} = 0$ represents no, $N_{ij}^{s}$ represents the number of rainfall state transitions from rainfall state i to rainfall state j in seasonal characteristic time period s, and $T_s$ represents seasonal characteristic time period s;
The state transition probability calculation unit is used for calculating the seasonal rainfall state transition probability of each seasonal characteristic time period according to the number of rainfall state transitions:
$$p_{ij}^{s} = \frac{N_{ij}^{s}}{N_i^{s}}$$
in the formula, $p_{ij}^{s}$ represents the rainfall state transition probability from rainfall state i to rainfall state j in seasonal characteristic time period s, and $N_i^{s}$ represents the number of times rainfall state i occurs within seasonal characteristic time period s;
The rainfall distribution fitting unit is used for obtaining, for each seasonal characteristic time period, the rainfall amounts of all dates whose rainfall state is rainfall, and fitting them to obtain the distribution obeyed by the rainfall of that seasonal characteristic time period, wherein the distribution is specifically a Gamma distribution.
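The counting performed by the rainfall state acquisition, state transition times and state transition probability units may be sketched as follows. The function names, the strict comparison against the threshold, the attribution of each transition to the seasonal period of its starting day, and the use of outgoing transition counts as the denominator are assumptions of this illustration; Gamma fitting is not shown.

```python
def rainfall_states(rain, threshold):
    """1 = rainfall, 0 = no rainfall (strict '>' is an assumption)."""
    return [1 if p > threshold else 0 for p in rain]

def transition_probabilities(states, periods):
    """Estimate p[s][i][j] from transition counts per seasonal period s."""
    counts = {}
    for t in range(len(states) - 1):
        s = periods[t]                       # period of the starting day
        i, j = states[t], states[t + 1]
        counts.setdefault(s, [[0, 0], [0, 0]])[i][j] += 1
    probs = {}
    for s, c in counts.items():
        probs[s] = []
        for i in (0, 1):
            total = c[i][0] + c[i][1]        # transitions leaving state i
            probs[s].append(
                [c[i][j] / total if total else 0.0 for j in (0, 1)])
    return probs

# Hypothetical six-day record in a single seasonal period "s1".
rain = [0.0, 5.0, 6.0, 0.0, 0.0, 7.0]
states = rainfall_states(rain, threshold=0.1)
probs = transition_probabilities(states, ["s1"] * len(rain))
```

For this record the states are 0, 1, 1, 0, 0, 1, giving dry-to-dry 1/3, dry-to-wet 2/3, wet-to-dry 1/2 and wet-to-wet 1/2.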
The rainfall state generation module 203 specifically includes:
The seasonal Markov chain model construction unit is used for constructing a seasonal Markov chain model by taking the set {0,1} of all possible rainfall states as a state space and taking the set of all seasonal rainfall state transition probabilities as a conditional probability set;
an initial state setting unit for randomly setting the simulated rainfall state of the first day according to the rainfall occurrence frequency corresponding to the first day;
A rainfall state simulation unit for determining, based on the seasonal Markov chain model, the rainfall state of each following day from the rainfall state of the previous day and the corresponding seasonal transition probability, thereby generating a simulated rainfall state sequence for the preset simulation period, wherein the τ-th element of the sequence represents the simulated rainfall state of day τ, and N represents the total number of days in the preset simulation period.
The rainfall sequence generation module 204 specifically includes:
The rainfall day simulation unit is used for extracting from the simulated rainfall state sequence the dates whose rainfall state is rainfall, and simulating the corresponding rainfall amount according to the distribution obeyed by the rainfall of the corresponding seasonal characteristic time period;
A rainfall-free day simulation unit for setting the rainfall amount to 0 for the dates whose rainfall state is no rainfall;
the simulated rainfall sequence acquisition unit is used for combining the simulated rainfall state and the corresponding rainfall amount every day in a preset simulation period to be used as a seasonal simulated rainfall sequence.
The runoff simulation module 205 specifically includes:
the data acquisition unit is used for acquiring meteorological data, elevation data, soil parameters, land use data and vegetation cover data of the river basin;
the hydrologic simulation unit is used for inputting the seasonal simulated rainfall sequence, meteorological data, elevation data, soil parameters, land use data and vegetation cover data into the river basin mechanism model, simulating the rainfall-runoff process in the river basin, and obtaining the simulated daily runoff;
and the format unifying unit is used for unifying the format of the simulated runoff values and outputting them as a runoff time series.
The runoff prediction input feature extraction module 206 specifically includes:
The accumulated rainfall extraction unit is used for calculating, from the daily rainfall in the preset simulation period, the accumulated rainfall over the days preceding each day;
The seasonal characteristic determining unit is used for determining the seasonal feature of each day in the preset simulation period according to the seasonal characteristic time period to which it belongs;
a normalization unit for standardizing the daily rainfall, the accumulated rainfall and the seasonal features;
And the feature generating unit is used for concatenating the standardized rainfall, accumulated rainfall and seasonal features to form the input features for runoff prediction.
The machine learning model integration module 207 specifically includes:
The sub-model construction unit is used for constructing a plurality of machine learning sub-models based on different machine learning algorithms;
The data set acquisition unit is used for forming a data set from the input features of all runoff predictions in the preset simulation period and the corresponding simulated runoff values, and dividing the data set into a first training set, a second training set, a first verification set, a second verification set, a first test set and a second test set;
The sub-model training unit is used for training each machine learning sub-model by adopting a first training set, optimizing the super parameters of each machine learning sub-model by adopting a first verification set, and testing the capability of the machine learning sub-model by adopting a first test set;
The machine learning set model building unit is used for building a machine learning set model, wherein the machine learning set model is a weighted set of all machine learning sub-models;
the weight optimizing unit is used for training the machine learning set model by adopting a second training set, optimizing the weight in the machine learning set model by adopting a second verification set to obtain an optimized machine learning set model, and testing the capacity of the optimized machine learning set model by adopting a second test set.
The device provided by the embodiment of the invention can be used for executing the method provided by the first embodiment of the invention, and has the corresponding functions and beneficial effects of executing the method. For details not described in this embodiment, reference may be made to the method embodiment; they are not repeated here.
It should be noted that, in the embodiment of the apparatus, each unit and module included are only divided according to the functional logic, but not limited to the above-mentioned division, so long as the corresponding functions can be implemented, and the specific names of the functional units are only for convenience of distinguishing each other, and are not used to limit the protection scope of the present invention.
The embodiments described above are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, i.e., they may be located in one place or distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. It will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course solely by hardware, as long as the functions are achieved.
Embodiment three.
The present invention also provides a computer program product, such as an app on a mobile phone or tablet, or an installer on a computer, comprising computer programs/instructions which, when executed by a processor, implement the method of embodiment one. Code for computer-executable programs for performing the operations of the present invention may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, C++ and Python, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Claims (10)

1. A river basin rainfall-runoff prediction method integrating a mechanism model and a machine learning model, characterized by comprising the following steps:
(1) Acquiring historical rainfall data of a river basin, and preprocessing;
(2) Dividing one year into a plurality of seasonal characteristic time periods, and calculating seasonal rainfall state transition probability of each seasonal characteristic time period and rainfall obeying distribution of each seasonal characteristic time period according to the historical rainfall data;
(3) Constructing a seasonal Markov chain model by taking the seasonal rainfall state transition probability as the conditional transition probability, and simulating the rainfall state every day in a preset simulation period according to the seasonal Markov chain model;
(4) Simulating the daily rainfall amount according to the distribution obeyed by the rainfall of each seasonal characteristic time period and the daily rainfall state in the preset simulation period, specifically extracting the dates whose rainfall state is rainfall, simulating the corresponding rainfall amount according to the distribution obeyed by the rainfall of the corresponding seasonal characteristic time period, and combining each day's rainfall state in the rainfall state sequence with the corresponding rainfall amount to form a seasonal simulated rainfall sequence;
(5) Carrying out river basin rainfall-runoff simulation with a river basin mechanism model according to the seasonal simulated rainfall sequence to obtain the simulated daily runoff within the simulation period;
(6) Extracting, for each day in the preset simulation period, the accumulated rainfall of the preceding several days and the seasonal feature, and fusing them with the rainfall of the current day to form the input features for runoff prediction;
(7) Training a machine learning set model using, as a data set, the input features of all runoff predictions in the preset simulation period and the corresponding simulated runoff values, wherein the machine learning set model is a weighted integration of a plurality of machine learning sub-models;
(8) Inputting the input features for runoff prediction of the target river basin into the trained machine learning set model, and predicting the future runoff.
2. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (1) specifically comprises:
(1-1) acquiring historical rainfall data of the basin;
(1-2) identifying whether abnormal rainfall values exist in the historical rainfall data and, if so, deleting them;
(1-3) detecting whether the historical rainfall data has missing values and, if so, interpolating them;
(1-4) detecting whether the historical rainfall data uses a unified time format and unit and, if not, unifying them.
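The preprocessing steps (1-1) to (1-4) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `preprocess_rainfall`, the plausibility bound `max_valid_mm`, and the choice of linear interpolation are all assumptions, since the claim does not specify how abnormal values are identified or how missing values are interpolated.

```python
from datetime import date, timedelta

def preprocess_rainfall(records, max_valid_mm=500.0):
    """records: dict mapping datetime.date -> daily rainfall in mm (None if missing).
    Returns a gap-free list of (date, rainfall_mm) tuples in date order."""
    # (1-2) assumed rule: negative or implausibly large values are abnormal and dropped
    cleaned = {d: v for d, v in records.items()
               if v is not None and 0.0 <= v <= max_valid_mm}
    # (1-4) put everything on one uniform daily time axis
    days = sorted(records)
    start, end = days[0], days[-1]
    axis = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    values = [cleaned.get(d) for d in axis]
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid rainfall values to interpolate from")
    # (1-3) assumed rule: linear interpolation between the nearest valid neighbours
    for i in range(len(values)):
        if i in known:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            values[i] = values[right]
        elif right is None:
            values[i] = values[left]
        else:
            w = (i - left) / (right - left)
            values[i] = values[left] + w * (values[right] - values[left])
    return list(zip(axis, values))
```

Deleted abnormal values are treated like any other gap here, so they are refilled by the same interpolation pass.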
3. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (2) specifically comprises:
(2-1) determining the daily rainfall state from the historical rainfall data according to the following formula:
$$S_t = \begin{cases} 1, & P_t > P_{\text{threshold}} \\ 0, & P_t \le P_{\text{threshold}} \end{cases}$$
in the formula, $S_t$ represents the rainfall state of day $t$ in the historical rainfall data, rainfall state 1 represents rainfall, rainfall state 0 represents no rainfall, $P_{\text{threshold}}$ represents the rainfall threshold, and $P_t$ represents the rainfall amount of day $t$ in the historical rainfall data;
(2-2) dividing a year into a plurality of seasonal characteristic time periods, and counting the number of rainfall state transitions in each seasonal characteristic time period of the basin:
$$I_{ij}(t) = \begin{cases} 1, & S_{t-1} = i \text{ and } S_t = j \\ 0, & \text{otherwise} \end{cases}$$
$$N_{ij}^{s} = \sum_{t \in T_s} I_{ij}(t)$$
in the formula, $I_{ij}(t)$ is the rainfall state transition indicator variable, indicating whether day $t$ transitions from rainfall state $i$ to rainfall state $j$ ($S_t$ being the rainfall state of day $t$), $I_{ij}(t) = 1$ indicates that it does, $I_{ij}(t) = 0$ indicates that it does not, $N_{ij}^{s}$ represents the number of rainfall state transitions from rainfall state $i$ to rainfall state $j$ in the seasonal characteristic time period $s$, and $T_s$ represents the seasonal characteristic time period $s$;
(2-3) calculating the seasonal rainfall state transition probabilities of each seasonal characteristic time period from the numbers of rainfall state transitions:
$$p_{ij}^{s} = \frac{N_{ij}^{s}}{N_{i}^{s}}$$
in the formula, $p_{ij}^{s}$ represents the rainfall state transition probability from rainfall state $i$ to rainfall state $j$ in the seasonal characteristic time period $s$, and $N_{i}^{s}$ represents the number of times rainfall state $i$ occurs within the seasonal characteristic time period $s$;
(2-4) for each seasonal characteristic time period, acquiring the rainfall amounts of all dates whose rainfall state is "rainfall", and fitting them to obtain the distribution that the rainfall amount of the seasonal characteristic time period obeys, the distribution specifically being a Gamma distribution.
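Steps (2-1) to (2-4) can be sketched in plain Python as below. Several details are assumptions made for illustration only: the function names, the 0.1 mm default threshold, the strict inequality, attributing each transition to the season of the later day, using the row sum of transitions as the denominator (so each row of probabilities sums to 1), the 0.5 fallback for unseen states, and the method-of-moments Gamma fit (the claim only states that a Gamma distribution is fitted, not how).

```python
from collections import defaultdict
from statistics import mean, pvariance

def rain_states(rain_mm, threshold=0.1):
    # (2-1) state 1 = rainfall, state 0 = no rainfall
    return [1 if p > threshold else 0 for p in rain_mm]

def transition_probabilities(states, seasons):
    """states[t] in {0, 1}; seasons[t] is the season label of day t.
    Returns {season: {(i, j): p_ij}}."""
    counts = defaultdict(lambda: defaultdict(int))
    # (2-2) count transitions i -> j per season
    for t in range(1, len(states)):
        counts[seasons[t]][(states[t - 1], states[t])] += 1
    probs = {}
    # (2-3) normalise counts into conditional probabilities
    for s, c in counts.items():
        probs[s] = {}
        for i in (0, 1):
            n_i = c[(i, 0)] + c[(i, 1)]
            for j in (0, 1):
                # assumed fallback: 0.5 when state i never occurs in season s
                probs[s][(i, j)] = c[(i, j)] / n_i if n_i else 0.5
    return probs

def fit_gamma_moments(wet_amounts):
    # (2-4) method-of-moments estimate: shape = mean^2 / var, scale = var / mean
    m, v = mean(wet_amounts), pvariance(wet_amounts)
    if v <= 0:
        raise ValueError("need at least two distinct wet-day amounts")
    return m * m / v, v / m
```

A maximum-likelihood fit would be an equally valid reading of step (2-4); the moment estimator is used here only because it needs no external library.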
4. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (3) specifically comprises:
(3-1) constructing a seasonal Markov chain model by taking the set {0,1} of all possible rainfall states as the state space and the set of all seasonal rainfall state transition probabilities as the conditional probability set;
(3-2) randomly setting the simulated rainfall state $\hat{S}_1$ of the first day according to the rainfall occurrence frequency corresponding to the first day of the preset simulation period;
(3-3) based on the seasonal Markov chain model, calculating the rainfall state of each following day from the rainfall state of the previous day and the corresponding seasonal transition probabilities, thereby generating the simulated rainfall state sequence $\{\hat{S}_1, \hat{S}_2, \ldots, \hat{S}_N\}$ of the preset simulation period, wherein $\hat{S}_t$ represents the simulated rainfall state of day $t$ in the simulated rainfall state sequence, and $N$ represents the total number of days of the preset simulation period.
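A minimal sketch of steps (3-2) and (3-3), assuming the transition probabilities are stored as `{season: {(i, j): p_ij}}`. The function name `simulate_states`, the `season_of_day` callback, and the `seed` parameter are hypothetical and not taken from the patent.

```python
import random

def simulate_states(n_days, season_of_day, probs, wet_freq_first_day, seed=None):
    """Simulate a 0/1 rainfall state sequence of length n_days.
    season_of_day(t) returns the season label of simulation day t (0-based)."""
    rng = random.Random(seed)
    # (3-2) draw the first day's state from the observed wet-day frequency
    states = [1 if rng.random() < wet_freq_first_day else 0]
    # (3-3) each later day depends only on the previous day's state and the season
    for t in range(1, n_days):
        p_wet = probs[season_of_day(t)][(states[-1], 1)]
        states.append(1 if rng.random() < p_wet else 0)
    return states
```

Passing a fixed `seed` makes a simulated sequence reproducible, which helps when the same rainfall scenario has to be fed to the mechanism model more than once.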
5. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (4) specifically comprises:
(4-1) extracting from the simulated rainfall state sequence the dates whose rainfall state is "rainfall", and simulating the corresponding rainfall amounts from the rainfall distribution of the corresponding seasonal characteristic time period;
(4-2) setting the rainfall amount of every date whose rainfall state is "no rainfall" to 0;
(4-3) combining the simulated rainfall state and the corresponding rainfall amount of each day in the preset simulation period into a seasonal simulated rainfall sequence.
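Steps (4-1) to (4-3) can be illustrated with the standard-library Gamma sampler. The function name `simulate_rainfall` and the `(shape, scale)` layout of `gamma_params` are assumptions made for this sketch.

```python
import random

def simulate_rainfall(states, season_of_day, gamma_params, seed=None):
    """states: simulated 0/1 rainfall states; gamma_params: {season: (shape, scale)}.
    Returns a list of (state, rainfall_mm) pairs, one per simulated day."""
    rng = random.Random(seed)
    series = []
    for t, state in enumerate(states):
        if state == 1:
            # (4-1) wet day: sample an amount from that season's Gamma distribution
            shape, scale = gamma_params[season_of_day(t)]
            amount = rng.gammavariate(shape, scale)
        else:
            # (4-2) dry day: rainfall amount is zero
            amount = 0.0
        # (4-3) keep state and amount together
        series.append((state, amount))
    return series
```

`random.gammavariate(alpha, beta)` takes the shape as `alpha` and the scale as `beta`, so the sampled mean is `shape * scale`.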
6. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (5) specifically comprises:
(5-1) acquiring meteorological data, elevation data, soil parameters, land use data and vegetation cover data of the basin;
(5-2) inputting the seasonal simulated rainfall sequence, meteorological data, elevation data, soil parameters, land use data and vegetation cover data into the basin mechanism model, and simulating the rainfall-runoff process in the basin to obtain the simulated runoff value of each day;
(5-3) unifying the format of the simulated runoff values and outputting them as a runoff time series.
7. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (6) specifically comprises:
(6-1) for each day in the preset simulation period, calculating the accumulated rainfall of the preceding several days from the daily rainfall amounts;
(6-2) determining the seasonal feature of each day in the preset simulation period from the seasonal characteristic time period in which the day falls;
(6-3) standardizing the daily rainfall, the accumulated rainfall, and the seasonal features;
(6-4) concatenating the standardized daily rainfall, accumulated rainfall and seasonal features to form the input features for runoff prediction.
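The feature construction in steps (6-1) to (6-4) might look like the sketch below. The window length of 3 days, the numeric encoding of the season, and z-score standardization are assumptions; the claim only says "preceding several days" and "standardizing".

```python
from statistics import mean, pstdev

def antecedent_rainfall(rain_mm, window=3):
    # (6-1) total rainfall over the `window` days before each day (today excluded)
    return [sum(rain_mm[max(0, t - window):t]) for t in range(len(rain_mm))]

def zscore(column):
    # (6-3) assumed standardization: zero mean, unit variance; constant columns map to 0
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma > 0 else 0.0 for x in column]

def build_features(rain_mm, season_ids, window=3):
    """season_ids: (6-2) numeric season label per day, e.g. 0.0 = dry, 1.0 = wet.
    Returns one [rain_today, antecedent_rain, season] row per day."""
    columns = [zscore(rain_mm),
               zscore(antecedent_rainfall(rain_mm, window)),
               zscore(season_ids)]
    # (6-4) concatenate the standardized columns row by row
    return [list(row) for row in zip(*columns)]
```

With more than two seasons, a one-hot encoding would usually be a safer choice than a single numeric label, because it does not impose an ordering on the seasons.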
8. The basin rainfall runoff prediction method integrating a mechanism model and a machine learning model according to claim 1, wherein step (7) specifically comprises:
(7-1) constructing a plurality of machine learning sub-models based on different machine learning algorithms;
(7-2) forming a data set from the runoff prediction input features of all days in the preset simulation period and the corresponding simulated runoff values, and dividing the data set into a first training set, a second training set, a first validation set, a second validation set, a first test set and a second test set;
(7-3) training each machine learning sub-model with the first training set, optimizing the hyperparameters of each machine learning sub-model with the first validation set, and testing the performance of the machine learning sub-models with the first test set;
(7-4) building a machine learning ensemble model, which is a weighted ensemble of all the machine learning sub-models;
(7-5) training the machine learning ensemble model with the second training set, optimizing the weights in the machine learning ensemble model with the second validation set to obtain an optimized machine learning ensemble model, and testing the performance of the optimized machine learning ensemble model with the second test set.
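The weighting part of steps (7-4) and (7-5) can be sketched as follows, with the sub-models' predictions assumed to be already available. The grid search over non-negative weights summing to 1, the mean squared error criterion, and all function names are assumptions; the claim does not state how the weights are optimized.

```python
from itertools import product

def ensemble_predict(sub_predictions, weights):
    """sub_predictions: one list of predictions per sub-model, all of equal length."""
    return [sum(w * p for w, p in zip(weights, preds))
            for preds in zip(*sub_predictions)]

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def optimise_weights(sub_predictions, y_val, resolution=10):
    """Exhaustive search over weights k / resolution that sum to 1.
    Practical only for a handful of sub-models."""
    best_w, best_err = None, float("inf")
    for ks in product(range(resolution + 1), repeat=len(sub_predictions)):
        if sum(ks) != resolution:
            continue
        w = [k / resolution for k in ks]
        err = mse(y_val, ensemble_predict(sub_predictions, w))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```

For example, if one sub-model is biased low by 1 and another is biased high by 1 on the validation set, the search settles on equal weights and the biases cancel.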
9. A basin rainfall runoff prediction device integrating a mechanism model and a machine learning model, characterized by comprising:
a historical rainfall data preprocessing module, used for acquiring historical rainfall data of the basin and preprocessing the historical rainfall data;
a transition probability and distribution calculation module, used for calculating, from the historical rainfall data, the seasonal rainfall state transition probabilities of each seasonal characteristic time period and the distribution that the rainfall amount of each seasonal characteristic time period obeys;
a rainfall state generation module, used for constructing a seasonal Markov chain model using the seasonal rainfall state transition probabilities as conditional transition probabilities, and simulating the rainfall state of each day in a preset simulation period with the seasonal Markov chain model;
a rainfall sequence generation module, used for simulating the rainfall amount of each day from the rainfall distribution of each seasonal characteristic time period and the rainfall state of each day in the preset simulation period, and combining the rainfall state of each day in the preset simulation period with the corresponding rainfall amount into a seasonal simulated rainfall sequence;
a runoff simulation module, used for performing basin rainfall-runoff simulation with a basin mechanism model driven by the seasonal simulated rainfall sequence to obtain a simulated runoff value for each day in the simulation period;
a runoff prediction input feature extraction module, used for extracting, for each day in the preset simulation period, the accumulated rainfall of the preceding several days and the seasonal feature, and fusing them with the rainfall amount of the current day into the runoff prediction input features;
a machine learning model ensemble module, used for training and optimizing a machine learning ensemble model using the runoff prediction input features of all days in the preset simulation period and the corresponding simulated runoff values as the data set, wherein the machine learning ensemble model is a weighted ensemble of a plurality of machine learning sub-models;
a future runoff prediction module, used for inputting the runoff prediction input features of the target basin into the trained machine learning ensemble model to predict future runoff.
10. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the method of any of claims 1-8.
CN202510037939.3A 2025-01-10 2025-01-10 A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model Active CN119474881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510037939.3A CN119474881B (en) 2025-01-10 2025-01-10 A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510037939.3A CN119474881B (en) 2025-01-10 2025-01-10 A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model

Publications (2)

Publication Number Publication Date
CN119474881A (en) 2025-02-18
CN119474881B (en) 2025-08-08

Family

ID=94588728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510037939.3A Active CN119474881B (en) 2025-01-10 2025-01-10 A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model

Country Status (1)

Country Link
CN (1) CN119474881B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139080A (en) * 2015-08-04 2015-12-09 国家电网公司 Improved photovoltaic power sequence prediction method based on Markov chain
CN116842851A (en) * 2023-08-03 2023-10-03 北京市市政工程设计研究总院有限公司广东分院 Model system for water service data perception and mechanism analysis based on drainage basin subsystem

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102034001A (en) * 2010-12-16 2011-04-27 南京大学 Design method for distributed hydrological model by using grid as analog unit
CN111047079B (en) * 2019-11-25 2022-05-13 山东师范大学 Wind power plant wind speed time series prediction method and system
CN115062818A (en) * 2022-05-10 2022-09-16 中国长江三峡集团有限公司 A probabilistic prediction method of reservoir sediment inflow based on Bayesian model averaging and machine learning
CN115507822B (en) * 2022-06-09 2024-07-02 武汉大学 Flood risk prediction method driven by hydrologic cycle variation
CN115423163A (en) * 2022-08-24 2022-12-02 中国地质大学(武汉) Method and device for predicting short-term flood events of drainage basin and terminal equipment
CN115796402B (en) * 2023-02-08 2023-05-12 成都理工大学 Air quality index prediction method based on combined model
CN117494586B (en) * 2023-12-29 2024-04-30 浙江大学 A spatiotemporal prediction method for flash floods based on deep learning
CN118194687A (en) * 2024-01-04 2024-06-14 青海大学 Mountain torrent forecasting method and system with coupling of physical mechanism and deep learning model
CN118013866B (en) * 2024-04-09 2024-06-25 西北工业大学 Medium-and-long-term runoff prediction method based on horizontal and vertical attention

Also Published As

Publication number Publication date
CN119474881A (en) 2025-02-18

Similar Documents

Publication Publication Date Title
CN111310968B (en) LSTM neural network circulating hydrologic forecasting method based on mutual information
Xu et al. Research on particle swarm optimization in LSTM neural networks for rainfall-runoff simulation
CN111639748B (en) Watershed pollutant flux prediction method based on LSTM-BP space-time combination model
CN118350678B (en) Water environment monitoring data processing method and system based on Internet of things and big data
CN117526274B (en) New energy power prediction method, electronic equipment and storage medium in extreme climate
CN112396152A (en) Flood forecasting method based on CS-LSTM
CN115423163A (en) Method and device for predicting short-term flood events of drainage basin and terminal equipment
CN114330935B (en) New energy power prediction method and system based on multiple combination strategies integrated learning
Li et al. A novel multichannel long short-term memory method with time series for soil temperature modeling
CN112560633A (en) Plant key phenological period time point prediction method and system based on deep learning
JP7662881B1 (en) Distributed hydrological forecasting method, device, computer device, and medium
CN113011657A (en) Method and device for predicting typhoon water level, electronic equipment and storage medium
CN114723188A (en) Water quality prediction method, device, computer equipment and storage medium
CN118014391A (en) Groundwater level prediction method, device and medium
CN116611588B (en) Precipitation multi-driving factor segmented calibrated optimization forecast method and system
Luppichini et al. Machine learning models for river flow forecasting in small catchments
CN118657225B (en) Interpretability evaluation method and system for hydrological and meteorological deep learning forecasting models
CN119761849A (en) Method for identifying and forecasting strong convection based on deep learning
CN119829912A (en) Marine environment forecasting method integrating daily climate state and machine learning model
CN119651568A (en) Wind power prediction method and system based on meteorological characteristics and deep learning model
CN119167039A (en) Wind and solar power generation prediction method based on meteorological feature screening and transfer learning
CN119378735A (en) A basin runoff prediction method based on multi-dimensional hydrological information
CN119474881B (en) A basin rainfall runoff prediction method, device and program product integrating mechanism model and machine learning model
CN115422840B (en) Daily-scale runoff estimation method based on physical model hybrid deep learning model
Zhu A hybrid model to predict the hydrological drought in the Tarim River Basin based on CMIP6

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant