
CN120013004A - A quantitative precipitation estimation method, system, storage medium and electronic device - Google Patents


Info

Publication number
CN120013004A
Authority
CN
China
Prior art keywords
data
distribution
precipitation
probability
precipitation estimation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510094673.6A
Other languages
Chinese (zh)
Inventor
陈元昭
陈训来
刘东华
王明洁
张文海
饶华炎
张立杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Meteorological Bureau Shenzhen Meteorological Station
Original Assignee
Shenzhen Meteorological Bureau Shenzhen Meteorological Station
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Meteorological Bureau Shenzhen Meteorological Station
Priority to CN202510094673.6A
Publication of CN120013004A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88 Radar or analogous systems specially adapted for specific applications
    • G01S13/95 Radar or analogous systems specially adapted for specific applications for meteorological use
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01W METEOROLOGY
    • G01W1/00 Meteorology
    • G01W1/14 Rainfall or precipitation gauges
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Remote Sensing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Environmental & Geological Engineering (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Algebra (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Hydrology & Water Resources (AREA)
  • Databases & Information Systems (AREA)
  • Atmospheric Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)

Abstract

The invention provides a quantitative precipitation estimation method, system, storage medium and electronic device. The method comprises: acquiring a data set; constructing a prior distribution model based on the data set; defining, by variational inference, an approximate distribution of the posterior distribution of the precipitation estimate according to the prior distribution model; determining a lower bound of the precipitation estimate from the approximate distribution using Jensen's inequality; and maximizing that lower bound to obtain the quantitative precipitation estimation result. The data set is built from a radar data set of the quantitative precipitation estimation area and a data set of real-time automatic station observations. Estimating precipitation in this way improves the operational performance of quantitative precipitation estimation.

Description

Quantitative precipitation estimation method, system, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of weather prediction, in particular to a quantitative precipitation estimation method, a quantitative precipitation estimation system, a storage medium and electronic equipment.
Background
At present, quantitative precipitation estimation (QPE) is the basis of services such as quantitative precipitation forecasting (QPF) and heavy-precipitation nowcasting warnings. It is an important component of short-term nowcasting and has long been both a focus and a difficulty of forecasting services. Automatic weather stations are currently the most direct means of observing precipitation, but because they are unevenly distributed in space, their observations cannot fully reflect the distribution characteristics of precipitation.
The radar reflectivity factor is the primary factor related to precipitation, but precipitation is influenced by many other factors. How to identify the factors that affect the accuracy of quantitative precipitation estimation, so as to obtain an estimate that is as accurate as possible, remains a difficult problem.
Thus, a solution is needed.
Disclosure of Invention
The invention aims to provide a quantitative precipitation estimation method that is based on long-term, massive observation data and uses a quantitative precipitation estimation algorithm built on nonparametric Bayesian machine learning. The algorithm combines nine elements, such as radar reflectivity factors, automatic station information, radar-retrieved wind fields and weather types, to constrain the radar quantitative precipitation estimation algorithm. First, a radar data set of the quantitative precipitation estimation area and a data set of real-time automatic station observations are constructed, and a prior distribution model is built on these data. Then, by variational inference, an approximate distribution of the posterior distribution of the precipitation estimate is defined according to the prior distribution model; a lower bound of the precipitation estimate is determined from the approximate distribution using Jensen's inequality; and the lower bound is maximized to obtain the quantitative precipitation estimation result. This improves the operational performance of quantitative precipitation estimation.
The quantitative precipitation estimation method provided by the embodiment of the invention comprises the following steps:
acquiring a data set M;
constructing a prior distribution model p(M|D) based on the data set M, wherein D is a family of models;
based on variational inference, defining an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate according to the prior distribution model p(M|D), wherein Θ is a probability model parameter;
determining, based on Jensen's inequality, a lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate according to the approximate distribution q(Θ), wherein E_q is the expectation with respect to q over the precipitation estimation data distribution interval;
maximizing the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate to obtain the quantitative precipitation estimation result.
Optionally, the data set M comprises at least a radar data set of the quantitative precipitation estimation area and an automatic station information data set, wherein the radar data set is 1 km × 1 km gridded radar reflectivity factor data within the quantitative precipitation estimation area, and the automatic station information data set comprises the longitude and latitude coordinates and the real-time precipitation observations of each automatic station within the area.
Optionally, constructing the prior distribution model p(M|D) based on the data set M includes:
In the Dirichlet nonparametric Bayesian model, G_0 is a base probability distribution on the probability measure space Ω and α_0 > 0 is the concentration parameter; the probability distribution G on Ω follows a Dirichlet process with base distribution G_0:
G ~ DP(α_0, G_0)
where the base probability distribution G_0 determines the distribution of the basic constituent elements in the prior distribution model p(M|D), and DP denotes a Dirichlet process;
Based on the Dirichlet process mixture model (DPM), a generative probability is attached to each data point in the quantitative precipitation estimation area and serves as the prior distribution of the data:
m_i ~ p(m | θ_i)
where the parameter θ_i follows the probability distribution G, i ∈ {1, ..., N} indexes the data points, each data point has its own generative probability parameter, N is the total number of data points, p is a conditional probability density function, and m is the precipitation estimate at the data point;
the optimal model is selected as the prior distribution model p(M|D) by comparing the likelihood functions of different families of prior probability models m_i ~ p(m | θ_i):
p(M|D) = ∫ p(M|Θ) p(Θ|D) dΘ
wherein D represents a family of models.
Optionally, selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of different families of models includes:
Assuming the data in the data set M = {m_1, m_2, m_3, ..., m_n} are independent, reading the real-time precipitation observations of the n automatic stations in the quantitative precipitation estimation area, and randomly permuting them to obtain {F(i)}, where i = 1, 2, ..., n;
Setting Ω = Ω_{t-1} and, for each real-time precipitation observation i ∈ {F(1), F(2), ..., F(n)} in the random permutation given by {F(i)}, sampling its indicator factor β_i, where Ω_{t-1} is the probability measure space at time t-1;
Computing the likelihood estimate f_k(m_i) of the observation based on the current K clusters:
f_k(m_i) = p(m_i | β_i = k, M_{\i}, ζ)
where β_i is the category indicator (which may denote a new category), M_{\i} is the data set M = {m_1, m_2, m_3, ..., m_n} with the observation of the corresponding index removed, ζ is a distribution parameter, k indexes the category, m_i is a random variable, and a further symbol, analogous to k, denotes a category other than k;
β_i is sampled:
in the formula, the count term is the amount of data already in class k, E_i is a preset observation data sample, K is the number of observation data samples, f_k is the probability density function for class k, δ is the Kronecker delta function, with δ(β_i, k) = 1 when β_i = k and 0 otherwise, and the remaining density term is the probability density function for classes other than the k-th;
If the sampling condition for a new category is met, the number of clusters K is increased by 1, K = K + 1;
Checking the number of observations in each cluster whose likelihood function is computed; if the total number of observations of a class is 0, deleting that class and reducing the number of clusters K by 1, K = K - 1.
The quantitative precipitation estimation system provided by the embodiment of the invention comprises:
an acquisition module for acquiring a data set M;
a construction module for constructing a prior distribution model p(M|D) based on the data set M, wherein D is a family of models;
a definition module for defining, based on variational inference, an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate according to the prior distribution model p(M|D), wherein Θ is a probability model parameter;
a determining module for determining, based on Jensen's inequality, a lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate according to the approximate distribution q(Θ), wherein E_q is the expectation with respect to q over the precipitation estimation data distribution interval;
a maximizing module for maximizing the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate to obtain the quantitative precipitation estimation result.
Optionally, the data set M comprises at least a radar data set of the quantitative precipitation estimation area and an automatic station information data set, wherein the radar data set is 1 km × 1 km gridded radar reflectivity factor data within the quantitative precipitation estimation area, and the automatic station information data set comprises the longitude and latitude coordinates and the real-time precipitation observations of each automatic station within the area.
Optionally, constructing the prior distribution model p(M|D) based on the data set M includes:
In the Dirichlet nonparametric Bayesian model, G_0 is a base probability distribution on the probability measure space Ω and α_0 > 0 is the concentration parameter; the probability distribution G on Ω follows a Dirichlet process with base distribution G_0:
G ~ DP(α_0, G_0)
where the base probability distribution G_0 determines the distribution of the basic constituent elements in the prior distribution model p(M|D), and DP denotes a Dirichlet process;
Based on the Dirichlet process mixture model (DPM), a generative probability is attached to each data point in the quantitative precipitation estimation area and serves as the prior distribution of the data:
m_i ~ p(m | θ_i)
where the parameter θ_i follows the probability distribution G, i ∈ {1, ..., N} indexes the data points, each data point has its own generative probability parameter, N is the total number of data points, p is a conditional probability density function, and m is the precipitation estimate at the data point;
the optimal model is selected as the prior distribution model p(M|D) by comparing the likelihood functions of different families of prior probability models m_i ~ p(m | θ_i):
p(M|D) = ∫ p(M|Θ) p(Θ|D) dΘ
wherein D represents a family of models.
Optionally, selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of different families of models includes:
Assuming the data in the data set M = {m_1, m_2, m_3, ..., m_n} are independent, reading the real-time precipitation observations of the n automatic stations in the quantitative precipitation estimation area, and randomly permuting them to obtain {F(i)}, where i = 1, 2, ..., n;
Setting Ω = Ω_{t-1} and, for each real-time precipitation observation i ∈ {F(1), F(2), ..., F(n)} in the random permutation given by {F(i)}, sampling its indicator factor β_i, where Ω_{t-1} is the probability measure space at time t-1;
Computing the likelihood estimate f_k(m_i) of the observation based on the current K clusters:
f_k(m_i) = p(m_i | β_i = k, M_{\i}, ζ)
where β_i is the category indicator (which may denote a new category), M_{\i} is the data set M = {m_1, m_2, m_3, ..., m_n} with the observation of the corresponding index removed, ζ is a distribution parameter, k indexes the category, m_i is a random variable, and a further symbol, analogous to k, denotes a category other than k;
β_i is sampled:
in the formula, the count term is the amount of data already in class k, E_i is a preset observation data sample, K is the number of observation data samples, f_k is the probability density function for class k, δ is the Kronecker delta function, with δ(β_i, k) = 1 when β_i = k and 0 otherwise, and the remaining density term is the probability density function for classes other than the k-th;
If the sampling condition for a new category is met, the number of clusters K is increased by 1, K = K + 1;
Checking the number of observations in each cluster whose likelihood function is computed; if the total number of observations of a class is 0, deleting that class and reducing the number of clusters K by 1, K = K - 1.
The embodiment of the invention provides a computer readable storage medium, on which a computer program is stored, and a processor executes the computer program to implement the method of any one of the above embodiments.
The electronic device provided by the embodiment of the invention comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to realize the method of any one of the above.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of a quantitative precipitation estimation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a quantitative precipitation estimation system according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
The embodiment of the invention provides a quantitative precipitation estimation method, as shown in fig. 1, comprising the following steps:
S1, acquiring a data set M;
S2, constructing a prior distribution model p(M|D) based on the data set M, wherein D is a family of models;
S3, based on variational inference, defining an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate according to the prior distribution model p(M|D), wherein Θ is a probability model parameter;
S4, determining, based on Jensen's inequality, a lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate according to the approximate distribution q(Θ), wherein E_q is the expectation with respect to q over the precipitation estimation data distribution interval;
S5, maximizing the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate to obtain the quantitative precipitation estimation result.
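As a rough illustration of steps S4 and S5, the sketch below evaluates the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] in closed form and maximizes it by gradient ascent. The model is an assumption chosen only because everything can be checked analytically: a single scalar parameter Θ with prior N(0, 1), observations m_i ~ N(Θ, 1), and a Gaussian approximation q(Θ) = N(mu, var). None of the function names or sample values come from the patent.

```python
import math


def log_evidence(data):
    """Exact log p(M) for the assumed toy model: Theta ~ N(0, 1), m_i ~ N(Theta, 1)."""
    n, s = len(data), sum(data)
    ss = sum(m * m for m in data)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n)
            - 0.5 * (ss - s * s / (1 + n)))


def elbo(data, mu, var):
    """Lower bound E_q[log p(Theta, M)] - E_q[log q(Theta)] for q = N(mu, var)."""
    n = len(data)
    expected_log_joint = (-0.5 * (n + 1) * math.log(2 * math.pi)
                          - 0.5 * (mu * mu + var)
                          - 0.5 * sum((m - mu) ** 2 + var for m in data))
    entropy = 0.5 * math.log(2 * math.pi * math.e * var)  # equals -E_q[log q]
    return expected_log_joint + entropy


def maximize_elbo(data, steps=1000, lr=0.05):
    """Gradient ascent on (mu, log var); the gradients are exact for this toy model."""
    n, s = len(data), sum(data)
    mu, log_var = 0.0, 0.0
    for _ in range(steps):
        var = math.exp(log_var)
        mu += lr * (s - (n + 1) * mu)
        log_var += lr * (0.5 - 0.5 * (n + 1) * var)
    return mu, math.exp(log_var)


observations = [2.1, 1.7, 2.4, 1.9, 2.2]  # made-up sample values
best_mu, best_var = maximize_elbo(observations)
```

By Jensen's inequality, `elbo(...)` never exceeds `log_evidence(...)`; in this conjugate toy case the bound becomes tight at the maximizer, because the Gaussian family contains the exact posterior.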
The data set M comprises at least a radar data set of the quantitative precipitation estimation area and an automatic station information data set, wherein the radar data set is 1 km × 1 km gridded radar reflectivity factor data within the quantitative precipitation estimation area, and the automatic station information data set comprises the longitude and latitude coordinates and the real-time precipitation observations of each automatic station within the area.
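One way to pair the two parts of the data set is to look up the 1 km × 1 km radar cell that contains each automatic station. The sketch below is only an assumption about how such pairing might be done: it uses a flat-earth approximation of about 111.32 km per degree, a grid whose south-west corner is at (lon0, lat0), and hypothetical names (`station_to_cell`, `pair_samples`); the patent does not specify any of these.

```python
import math

KM_PER_DEG = 111.32  # approximate length of one degree; an assumption for this sketch


def station_to_cell(lon, lat, lon0, lat0, cell_km=1.0):
    """Return (row, col) of the grid cell containing the station.

    Rows count northward and columns eastward from the grid's
    south-west corner (lon0, lat0).
    """
    x_km = (lon - lon0) * KM_PER_DEG * math.cos(math.radians(lat0))
    y_km = (lat - lat0) * KM_PER_DEG
    return int(math.floor(y_km / cell_km)), int(math.floor(x_km / cell_km))


def pair_samples(stations, grid, lon0, lat0):
    """Pair each station's rainfall with the reflectivity of its grid cell.

    stations: iterable of (lon, lat, rainfall); grid: list of rows of reflectivity.
    Stations that fall outside the grid are skipped.
    """
    pairs = []
    for lon, lat, rain in stations:
        row, col = station_to_cell(lon, lat, lon0, lat0)
        if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
            pairs.append((grid[row][col], rain))
    return pairs
```

An operational system would more likely use a proper map projection; the approximation here only keeps the example self-contained.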
Constructing the prior distribution model p(M|D) based on the data set M includes:
In the Dirichlet nonparametric Bayesian model, G_0 is a base probability distribution on the probability measure space Ω and α_0 > 0 is the concentration parameter; the probability distribution G on Ω follows a Dirichlet process with base distribution G_0:
G ~ DP(α_0, G_0)
where the base probability distribution G_0 determines the distribution of the basic constituent elements in the prior distribution model p(M|D), and DP denotes a Dirichlet process. The Dirichlet process is a stochastic process widely used in Bayesian nonparametric statistics; here it is used to model the probability distribution of precipitation when the specific form of that distribution is unknown but some prior knowledge is available;
Based on the Dirichlet process mixture model (DPM), a generative probability is attached to each data point in the quantitative precipitation estimation area and serves as the prior distribution of the data:
m_i ~ p(m | θ_i)
where the parameter θ_i follows the probability distribution G, i ∈ {1, ..., N} indexes the data points, each data point has its own generative probability parameter, N is the total number of data points, p is a conditional probability density function (PDF), and m is the precipitation estimate at the data point; the expression gives the probability distribution of the precipitation estimation data conditional on the given parameter;
the optimal model is selected as the prior distribution model p(M|D) by comparing the likelihood functions of different families of prior probability models m_i ~ p(m | θ_i):
p(M|D) = ∫ p(M|Θ) p(Θ|D) dΘ
wherein D represents a family of models.
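To make the generative reading of m_i ~ p(m | θ_i) with θ_i drawn from G ~ DP(α_0, G_0) concrete, the sketch below draws synthetic data through the Chinese restaurant process, which is a standard equivalent way of sampling from a Dirichlet process mixture. The Gaussian observation model, the exponential base distribution and all names are assumptions made for illustration, not choices stated in the patent.

```python
import random


def sample_dpm(n, alpha0, base_sampler, noise_sd, rng):
    """Draw n points from a Dirichlet process mixture via the Chinese restaurant process.

    Each point joins existing cluster k with probability proportional to its size,
    or opens a new cluster (theta drawn from the base distribution G_0) with
    probability proportional to alpha0. Observations are N(theta_k, noise_sd^2).
    """
    thetas, counts = [], []   # one parameter and one size per cluster
    labels, data = [], []
    for _ in range(n):
        weights = counts + [alpha0]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(thetas):                 # a new cluster is opened
            thetas.append(base_sampler(rng))
            counts.append(0)
        counts[k] += 1
        labels.append(k)
        data.append(rng.gauss(thetas[k], noise_sd))
    return data, labels, thetas


rng = random.Random(42)
# Assumed base distribution G_0: exponential with mean 5 (arbitrary rainfall-like scale).
data, labels, thetas = sample_dpm(
    200, alpha0=1.0, base_sampler=lambda r: r.expovariate(1 / 5.0), noise_sd=0.5, rng=rng
)
```

The number of clusters is not fixed in advance; it grows slowly with n at a rate governed by the concentration parameter α_0.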
Selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of different families of models includes:
Assuming the data in the data set M = {m_1, m_2, m_3, ..., m_n} are independent, reading the real-time precipitation observations of the n automatic stations in the quantitative precipitation estimation area, and randomly permuting them to obtain {F(i)}, where i = 1, 2, ..., n;
Setting Ω = Ω_{t-1} and, for each real-time precipitation observation i ∈ {F(1), F(2), ..., F(n)} in the random permutation given by {F(i)}, sampling its indicator factor β_i, where the parameters are those collected at time t-1 and Ω_{t-1} is the probability measure space at time t-1. That is, the real-time precipitation observations of the automatic stations are randomly permuted by a specific function and then sampled to form the corresponding indicator factors; the indicator factors represent the parameters collected at the current time (such as weather parameters and precipitation values), including the probability measure space at that time.
Computing the likelihood estimate f_k(m_i) of the observation based on the current K clusters:
f_k(m_i) = p(m_i | β_i = k, M_{\i}, ζ)
where β_i is the category indicator (which may denote a new category); M_{\i} is the data set M = {m_1, m_2, m_3, ..., m_n} with the observation of the corresponding index removed (each data point in the data set has a unique index); ζ is a distribution parameter; k indexes the category; m_i is a random variable; and a further symbol, analogous to k, denotes a category other than k. For each current cluster, the likelihood estimate of its observations is computed; that is, the goodness of fit to the data is evaluated from the current data set and clustering model. Different models are fitted step by step by setting the number of clusters and the sample data, and the likelihood function of each cluster is computed from the quantities in the formula, such as the clustering parameters, observations and random variables. By comparing the likelihood functions of different families of models, the model with the highest likelihood is selected as the optimal model and used as the prior distribution model.
β_i is sampled:
in the formula, the count term is the amount of data already in class k, E_i is a preset observation data sample, K is the number of observation data samples, f_k is the probability density function for class k, δ is the Kronecker delta function, with δ(β_i, k) = 1 when β_i = k and 0 otherwise, and the remaining density term is the probability density function for classes other than the k-th;
If the sampling condition for a new category is met, the number of clusters K is increased by 1, K = K + 1;
Checking the number of observations in each cluster whose likelihood function is computed; if the total number of observations of a class is 0, deleting that class and reducing the number of clusters K by 1, K = K - 1.
Sampling is performed during clustering to decide whether the number of clusters needs to be adjusted. When the preset condition is reached, the number of clusters is increased by 1 to refine the classification further; when a cluster contains no data, that cluster is deleted and the number of clusters is reduced by 1. All clusters are checked after each update so that every cluster whose likelihood function is computed holds a suitable number of observations. During sampling, the Kronecker delta function indicates whether an observation belongs to a given category. Adjusting the number of clusters dynamically in this way lets each cluster represent the data distribution effectively, brings the clustering of the precipitation data closer to the distribution of the actual data, and can improve estimation accuracy. After multiple iterations, when the number of clusters and the likelihood function reach the preset precision, the optimal precipitation estimation result is output.
The working principle and the beneficial effects of the technical scheme are as follows:
Bayesian machine learning is an important branch of machine learning. The Bayesian method was first proposed by the British mathematician Thomas Bayes. After more than two hundred years of development, it has become an important component of machine learning and is widely applied in statistical machine learning, for example in multivariate structured output prediction. The basic idea of Bayes' theorem is that, given the prior distribution and the likelihood function of a model, the posterior probability distribution p(Θ|M) of the model follows from Bayes' formula:
p(Θ|M) = p(M|Θ) p_0(Θ) / p(M) (1)
In formula (1), Θ is the probability model parameter; M is the data set, which in this application comprises the radar data set and the automatic station information data set; p_0(Θ) is the prior distribution function of the model; p(M|Θ) is the likelihood function; and p(M) is a constant.
Model selection is a fundamental problem of Bayesian methods. The Dirichlet nonparametric Bayesian model can learn the number of cluster centers automatically from the characteristics of the data.
The Dirichlet nonparametric Bayesian model assumes that G_0 is a random probability distribution on the probability measure space Ω and that the concentration parameter α_0 > 0; the probability distribution G on Ω then follows a Dirichlet process with base distribution G_0:
G ~ DP(α_0, G_0) (2)
where the base distribution G_0 determines the distribution of the basic elements in the model.
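A draw G ~ DP(α_0, G_0) can be visualized with the stick-breaking construction, a standard representation of the Dirichlet process: G is a discrete distribution whose atoms come from G_0 and whose weights are obtained by repeatedly breaking off Beta(1, α_0) fractions of a unit stick. The sketch below truncates the construction at a finite number of atoms; the truncation level, the exponential base distribution and the function name are assumptions for illustration only.

```python
import random


def stick_breaking(alpha0, base_sampler, truncation, rng):
    """Truncated stick-breaking draw from DP(alpha0, G_0).

    Returns (atoms, weights, remaining): atoms are sampled from the base
    distribution G_0, weights are the stick pieces, and remaining is the
    probability mass left beyond the truncation level.
    """
    atoms, weights = [], []
    remaining = 1.0
    for _ in range(truncation):
        fraction = rng.betavariate(1.0, alpha0)   # share of the remaining stick
        weights.append(remaining * fraction)
        atoms.append(base_sampler(rng))
        remaining *= 1.0 - fraction
    return atoms, weights, remaining


rng = random.Random(7)
atoms, weights, leftover = stick_breaking(
    alpha0=2.0, base_sampler=lambda r: r.expovariate(1 / 5.0), truncation=50, rng=rng
)
```

A small α_0 concentrates most of the mass on a few atoms, while a large α_0 spreads it over many atoms so that G resembles G_0 more closely.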
However, a probability distribution drawn from a Dirichlet process is discrete. In order to cluster different groups of data that share a certain similarity, the Dirichlet process mixture model (DPM) (Antoniak, 1974) is introduced, in which a generating probability is attached to each data point and serves as the prior distribution of the data:
$$m_i \sim p(m \mid \theta_i) \tag{3}$$
In formula (3), the parameter θ_i follows the distribution G, i ∈ [N] indexes the data points, each of which is given its own generating probability, and N is the number of data points. When G follows a Dirichlet process, the model is called a Dirichlet process mixture model. The Bayesian method selects the optimal model by comparing the likelihood functions of different families of models:
$$p(M \mid D) = \int_{\Theta} p(M \mid \Theta)\, p(\Theta \mid D)\, d\Theta \tag{4}$$
In formula (4), D denotes a family of models. The integration in (4) helps avoid over-fitting of the model; when no informative prior is available, p(Θ|D) is assumed to be uniform.
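The model comparison in formula (4) can be illustrated with a simple numerical integration. The sketch below is an assumption-laden example, not the application's procedure: the two candidate families (Gaussian with unit variance, exponential), the parameter ranges and the sample values are hypothetical. With a uniform p(Θ|D) on an evenly spaced grid, the integral reduces to an average of the likelihood over the grid.

```python
import numpy as np

def log_marginal_likelihood(data, log_lik, theta_grid):
    """Approximate log p(M|D) for a uniform p(theta|D) on an evenly spaced grid."""
    log_l = np.array([log_lik(data, th) for th in theta_grid])
    peak = log_l.max()                       # subtract the peak for numerical stability
    return peak + np.log(np.mean(np.exp(log_l - peak)))

def gaussian_log_lik(data, mean, sd=1.0):
    return np.sum(-0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((data - mean) / sd) ** 2)

def exponential_log_lik(data, rate):
    return np.sum(np.log(rate) - rate * data)

data = np.array([4.2, 5.1, 4.8, 5.5, 4.9, 5.3])   # hypothetical rain rates in mm/h
log_ev_gauss = log_marginal_likelihood(data, gaussian_log_lik, np.linspace(0.0, 10.0, 1001))
log_ev_expo = log_marginal_likelihood(data, exponential_log_lik, np.linspace(0.01, 2.0, 1001))
best_family = "gaussian" if log_ev_gauss > log_ev_expo else "exponential"
```

Averaging over the whole parameter range, rather than picking the single best-fitting parameter, penalizes families that fit well only in a narrow corner of their parameter space.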
In this study, the observations in the set M = {m_1, m_2, m_3, …, m_n} are assumed to be independent. To obtain the indicator factor of each observation in a non-parametric Bayesian model that uses a Dirichlet process as the prior distribution, Gibbs sampling is used to obtain the likelihood functions of the different families of models and to select the optimal model. The steps are as follows:
(1) Data initialization: the system reads the data and randomly permutes the n observations to obtain {F(i)}, where i = 1, 2, …, n.
(2) Category clustering: set the concentration parameter and the probability measure space to their values at time t-1 (Ω = Ω_{t-1}) and, for each observation i ∈ {F(1), F(2), …, F(n)}, sample the indicator factor β_i of that observation.
The likelihood estimate of the observed data is computed based on the existing K clusters:
$$f_k(m_i) = p(m_i \mid \beta_i = k, M_{\setminus i}, \zeta) \tag{5}$$
In formulas (5) and (6), β_i is the category indicator of observation i, which may denote a new category; M_{\i} denotes the observation data set with the observation of the corresponding index removed; and ζ is the distribution parameter.
β_i is sampled:
In formulas (7) and (8), the count term is the amount of data already in class k. If β_i is assigned to a new class, the number of clusters is increased by one, K = K + 1.
(3) Cluster updating: the amount of observed data in each class is checked. If the total number of observations in a class is 0, that class is deleted and the number of clusters is reduced by one, K = K - 1.
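The three steps above resemble a collapsed Gibbs sampler for a Dirichlet process mixture. The sketch below shows one standard variant of that technique and should not be read as the application's exact formulas, which are not reproduced in the text. It assumes one-dimensional data, Gaussian components with known variance `sigma2`, and a conjugate normal prior on each cluster mean; all names and default values are hypothetical.

```python
import numpy as np

def normal_logpdf(x, mean, var):
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def predictive_logpdf(x, members, sigma2, mu0, tau2):
    """log p(x | cluster members) for a Normal(mean, sigma2) likelihood
    with a conjugate Normal(mu0, tau2) prior on the cluster mean."""
    n = len(members)
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * (mu0 / tau2 + np.sum(members) / sigma2)
    return normal_logpdf(x, post_mean, post_var + sigma2)

def gibbs_dpm(data, alpha0=1.0, sigma2=1.0, mu0=0.0, tau2=100.0, sweeps=20, seed=0):
    rng = np.random.default_rng(seed)
    n = len(data)
    labels = np.zeros(n, dtype=int)              # start with a single cluster
    for _ in range(sweeps):
        for i in rng.permutation(n):             # step (1): random ordering
            labels[i] = -1                       # remove observation i from the data
            active = [k for k in np.unique(labels) if k >= 0]
            log_w = []
            for k in active:                     # existing cluster: n_k * f_k(m_i)
                members = data[labels == k]
                log_w.append(np.log(len(members))
                             + predictive_logpdf(data[i], members, sigma2, mu0, tau2))
            # new cluster: alpha0 * prior predictive density
            log_w.append(np.log(alpha0)
                         + predictive_logpdf(data[i], np.array([]), sigma2, mu0, tau2))
            log_w = np.array(log_w)
            probs = np.exp(log_w - log_w.max())
            probs /= probs.sum()
            choice = rng.choice(len(probs), p=probs)
            if choice == len(active):            # step (2): open a new cluster, K = K + 1
                labels[i] = max(active, default=-1) + 1
            else:
                labels[i] = active[choice]
        # step (3): a cluster that lost its last member is no longer listed
        # in `active`, so empty clusters are dropped implicitly (K = K - 1).
    _, labels = np.unique(labels, return_inverse=True)   # relabel as 0..K-1
    return labels

# Two well-separated synthetic groups (illustrative values only).
data = np.concatenate([np.linspace(-1, 1, 20), np.linspace(19, 21, 20)])
labels = gibbs_dpm(data)
```

The number of clusters is not fixed in advance: it grows whenever an observation is better explained by a fresh cluster and shrinks when a cluster empties, which matches the K = K + 1 and K = K - 1 updates described in steps (2) and (3).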
The inference method of a Bayesian model is an important topic in Bayesian learning. Given the prior distribution, the posterior distribution of a Bayesian model is often intractable, so an efficient inference method is required. In the present application, variational inference is used for prediction.
In quantitative precipitation estimation, given the data set M and the prior distribution model p(M|D), a variational method is used to define an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate. Using Jensen's inequality, a lower bound of the precipitation estimate is obtained:
$$\log p(M) \ge E_q[\log p(\Theta, M)] - E_q[\log q(\Theta)] \tag{9}$$
By maximizing this lower bound, the solution of the quantitative precipitation estimation can be completed.
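For reference, the lower bound in formula (9) follows from Jensen's inequality in a few lines. The derivation below is the standard textbook argument, written with the symbols of formula (9); it assumes q(Θ) > 0 wherever p(Θ, M) > 0.

```latex
\begin{aligned}
\log p(M)
  &= \log \int p(\Theta, M)\, d\Theta
   = \log \int q(\Theta)\, \frac{p(\Theta, M)}{q(\Theta)}\, d\Theta
   = \log E_q\!\left[ \frac{p(\Theta, M)}{q(\Theta)} \right] \\
  &\ge E_q\!\left[ \log \frac{p(\Theta, M)}{q(\Theta)} \right]
   = E_q[\log p(\Theta, M)] - E_q[\log q(\Theta)] .
\end{aligned}
```

The inequality holds because the logarithm is concave, and it becomes an equality exactly when q(Θ) equals the true posterior p(Θ|M).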
To make a quantitative precipitation estimate, actual data related to precipitation must first be collected. The data set consists of a radar data set and an automatic station information data set. The radar data set contains radar reflectivity factor data within the quantitative precipitation estimation region, at a spatial resolution of 1 km × 1 km. Because the radar signal is related to precipitation, these reflectivity data can be used to estimate the precipitation distribution. The automatic station information data set contains the data of the automatic weather stations within the estimation region and records the longitude and latitude of each station together with the precipitation observed at the corresponding time points. These live observations serve as the reference for actual precipitation and are used for verification and auxiliary estimation.
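To make the two data sets concrete, the sketch below pairs one automatic station with the radar grid cell that contains it. Everything here is an assumption made for illustration: the `Station` fields, the grid origin, the local flat-earth conversion from degrees to kilometres and the sample values are hypothetical and are not specified in the application.

```python
import math
from dataclasses import dataclass

KM_PER_DEGREE = 111.195  # approximate length of one degree of latitude

@dataclass
class Station:
    station_id: str
    lat: float       # degrees north
    lon: float       # degrees east
    rain_mm: float   # observed precipitation at the matching time point

def station_to_cell(station, origin_lat, origin_lon, cell_km=1.0):
    """Return (row, col) of the grid cell containing the station, with the
    grid origin at its south-west corner (local flat-earth approximation)."""
    dy_km = (station.lat - origin_lat) * KM_PER_DEGREE
    dx_km = (station.lon - origin_lon) * KM_PER_DEGREE * math.cos(math.radians(origin_lat))
    return math.floor(dy_km / cell_km), math.floor(dx_km / cell_km)

def pair_with_radar(station, reflectivity, origin_lat, origin_lon):
    """Return (reflectivity, rain_mm) for the station, or None if it lies off the grid."""
    row, col = station_to_cell(station, origin_lat, origin_lon)
    if 0 <= row < len(reflectivity) and 0 <= col < len(reflectivity[0]):
        return reflectivity[row][col], station.rain_mm
    return None

grid = [[0.0] * 10 for _ in range(10)]   # 10 km x 10 km toy grid of reflectivity values
grid[5][3] = 35.0
station = Station("G0001", 22.05, 113.53, 3.2)
pair = pair_with_radar(station, grid, origin_lat=22.0, origin_lon=113.5)
```

Pairs of this kind are what allow station observations to act as the reference against which radar-based estimates are verified.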
To infer an accurate precipitation estimate from the acquired data, a prior distribution model is constructed so that prior information about precipitation can be set from the existing data and background knowledge. The Dirichlet process (DP) is a non-parametric Bayesian model that can describe distributions with an unknown number of components. In quantitative precipitation estimation, the Dirichlet process is used to model different types of precipitation processes: it allows a mixture model to be constructed, the data to be divided into classes, and a probability to be assigned to each class. In the present application, the DPM is used to model the prior distribution of precipitation. Specifically, the data points are regarded as being generated from a mixture model in which the parameters of each mixture component are unknown. The distribution is therefore defined by the Dirichlet process, which provides prior distribution information for the precipitation estimate. In the DPM, the prior distribution describes the shape of the precipitation distribution. Once the prior distribution has been constructed, the model can be updated whenever new observations arrive, so that it gradually approaches the actual precipitation distribution.
After the prior distribution model has been established, posterior inference is carried out. Because the posterior distribution cannot be computed directly in practice, it is approximated by variational inference, an optimization method for finding approximate solutions to complex probability distributions. In the precipitation estimation problem, the goal of variational inference is to approximate the complex posterior distribution with a simpler approximate distribution. Specifically, the present application uses variational inference to infer the posterior distribution of precipitation from the prior distribution. During inference, a set of parameters is optimized so as to minimize the KL divergence between the approximate distribution and the true posterior distribution, so that the approximation is as close to the true posterior as possible.
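The link between the KL divergence and the lower bound can be checked numerically on a toy problem. The sketch below uses a parameter with only three possible values; the joint probabilities and the approximate distribution `q` are hypothetical numbers chosen for the example.

```python
import numpy as np

def elbo_and_kl(joint, q):
    """For a discrete parameter, return (lower bound, KL(q || posterior), log evidence)."""
    evidence = joint.sum()                                # p(M)
    posterior = joint / evidence                          # p(theta | M)
    elbo = np.sum(q * (np.log(joint) - np.log(q)))        # E_q[log p(theta, M)] - E_q[log q]
    kl = np.sum(q * (np.log(q) - np.log(posterior)))      # KL(q || p(theta | M))
    return elbo, kl, np.log(evidence)

joint = np.array([0.10, 0.25, 0.05])   # hypothetical p(theta_j, M) for three values of theta
q = np.array([0.2, 0.5, 0.3])          # an arbitrary approximate distribution
elbo, kl, log_evidence = elbo_and_kl(joint, q)
```

Since the lower bound and the KL divergence always add up to log p(M), which does not depend on q, minimizing the KL divergence and maximizing the lower bound are the same optimization.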
To further optimize the estimation result and ensure the validity of the model for the given data, the present application uses Jensen's inequality to compute a lower bound. Jensen's inequality is a mathematical tool used in Bayesian inference that makes it possible to approximate an objective function by optimizing a lower bound on it. In the present application, Jensen's inequality is used to establish the lower bound of the precipitation estimate. Computing the lower bound keeps the estimate of the posterior distribution from deviating too far from the true value, and maximizing this lower bound makes it possible to find the optimal precipitation estimate within the space of candidate solutions.
After the lower bound has been calculated, it is maximized. Maximizing the lower bound yields a quantitative precipitation estimate that is as accurate as possible. Specifically, for a given data set, an objective function representing the magnitude of the lower bound is derived from variational inference and Jensen's inequality. The objective function is then optimized to maximize the lower bound, which gives the optimal precipitation estimation result. In this process it is usually necessary to evaluate the likelihood function of the data points and to make the precipitation estimate by selecting an optimal model. The likelihood function describes the probability of the observed data given the model parameters. In precipitation estimation, the likelihood function measures how well the model predictions agree with the precipitation data, and maximizing it allows precipitation to be estimated more accurately. Because several different precipitation types often occur in real data, the data are grouped by a clustering method, so that the distribution characteristics of the different precipitation types can be better understood and more accurate estimates can be obtained through sampling.
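Lower-bound maximization can be sketched on a model simple enough to have a known exact answer. The example below is an illustration under stated assumptions, not the application's optimizer: it assumes observations m_i ~ Normal(θ, 1), a prior θ ~ Normal(0, `prior_var`), a Gaussian approximate distribution q(θ) = Normal(mu, s2), and plain gradient ascent with a step size suited to a handful of data points. All names and values are hypothetical.

```python
import numpy as np

def fit_gaussian_q(data, prior_var=100.0, lr=0.05, steps=2000):
    """Maximize the lower bound by gradient ascent over (mu, log s2). Returns (mu, s2)."""
    n = len(data)
    mu, rho = 0.0, 0.0                       # rho = log(s2) keeps the variance positive
    for _ in range(steps):
        s2 = np.exp(rho)
        # Gradients of E_q[log p(theta, M)] - E_q[log q(theta)] for this model:
        grad_mu = np.sum(data - mu) - mu / prior_var
        grad_rho = 0.5 - 0.5 * s2 * (n + 1.0 / prior_var)
        mu += lr * grad_mu
        rho += lr * grad_rho
    return mu, np.exp(rho)

data = np.array([4.2, 5.1, 4.8, 5.5, 4.9, 5.3])   # hypothetical observations
mu, s2 = fit_gaussian_q(data)

# Exact conjugate posterior, used here only to check the optimum.
exact_var = 1.0 / (len(data) + 1.0 / 100.0)
exact_mean = data.sum() * exact_var
```

In this conjugate case the approximating family contains the true posterior, so the maximizer of the lower bound coincides with it; in realistic models the optimum is only the closest member of the chosen family.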
Through the above steps, the quantitative precipitation estimation result is finally obtained. The whole process combines Bayesian inference, variational inference and Jensen's inequality to support the accuracy and reliability of the estimation result. By means of the prior distribution model, variational inference and lower-bound maximization, the model is continuously optimized to approach the real precipitation amount, finally achieving high-precision precipitation estimation. The application has broad prospects in the meteorological field, in particular for extreme weather early warning and climate research.
The quantitative precipitation estimation method and system of the invention use a quantitative precipitation estimation algorithm based on non-parametric Bayesian machine learning and on long-term, massive observation data. The algorithm combines 9 elements, such as radar reflectivity factor, automatic station information, radar-retrieved wind field and weather type, to constrain the radar quantitative precipitation estimation algorithm. First, a radar data set and an automatic station live observation data set are constructed for the quantitative precipitation estimation region, and a prior distribution model is constructed from the data set. Based on variational inference, an approximate distribution of the posterior distribution of the precipitation estimate is then defined according to the prior distribution model. Based on Jensen's inequality, a lower bound of the precipitation estimate is determined according to the approximate distribution. Finally, the lower bound is maximized to obtain the quantitative precipitation estimation result, thereby improving the operational performance of quantitative precipitation estimation.
The quantitative precipitation estimation system provided by an embodiment of the invention, as shown in Fig. 2, comprises:
an acquisition module 1, configured to acquire a data set M;
a construction module 2, configured to construct a prior distribution model p(M|D) based on the data set M, where D is a family of models;
a definition module 3, configured to define, based on variational inference and according to the prior distribution model p(M|D), an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate, where Θ denotes the probability model parameters;
a determining module 4, configured to determine, based on Jensen's inequality and according to the approximate distribution q(Θ), the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate, where E_q is the interval of the precipitation estimation distribution;
a maximizing module 5, configured to maximize the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate to obtain the quantitative precipitation estimation result.
The data set M comprises at least a radar data set of the quantitative precipitation estimation region and an automatic station information data set, wherein the radar data set is 1 km × 1 km radar reflectivity factor grid data within the quantitative precipitation estimation region, and the automatic station information data set contains the longitude and latitude coordinates and the live precipitation observations of each automatic station within the quantitative precipitation estimation region.
Constructing the prior distribution model p(M|D) based on the data set M comprises the following.
In the Dirichlet non-parametric Bayesian model, G_0 is the base probability distribution on the probability measure space Ω and the concentration parameter α_0 > 0. If the probability distribution G on the probability measure space Ω follows the base probability distribution G_0, then:
$$G \sim \mathrm{DP}(\alpha_0, G_0)$$
where the base probability distribution G_0 determines the distribution of the basic constituent elements in the prior distribution model p(M|D), and DP is the Dirichlet process.
Based on the Dirichlet process mixture model DPM, a generating probability is added to each data point in the quantitative precipitation estimation region as the prior distribution of the data:
$$m_i \sim p(m \mid \theta_i)$$
where the parameter θ_i follows the probability distribution G; i ∈ [N] takes values in the set from 1 to the total number of data points N, and each data point generates one probability parameter; N is the total number of data points; p is the conditional probability density function; and m is the precipitation estimate of the data point.
The optimal model is selected as the prior distribution model p(M|D) by comparing the likelihood functions of the different families of prior distribution probability models m_i ~ p(m|θ_i):
$$p(M \mid D) = \int_{\Theta} p(M \mid \Theta)\, p(\Theta \mid D)\, d\Theta$$
where D denotes a family of models.
Selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of the different families of models comprises the following.
The data of the data set M = {m_1, m_2, m_3, …, m_n} are assumed to be independent. The live precipitation observations of the n automatic stations in the quantitative precipitation estimation region are read and randomly permuted to obtain {F(i)}, where i = 1, 2, …, n.
The concentration parameter is set to its value at time t-1 and Ω = Ω_{t-1}, and the indicator factor β_i is sampled for each automatic station live precipitation observation i ∈ {F(1), F(2), …, F(n)} among the n randomly permuted observations formed by {F(i)}, where Ω_{t-1} is the probability measure space at time t-1.
The likelihood estimate f_k(m_i) of the observed data is computed based on the current K clusters:
$$f_k(m_i) = p(m_i \mid \beta_i = k, M_{\setminus i}, \zeta)$$
where β_i is a new category; M_{\i} is the data set M = {m_1, m_2, m_3, …, m_n} with the data of the corresponding index removed; ζ is the distribution parameter; k is an observation value; m_i is a random variable; and a further index, analogous to k, denotes another observation that is not k.
β_i is sampled:
In the formula, the count term is the amount of data already in class k; E_i is a preset observation data sample; K is the number of observation data samples; f_k denotes the probability density function for class k; δ is the Kronecker delta function, with δ(β_i, k) = 1 when β_i = k and 0 otherwise; and a corresponding term denotes the probability density function for the classes other than class k.
If β_i is assigned to a new category, the number of clusters K is increased by 1, K = K + 1.
The amount of observed data used to compute the likelihood function of each cluster is checked. If the total number of observations in a class is 0, the corresponding class is deleted and the number of clusters K is reduced by 1, K = K - 1.
An embodiment of the invention provides a computer-readable storage medium on which a computer program is stored; when a processor executes the computer program, the method of any one of the above embodiments is implemented.
An embodiment of the invention provides an electronic device comprising a memory and a processor, wherein the memory stores a computer program and the processor executes the computer program to implement the method of any one of the above embodiments.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A quantitative precipitation estimation method, characterized by comprising:
acquiring a data set M;
constructing a prior distribution model p(M|D) based on the data set M, where D is a family of models;
defining, based on variational inference and according to the prior distribution model p(M|D), an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate, where Θ denotes the probability model parameters;
determining, based on Jensen's inequality and according to the approximate distribution q(Θ), the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate, where E_q is the distribution interval of the precipitation estimation data;
maximizing the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate to obtain the quantitative precipitation estimation result.

2. The quantitative precipitation estimation method according to claim 1, characterized in that the data set M comprises at least a radar data set of the quantitative precipitation estimation region and an automatic station information data set, wherein the radar data set is 1 km × 1 km radar reflectivity factor grid data within the quantitative precipitation estimation region, and the automatic station information data set contains the longitude and latitude coordinates and the live precipitation observations of each automatic station within the quantitative precipitation estimation region.

3. The quantitative precipitation estimation method according to claim 1, characterized in that constructing the prior distribution model p(M|D) based on the data set M comprises:
in the Dirichlet non-parametric Bayesian model, G_0 is the base probability distribution on the probability measure space Ω and the concentration parameter α_0 > 0; if the probability distribution G on the probability measure space Ω follows the base probability distribution G_0, then:
$$G \sim \mathrm{DP}(\alpha_0, G_0)$$
where the base probability distribution G_0 determines the distribution of the basic constituent elements in the prior distribution model p(M|D), and DP is the Dirichlet process;
based on the Dirichlet process mixture model DPM, adding a generating probability to each data point in the quantitative precipitation estimation region as the prior distribution of the data:
$$m_i \sim p(m \mid \theta_i)$$
where the parameter θ_i follows the probability distribution G; i ∈ [N] takes values in the set from 1 to the total number of data points N, and each data point generates one probability parameter; N is the total number of data points; p is the conditional probability density function; and m is the precipitation estimate of the data point;
selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of the different families of prior distribution probability models m_i ~ p(m|θ_i):
$$p(M \mid D) = \int_{\Theta} p(M \mid \Theta)\, p(\Theta \mid D)\, d\Theta$$
where D denotes a family of models.

4. The quantitative precipitation estimation method according to claim 1, characterized in that selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of the different families of models comprises:
assuming that the data of the data set M = {m_1, m_2, m_3, …, m_n} are independent, reading the live precipitation observations of the n automatic stations in the quantitative precipitation estimation region, and randomly permuting them to obtain {F(i)}, where i = 1, 2, …, n;
setting the concentration parameter to its value at time t-1 and Ω = Ω_{t-1}, and sampling the indicator factor β_i for each automatic station live precipitation observation i ∈ {F(1), F(2), …, F(n)} among the n randomly permuted observations formed by {F(i)}, where Ω_{t-1} is the probability measure space at time t-1;
computing the likelihood estimate f_k(m_i) of the observed data based on the current K clusters:
$$f_k(m_i) = p(m_i \mid \beta_i = k, M_{\setminus i}, \zeta)$$
where β_i is a new category; M_{\i} is the data set M = {m_1, m_2, m_3, …, m_n} with the data of the corresponding index removed; ζ is the distribution parameter; k is an observation value; m_i is a random variable; and a further index, analogous to k, denotes another observation that is not k;
sampling β_i, where, in the sampling formula, the count term is the amount of data already in class k; E_i is a preset observation data sample; K is the number of observation data samples; f_k denotes the probability density function for class k; δ is the Kronecker delta function, with δ(β_i, k) = 1 when β_i = k and 0 otherwise; and a corresponding term denotes the probability density function for the classes other than class k;
if β_i is assigned to a new category, increasing the number of clusters K by 1, K = K + 1;
checking the amount of observed data used to compute the likelihood function of each cluster and, if the total number of observations in a class is 0, deleting the corresponding class and reducing the number of clusters K by 1, K = K - 1.

5. A quantitative precipitation estimation system, characterized by comprising:
an acquisition module, configured to acquire a data set M;
a construction module, configured to construct a prior distribution model p(M|D) based on the data set M, where D is a family of models;
a definition module, configured to define, based on variational inference and according to the prior distribution model p(M|D), an approximate distribution q(Θ) of the posterior distribution of the precipitation estimate, where Θ denotes the probability model parameters;
a determining module, configured to determine, based on Jensen's inequality and according to the approximate distribution q(Θ), the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate, where E_q is the interval of the precipitation estimation distribution;
a maximizing module, configured to maximize the lower bound E_q[log p(Θ, M)] - E_q[log q(Θ)] of the precipitation estimate to obtain the quantitative precipitation estimation result.

6. The quantitative precipitation estimation system according to claim 5, characterized in that the data set M comprises at least a radar data set of the quantitative precipitation estimation region and an automatic station information data set, wherein the radar data set is 1 km × 1 km radar reflectivity factor grid data within the quantitative precipitation estimation region, and the automatic station information data set contains the longitude and latitude coordinates and the live precipitation observations of each automatic station within the quantitative precipitation estimation region.

7. The quantitative precipitation estimation system according to claim 5, characterized in that constructing the prior distribution model p(M|D) based on the data set M comprises:
in the Dirichlet non-parametric Bayesian model, G_0 is the base probability distribution on the probability measure space Ω and the concentration parameter α_0 > 0; if the probability distribution G on the probability measure space Ω follows the base probability distribution G_0, then:
$$G \sim \mathrm{DP}(\alpha_0, G_0)$$
where the base probability distribution G_0 determines the distribution of the basic constituent elements in the prior distribution model p(M|D), and DP is the Dirichlet process;
based on the Dirichlet process mixture model DPM, adding a generating probability to each data point in the quantitative precipitation estimation region as the prior distribution of the data:
$$m_i \sim p(m \mid \theta_i)$$
where the parameter θ_i follows the probability distribution G; i ∈ [N] takes values in the set from 1 to the total number of data points N, and each data point generates one probability parameter; N is the total number of data points; p is the conditional probability density function; and m is the precipitation estimate of the data point;
selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of the different families of prior distribution probability models m_i ~ p(m|θ_i):
$$p(M \mid D) = \int_{\Theta} p(M \mid \Theta)\, p(\Theta \mid D)\, d\Theta$$
where D denotes a family of models.

8. The quantitative precipitation estimation system according to claim 5, characterized in that selecting the optimal model as the prior distribution model p(M|D) by comparing the likelihood functions of the different families of models comprises:
assuming that the data of the data set M = {m_1, m_2, m_3, …, m_n} are independent, reading the live precipitation observations of the n automatic stations in the quantitative precipitation estimation region, and randomly permuting them to obtain {F(i)}, where i = 1, 2, …, n;
setting the concentration parameter to its value at time t-1 and Ω = Ω_{t-1}, and sampling the indicator factor β_i for each automatic station live precipitation observation i ∈ {F(1), F(2), …, F(n)} among the n randomly permuted observations formed by {F(i)}, where Ω_{t-1} is the probability measure space at time t-1;
computing the likelihood estimate f_k(m_i) of the observed data based on the current K clusters:
$$f_k(m_i) = p(m_i \mid \beta_i = k, M_{\setminus i}, \zeta)$$
where β_i is a new category; M_{\i} is the data set M = {m_1, m_2, m_3, …, m_n} with the data of the corresponding index removed; ζ is the distribution parameter; k is an observation value; m_i is a random variable; and a further index, analogous to k, denotes another observation that is not k;
sampling β_i, where, in the sampling formula, the count term is the amount of data already in class k; E_i is a preset observation data sample; K is the number of observation data samples; f_k denotes the probability density function for class k; δ is the Kronecker delta function, with δ(β_i, k) = 1 when β_i = k and 0 otherwise; and a corresponding term denotes the probability density function for the classes other than class k;
if β_i is assigned to a new category, increasing the number of clusters K by 1, K = K + 1;
checking the amount of observed data used to compute the likelihood function of each cluster and, if the total number of observations in a class is 0, deleting the corresponding class and reducing the number of clusters K by 1, K = K - 1.

9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and a processor executes the computer program to implement the method according to any one of claims 1 to 4.

10. An electronic device, characterized in that the electronic device comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the method according to any one of claims 1 to 4.
CN202510094673.6A 2025-01-21 2025-01-21 A quantitative precipitation estimation method, system, storage medium and electronic device Pending CN120013004A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510094673.6A CN120013004A (en) 2025-01-21 2025-01-21 A quantitative precipitation estimation method, system, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN120013004A true CN120013004A (en) 2025-05-16

Family

ID=95660154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510094673.6A Pending CN120013004A (en) 2025-01-21 2025-01-21 A quantitative precipitation estimation method, system, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN120013004A (en)

Similar Documents

Publication Publication Date Title
CN117116382B (en) Method and system for spatial and temporal prediction of water quality of lakes affected by water diversion projects
CN108428017B (en) Wind power interval prediction method based on kernel extreme learning machine quantile regression
CN114254561A (en) Waterlogging prediction method, waterlogging prediction system and storage medium
CN110705760A (en) A photovoltaic power generation power prediction method based on deep belief network
CN110597873A (en) Precipitation data estimation method, device, equipment and storage medium
CN116454875A (en) Method and system for medium-term power probability prediction of regional wind farms based on cluster division
CN107886160B (en) BP neural network interval water demand prediction method
CN117526274A (en) New energy power prediction method, electronic equipment and storage medium in extreme climate
CN117078048A (en) Digital twinning-based intelligent city resource management method and system
CN117408394B (en) Carbon emission factor prediction method and device for electric power system and electronic equipment
CN117787110A (en) Soil moisture inversion method and system based on deep learning model
CN117521907A (en) Photovoltaic power generation power interval prediction method considering photovoltaic output and meteorological elements
Ferro A probability model for verifying deterministic forecasts of extreme events
CN115169089B (en) Wind power probability prediction method and device based on kernel density estimation and copula
CN119782859A (en) A monitoring method and system for water conservancy projects
CN113095579B (en) Daily-scale rainfall forecast correction method coupled with Bernoulli-gamma-Gaussian distribution
CN110942196B (en) Predicted irradiation correction method and device
CN120013004A (en) A quantitative precipitation estimation method, system, storage medium and electronic device
CN110929849B (en) Video detection method and device based on neural network model compression
CN115525872B (en) Two-step Bayesian estimation method for building scale population fused with position data
CN117609756A (en) A non-uniform hydrological sequence reconstruction method based on regional characteristics
CN113886360B (en) Data table partitioning method, device, computer readable medium and electronic equipment
CN113868939A (en) A method, device, equipment and medium for probability density evaluation of wind power
WO2022217568A1 (en) Daily precipitation forecast correction method coupled with bernoulli-gamma-gaussian distributions
CN117271915B (en) Space sampling point planning method considering space-time variability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination