Summary of the invention
The present invention addresses two shortcomings of existing load prediction research: the influence on load of changes in external conditions such as weather is not considered, and no similar-day selection is carried out. It proposes a short-term load forecasting method for public buildings that improves forecasting performance and the validity of the prediction.
The technical solution of the present invention is as follows: a short-term load forecasting method for public buildings, comprising the following steps:
1) at each acquisition moment set for every day, collecting time series of public building load power data and meteorological data;
2) selecting similar days based on mutual information:
2.1) data normalization: normalizing the time series of public building load power data and of each meteorological influence factor collected in step 1), to obtain the processed load power sequence X and the data sequence of each meteorological influence factor, where the data sequence of the i-th meteorological influence factor is Yi;
2.2) mutual information calculation: calculating the mutual information value between the load power sequence X and each influence factor sequence; the larger the value, the more important the meteorological factor;
2.3) similar-day selection: taking the data sequence of the key meteorological influence factor selected in step 2.2) as the reference, determining similarity by Euclidean distance and selecting the similar days for load power prediction;
3) prediction with the ARIMA-BP hybrid model:
3.1) determining the parameters of the ARIMA time series model by autocorrelation analysis and partial autocorrelation analysis of the load power sequence of the selected similar days;
3.2) inputting the load power sequence of the similar days into the ARIMA time series model to predict the load power of the prediction day, obtaining the prediction result of the ARIMA time series model;
3.3) taking as inputs of the BP neural network prediction model the points of the similar-day load power sequence whose autocorrelation coefficient, obtained by autocorrelation analysis, exceeds a set value, the key meteorological influence factor sequence of the similar days, the load power values predicted by the ARIMA time series model, and the key influence factor sequence values of the prediction day, thereby forming the ARIMA-BP hybrid model and carrying out load power prediction;
3.4) calculating and analysing the average prediction error of the ARIMA-BP hybrid model prediction result; if the error is within the accepted range, the resulting ARIMA-BP hybrid model is used for short-term load forecasting of public buildings.
The mutual information calculation in step 2.2) is as follows. The load power sequence and the meteorological influence factors are all time series. For the load power sequence X and the i-th meteorological influence factor data sequence Yi, let their joint probability density function be pX,Yi(x(t),yi(t)), i=1,2,3,...,l, where l is the total number of influence factors. The marginal probability distributions of the load sequence and of the i-th factor are then pX(x(t)) and pYi(yi(t)) respectively:
$$p_X(x(t)) = \int p_{X,Y_i}(x(t), y_i(t))\,dy_i(t), \qquad p_{Y_i}(y_i(t)) = \int p_{X,Y_i}(x(t), y_i(t))\,dx(t)$$
where x(t) is the sampled value of the load power sequence at time t, and yi(t) is the sampled value of the i-th influence factor data sequence at time t;
The information entropy of the load power sequence and the information entropy of the i-th meteorological influence factor are H(X) and H(Yi) respectively:
$$H(X) = -\int p_X(x(t)) \log p_X(x(t))\,dx(t), \qquad H(Y_i) = -\int p_{Y_i}(y_i(t)) \log p_{Y_i}(y_i(t))\,dy_i(t)$$
The joint entropy of the load power sequence X and the i-th influence factor sequence Yi is:
$$H(X, Y_i) = -\iint p_{X,Y_i}(x(t), y_i(t)) \log p_{X,Y_i}(x(t), y_i(t))\,dx(t)\,dy_i(t)$$
The mutual information I(X,Yi) of the load power sequence X and the i-th meteorological influence factor sequence Yi is calculated as:
$$I(X, Y_i) = H(X) + H(Y_i) - H(X, Y_i)$$
where I(X,Yi) is the mutual information value of the meteorological influence factor with respect to the load power sequence; the larger the value, the more important the i-th meteorological factor, that is, the greater its influence on load power.
The beneficial effects of the present invention are as follows. The public building short-term load forecasting method of the present invention takes into account the influence on load of changes in external conditions such as weather condition, air temperature and apparent temperature, and uses the mutual information method to pick out the meteorological factor with the greatest influence on load. On the basis of this key influence factor, similar days are selected by Euclidean distance and sorted by Euclidean distance from small to large, which improves forecasting performance and the validity of the prediction. A first load prediction is made with ARIMA; its predicted values, together with the load values of the similar days and the key influence factor values of the similar days and of the prediction day, are used as inputs of the BP neural network to form the ARIMA-BP hybrid model, which makes a second load prediction. The prediction result is well founded and accurate.
Specific embodiment
The flow chart of the method of the present invention is shown in Fig. 1. The method specifically comprises the following steps:
Step one: over a period of one year, at each acquisition moment set for every day, time series of public building load power data and meteorological data are collected;
Step two: similar-day selection based on mutual information
Meteorological factors are one of the main causes of short-term changes in electric load, and load prediction usually takes meteorological factors as the main basis for choosing similar days. Different meteorological factors influence the load to very different degrees, so key influence factors need to be selected in order to improve the quality of similar-day selection.
1. Data normalization
The meteorological factors considered are mainly temperature, rainfall, wind speed, air pressure, relative humidity and apparent temperature. Because the factors have different dimensions, the data must be normalized to remove the dimensions. Maximum-minimum normalization is used to convert each meteorological factor value yi(t)0 into a value yi(t) on the interval [0,1], calculated as follows:
$$y_i(t) = \frac{y_i(t)_0 - \min y_i(t)}{\max y_i(t) - \min y_i(t)}$$
where max yi(t) is the maximum value in the sample data of each meteorological influence factor sequence, min yi(t) is the minimum value in the sample data of each meteorological influence factor sequence, and max yi(t) - min yi(t) is the range of the sample data of each meteorological influence factor sequence.
In addition to the meteorological variables above, the weather condition is also considered. In the Shanghai area the weather conditions are mainly sunny, cloudy, overcast, rain and sleet, each of which is quantized onto the interval [0,1]: sunny is quantized as 1, cloudy as 0.9, overcast as 0.8, light rain as 0.7, moderate rain as 0.6, heavy rain as 0.5 and sleet as 0.4.
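The quantization above amounts to a lookup table. A minimal sketch in Python follows; the scores are the ones given in the text, while the English category labels and the function name `quantize_weather` are this sketch's own assumptions.

```python
# Scores are taken from the text; the English labels are assumed names.
WEATHER_SCORE = {
    "sunny": 1.0,
    "cloudy": 0.9,
    "overcast": 0.8,
    "light rain": 0.7,
    "moderate rain": 0.6,
    "heavy rain": 0.5,
    "sleet": 0.4,
}


def quantize_weather(conditions):
    """Map a sequence of weather-condition labels to their scores on [0, 1]."""
    return [WEATHER_SCORE[c] for c in conditions]


# Hypothetical daily conditions, for illustration only.
print(quantize_weather(["sunny", "overcast", "sleet"]))
```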
At the same time, the load power sequence data must be normalized in the same way as the meteorological influence factors, converting the load power sequence data x(t)0 into a value x(t) on the interval [0,1], calculated as follows:
$$x(t) = \frac{x(t)_0 - \min x(t)}{\max x(t) - \min x(t)}$$
where max x(t) is the maximum value in the load power sequence sample data, min x(t) is the minimum value in the load power sequence sample data, and max x(t) - min x(t) is the range of the load power sequence sample data.
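As an illustration of the maximum-minimum normalization applied to both the meteorological factors and the load power, a minimal Python sketch is given here. The function name, the sample values and the handling of a constant sequence are assumptions of the sketch, not part of the described method.

```python
def min_max_normalize(values):
    """Scale a sequence onto [0, 1] using its own minimum and range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant sequence has zero range, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical samples, for illustration only.
temperature = [24.0, 26.0, 30.0, 28.0]      # degrees Celsius
load_power = [100.0, 150.0, 300.0, 200.0]   # kW

print(min_max_normalize(temperature))
print(min_max_normalize(load_power))
```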
2. Mutual information calculation
In information theory, mutual information expresses how much information about one system is contained in another. Applied to the selection of related factors, it characterizes the strength of the interdependence between a related factor and the load. The larger the mutual information value, the more information the factor shares with the load and the greater its influence on the load.
The load power sequence and the meteorological influence factors are all time series. For the load power sequence X and the i-th meteorological influence factor data sequence Yi (i=1,2,3,...,l, where l is the total number of influence factors), let their joint probability density function be pX,Yi(x(t),yi(t)). The marginal probability distributions of the load sequence and of the i-th factor are then pX(x(t)) and pYi(yi(t)) respectively:
$$p_X(x(t)) = \int p_{X,Y_i}(x(t), y_i(t))\,dy_i(t), \qquad p_{Y_i}(y_i(t)) = \int p_{X,Y_i}(x(t), y_i(t))\,dx(t)$$
where x(t) is the sampled value of the load power sequence at time t, and yi(t) is the sampled value of the i-th influence factor data sequence at time t.
The information entropy of the load power sequence and the information entropy of the i-th meteorological influence factor are H(X) and H(Yi) respectively:
$$H(X) = -\int p_X(x(t)) \log p_X(x(t))\,dx(t), \qquad H(Y_i) = -\int p_{Y_i}(y_i(t)) \log p_{Y_i}(y_i(t))\,dy_i(t)$$
The joint entropy of the load power sequence X and the i-th influence factor sequence Yi is:
$$H(X, Y_i) = -\iint p_{X,Y_i}(x(t), y_i(t)) \log p_{X,Y_i}(x(t), y_i(t))\,dx(t)\,dy_i(t)$$
The mutual information I(X,Yi) of the load power sequence X and the i-th meteorological influence factor sequence Yi is calculated as:
$$I(X, Y_i) = H(X) + H(Y_i) - H(X, Y_i)$$
where I(X,Yi) is the mutual information value of the meteorological influence factor with respect to the load power sequence; the larger the value, the more important the i-th meteorological factor, that is, the greater its influence on load power. The mutual information calculation results for the key meteorological factors are shown in Table 1.
Table 1
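The relation I(X,Yi) = H(X) + H(Yi) - H(X,Yi) can be estimated from sampled data. The following Python sketch assumes that the sequences are already normalized onto [0,1] and approximates the probability distributions by histogram counts with equal-width bins. The binning, the bin count and the function names are assumptions of this sketch; the text does not state how the distributions are estimated.

```python
import math
from collections import Counter


def _entropy(counts, total):
    """Shannon entropy (in nats) of an empirical distribution given as counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def _bin_index(value, n_bins):
    """Map a value in [0, 1] to a bin index in 0..n_bins-1."""
    return min(int(value * n_bins), n_bins - 1)


def mutual_information(x, y, n_bins=10):
    """Histogram estimate of I(X, Y) = H(X) + H(Y) - H(X, Y)."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    total = len(x)
    bx = [_bin_index(v, n_bins) for v in x]
    by = [_bin_index(v, n_bins) for v in y]
    h_x = _entropy(Counter(bx).values(), total)
    h_y = _entropy(Counter(by).values(), total)
    h_xy = _entropy(Counter(zip(bx, by)).values(), total)
    return h_x + h_y - h_xy


def pick_key_factor(load, factors, n_bins=10):
    """Return the factor name with the largest mutual information, plus all scores."""
    scores = {name: mutual_information(load, seq, n_bins)
              for name, seq in factors.items()}
    return max(scores, key=scores.get), scores
```

With this sketch, `pick_key_factor(load, {"temperature": temp, "humidity": hum})` would return the name of the factor sharing the most information with the load; the factor names here are hypothetical.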
3. Similar-day selection
Suppose the key meteorological influence factor picked out is the m-th influence factor. Once it has been picked out, the similar days for load prediction are selected with its data sequence Ym=(ym(t1),ym(t2),…,ym(tn)) as the reference. Euclidean distance is one of the most commonly used similarity measures and has the advantages of being simple and fast to compute. The similarity between day j and the prediction day is calculated by Euclidean distance as follows:
$$d(Y_{mj}, Y_{m0}) = \sqrt{\sum_{k=1}^{n} \left( y_{mj}(t_k) - y_{m0}(t_k) \right)^2}$$
where Ymj is the normalized key influence factor sequence vector of day j, Ym0 is the normalized key influence factor sequence vector of the prediction day, ymj(tk) is the sampled value of the key influence factor at the k-th moment of day j, ym0(tk) is the sampled value of the key influence factor at the k-th moment of the prediction day, and tk denotes the k-th moment. Each sequence vector contains n moments, i.e. there are n prediction points.
The similarity between every historical day in the sample and the prediction day is calculated, and the 30 days with the smallest Euclidean distance are picked out as similar days and arranged in descending order of similarity: the smaller the Euclidean distance to the prediction day, the higher the similarity. The similar-day selection results for August 8, 2016 are shown in Table 2.
Table 2
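The similar-day selection can be sketched in Python as follows. The sketch ranks historical days by Euclidean distance to the prediction day and keeps the closest ones, most similar first. The function names, the dictionary layout of the history and the sample data are assumptions for illustration; `top_k=30` follows the text.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_similar_days(history, prediction_day, top_k=30):
    """Return up to top_k (day, distance) pairs, smallest distance first.

    history maps a day label to that day's normalized key-factor sequence.
    """
    ranked = sorted(
        (euclidean_distance(seq, prediction_day), day)
        for day, seq in history.items()
    )
    return [(day, dist) for dist, day in ranked[:top_k]]


# Hypothetical two-point sequences, for illustration only.
history = {"d1": [0.1, 0.2], "d2": [0.5, 0.5], "d3": [0.9, 0.9]}
print(select_similar_days(history, [0.5, 0.6], top_k=2))
```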
Step three: formation of the ARIMA-BP hybrid model
(1) ARIMA-BP hybrid model
A time series can be decomposed into a linear component and a nonlinear component. The ARIMA (autoregressive integrated moving average) model is a time series analysis model and is linear, whereas the BP neural network is a nonlinear model. Adding the load power prediction result of the ARIMA model to the input variables of the BP neural network forms an ARIMA-BP hybrid model that can learn the characteristics of the sample data more fully, thereby improving the precision of load prediction.
Different prediction modes place different requirements on the input sample points. Two prediction modes are mainly considered: direct prediction and rolling prediction. Autocorrelation analysis is carried out on the load power sequence of the selected similar days, and the points whose autocorrelation coefficient (ACF) exceeds 0.8 are taken as input sample points.
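The choice of input sample points by autocorrelation can be sketched as below. The sketch uses the standard sample autocorrelation coefficient and returns the lags whose coefficient exceeds the 0.8 threshold stated in the text. The function names and the synthetic 24-hour periodic series are assumptions for illustration only.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        return 0.0  # Assumption: treat a constant series as uncorrelated.
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


def lags_above_threshold(series, max_lag, threshold=0.8):
    """Lags in 1..max_lag whose autocorrelation exceeds the threshold."""
    return [k for k in range(1, max_lag + 1)
            if autocorrelation(series, k) > threshold]


# Hypothetical hourly series with a 24-hour period, 30 days long.
demo = [math.sin(2 * math.pi * h / 24) for h in range(24 * 30)]
print(lags_above_threshold(demo, max_lag=30))
```

For a strongly daily-periodic load, lags near 1 and near 24 would be expected to pass the threshold, which is consistent with the inputs x(t-1), x(t-2), x(t-23) and x(t-24) used in the text.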
When the ARIMA-BP hybrid model is used to predict the load of the prediction day in direct prediction mode, a prediction is first made with the ARIMA time series model, and the resulting load prediction for the prediction day is used as one of the input variables of the ARIMA-BP hybrid model. To predict the load power sequence value x(t) of the prediction day at time t (t being an on-the-hour moment), the inputs required are the similar-day load power sequence samples x(t-23) and x(t-24) and the sample x(t)* of the prediction-day load power sequence obtained by the ARIMA time series prediction model. In addition, the key meteorological influence factor sequence Ym of the similar days is used as another group of input samples. Because the meteorological data of the prediction day are known, the key influence factor sequence Ym* of the prediction day is also added to the input samples; the meteorological influence factor samples input are ym(t)*, ym(t-1)*, ym(t-2)*, …, ym(t-24)*.
In rolling prediction mode, the load sequence of the prediction day is likewise first predicted by the ARIMA time series model, and the prediction result is used as one of the input variables of the ARIMA-BP hybrid model. To predict the load power sequence value x(t) of the prediction day at time t (t being an on-the-hour moment), the inputs required are the similar-day load power sequence samples x(t-1), x(t-2), x(t-23) and x(t-24) and the sample x(t)* of the prediction-day load power sequence obtained by the ARIMA time series prediction model. In addition, the key meteorological influence factor sequence Ym of the similar days is used as another group of input samples. Because the meteorological data of the prediction day are known, the key influence factor sequence Ym* of the prediction day is also added to the input samples; the meteorological influence factor samples input are:
ym(t)*, ym(t-1)*, ym(t-2)*, …, ym(t-24)*.
The ARIMA-BP hybrid model is run in both direct prediction mode and rolling prediction mode to predict the load of the prediction day; the prediction effects of the two modes are compared, and the better mode is selected according to the actual situation.
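The difference between the two modes lies only in which load lags are fed to the network. A minimal Python sketch of assembling one input vector follows. It rests on several assumptions that the text does not settle: all sequences are hourly lists sharing one index, t is at least 24, and the similar-day factor sequence is sampled over the same 25-point window (lags 0 to 24) as the prediction-day factor. The function and parameter names are hypothetical.

```python
def build_input_vector(similar_load, arima_forecast, similar_factor,
                       prediction_factor, t, mode="direct"):
    """Assemble the BP-network input for prediction time index t."""
    if mode == "direct":
        load_lags = (23, 24)          # x(t-23), x(t-24)
    elif mode == "rolling":
        load_lags = (1, 2, 23, 24)    # x(t-1), x(t-2), x(t-23), x(t-24)
    else:
        raise ValueError("mode must be 'direct' or 'rolling'")
    if t < 24:
        raise ValueError("t must be at least 24 so that lag 24 exists")

    features = [similar_load[t - lag] for lag in load_lags]
    features.append(arima_forecast[t])                       # x(t)*
    # Assumption: similar-day factor uses the same window as the prediction day.
    features.extend(similar_factor[t - lag] for lag in range(25))
    features.extend(prediction_factor[t - lag] for lag in range(25))  # ym(t)*..ym(t-24)*
    return features
```

Under these assumptions the direct-mode vector has 53 entries and the rolling-mode vector 55.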
The procedure of the ARIMA-BP hybrid prediction method is as follows:
1. The parameters of the ARIMA time series model are determined by autocorrelation analysis and partial autocorrelation analysis of the load power sequence of the selected similar days;
2. The load power sequence of the similar days is input into the ARIMA time series model to predict the load of the prediction day, obtaining the prediction result of the ARIMA time series prediction model;
3. The load power values at the input sample points selected by autocorrelation coefficient (in the present invention, the points of the similar-day load power sequence whose autocorrelation coefficient (ACF) exceeds 0.8), the key meteorological influence factor sequence of the similar days, the load values predicted by the ARIMA time series model and the key influence factor sequence values of the prediction day are used as inputs of the BP neural network prediction model, forming the ARIMA-BP hybrid model, which carries out the load prediction;
4. The average prediction error of the ARIMA-BP hybrid model prediction result is calculated and analysed.
(2) ARIMA time series model
The autoregressive integrated moving average model ARIMA(p, d, q) is a time series model that predicts linear changes well, where AR denotes autoregression, p the number of autoregressive terms, MA moving average, q the number of moving-average terms, and d the number of differencing operations applied to make the time series stationary.
The basic idea of the ARIMA model is to regard the data sequence formed by the prediction object over time as a random sequence and to describe this sequence approximately with a mathematical model. Once identified, the model can predict future values from the past and present values of the time series.
Establishing an ARIMA(p, d, q) model requires the following two steps:
1. A unit root test is performed on the time series to determine whether it is stationary; if it is not, differencing is required, and the number of differences needed to make the series stationary is d;
2. The autocorrelation coefficients (ACF) and partial autocorrelation coefficients (PACF) of the stationary time series are calculated to determine p and q.
The unit root test shows that the shopping-mall load sequence is itself stationary and needs no differencing, i.e. d is 0. Its autocorrelation and partial autocorrelation are shown in Figs. 2a and 2b.
It can be seen that the autocorrelation takes the form of a decaying sine wave, i.e. it tails off, so q is 0; the partial autocorrelation has a significant spike at lag 1, i.e. it cuts off after lag 1, so p is 1. The model used is therefore preliminarily determined to be ARIMA(1,0,0).
In this method, the load sequence of the similar days is input into the ARIMA time series prediction model to predict the load sequence of the prediction day, giving the load values predicted for the prediction day by the ARIMA time series prediction model.
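An ARIMA(1,0,0) model is a first-order autoregression, x(t) = c + φ·x(t-1) + e(t). The following Python sketch fits c and φ by ordinary least squares and iterates the recursion to forecast. Least squares is a simplification chosen for this sketch; the text does not state the estimation method, and statistical packages typically use maximum likelihood. The function names and the sample series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares estimate of (c, phi) in x_t = c + phi * x_{t-1} + e_t."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identifiable")
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the AR(1) recursion to produce multi-step forecasts."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Hypothetical noise-free series generated by x_t = 2 + 0.5 * x_{t-1}.
demo = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(demo)
print(c, phi, forecast_ar1(demo[-1], c, phi, steps=2))
```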
(3) BP neural network
The BP neural network is a supervised learning model with strong self-organizing and adaptive capability. After being trained on representative samples, it can capture the intrinsic characteristics of the system under study. Its structure is simple, it is easy to operate, and it can model arbitrary nonlinear input/output relations.
In general, short-term load forecasting can be carried out by inputting the load data and key influence factor data of the similar days into a BP neural network model. On this basis, the present method additionally inputs the load values predicted by the ARIMA time series model and the key influence factor values of the prediction day. The BP neural network structure is shown in Fig. 3.
After tuning, the BP neural network parameters used are:
1. the maximum number of training epochs is 200;
2. the training goal error is 0.05 and the learning rate is 0.0000001;
3. activation functions: the training function is the LM (Levenberg-Marquardt) adaptive training function, the hidden layers use the hyperbolic tangent sigmoid function, and the output layer uses a linear transfer function;
4. the number of hidden layers is 5.
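The network shape implied by these parameters can be sketched as a forward pass with hyperbolic tangent hidden layers and a linear output. The Python sketch below shows only the forward computation; Levenberg-Marquardt training is not implemented here. The hidden-layer width, the weight initialization range and all names are assumptions of the sketch, since the text gives no layer width.

```python
import math
import random

# Values from the text; the dictionary keys are assumed names.
BP_CONFIG = {"max_epochs": 200, "goal_error": 0.05,
             "learning_rate": 1e-7, "hidden_layers": 5}
HIDDEN_WIDTH = 8  # Hypothetical: the width of each hidden layer is not given.


def init_network(n_inputs, hidden_layers, hidden_width, seed=0):
    """Create (weights, biases) per layer with small random weights."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [hidden_width] * hidden_layers + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Forward pass: tanh on hidden layers, linear on the output layer."""
    activation = list(inputs)
    last = len(layers) - 1
    for idx, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activation)) + b
               for row, b in zip(weights, biases)]
        activation = pre if idx == last else [math.tanh(v) for v in pre]
    return activation[0]


net = init_network(53, BP_CONFIG["hidden_layers"], HIDDEN_WIDTH)
print(forward(net, [0.5] * 53))
```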
(4) Load power prediction results
The working-day loads of the four seasons of 2015 are chosen as samples, and load prediction is carried out for shopping-mall buildings in 2016. The spring prediction days are April 25 to April 29, the summer prediction days are August 22 to August 26, the autumn prediction days are September 12 to September 16, and the winter prediction days are February 17 to February 23.
The quality of the prediction is judged by the mean absolute percentage error (MAPE), calculated as follows:
$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{x''(t) - x'(t)}{x'(t)} \right| \times 100\%$$
where x''(t) is the load sequence value predicted for the prediction day by the ARIMA-BP hybrid model, x'(t) is the actual value, and N is the number of prediction points.
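The mean absolute percentage error can be computed as in the following Python sketch. The function name and the sample values are hypothetical, and the sketch assumes every actual value is non-zero.

```python
def mape(predicted, actual):
    """Mean absolute percentage error, in percent."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption: no actual value is zero (division would fail otherwise).
    total = sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual))
    return 100.0 * total / len(actual)


# Hypothetical predicted and actual loads (kW), for illustration only.
print(mape([110.0, 90.0], [100.0, 100.0]))
```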
Taking shopping mall 1 as an example, four methods are compared: ARIMA time series prediction, BP neural network prediction, ARIMA-BP rolling prediction and ARIMA-BP direct prediction. The average prediction errors (%) of the different prediction methods for shopping mall 1 are shown in Table 3.
Table 3
It can be seen that ARIMA-BP direct prediction gives the best results for the load of the next 24 hours.
The ARIMA-BP hybrid model was applied to 6 shopping-mall buildings, giving a mean prediction error of 7.5% for shopping-mall buildings. The average prediction error (%) of each shopping mall is shown in Table 4.
Table 4
Fig. 4 a-4d is that the public building provided by the invention based on similar day selection and ARIMA-BP mixed model is born in short term
Lotus prediction technique to 1 spring of market, the summer, the autumn, the four seasons in winter load prediction result figure.