CN113469399A - Service prediction method and device

Info

Publication number: CN113469399A
Application number: CN202010242629.2A
Authority: CN
Inventors: 杨帆; 常剑; 陈三鉴
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2020-03-31
Filing date: 2020-03-31
Publication date: 2021-10-01

Abstract

The present specification provides a service prediction method and a device, wherein the service prediction method includes: acquiring historical service data corresponding to each historical time node of a service item in a historical time interval; preprocessing the historical service data according to each service dimension of the service project to obtain characteristic data corresponding to each service dimension of each historical time node; constructing a characteristic sequence corresponding to the historical time interval according to the characteristic data; inputting the characteristic sequence into a time sequence prediction model for prediction to obtain a target characteristic sequence corresponding to a target time interval; and determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence.

Description

Service prediction method and device

Technical Field

The present specification relates to the field of data processing technologies, and in particular to a service prediction method. The present specification also relates to a service prediction apparatus, a computing device, and a computer-readable storage medium.

Background

With the development of internet technology, time series data (data arranged in chronological order) is used in many technical fields, such as the internet of things, medical treatment, and system control. Because time series data is rich and widely available, data can be stored or extracted in time order, and past time series data can be used to predict the trend of future data. Such prediction usually considers each time series independently, trains model parameters specific to each series, and predicts future data with the trained model. However, this approach does not adapt well when new time series data is added and prediction must continue, and a model fitted to each independent time series cannot learn better model parameters from multiple associated time series, so the accuracy of the prediction result declines as the data volume grows. An efficient and accurate prediction method is therefore needed to solve these problems.

Disclosure of Invention

In view of this, the embodiments of the present specification provide a service prediction method. The present specification also relates to a service prediction apparatus, a computing device, and a computer-readable storage medium, which are used to solve the technical problems in the prior art.

According to a first aspect of embodiments herein, there is provided a service prediction method, including:

acquiring historical service data corresponding to each historical time node of a service item in a historical time interval;

preprocessing the historical service data according to each service dimension of the service item to obtain feature data corresponding to each service dimension of each historical time node;

constructing a feature sequence corresponding to the historical time interval according to the feature data;

inputting the feature sequence into a time series prediction model for prediction to obtain a target feature sequence corresponding to a target time interval;

determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence;

the time series prediction model comprises a decoder and an encoder, wherein the encoder performs time-series attention calculation on the feature sequence to obtain an intermediate feature sequence, and the decoder performs cross attention calculation on the intermediate feature sequence and then performs time-series attention calculation on the cross attention calculation result to obtain the target feature sequence.

Optionally, the obtaining, by the encoder, an intermediate feature sequence by performing a time-series attention calculation on the feature sequence includes:

determining a convolution kernel and a step length preset in the encoder;

performing convolution processing on the feature sequence according to the convolution kernel and the step length to obtain a short feature sequence;

obtaining the intermediate feature sequence by performing the time-series attention calculation on the short feature sequence.

Optionally, the obtaining the intermediate feature sequence by performing the time-series attention calculation on the short feature sequence includes:

determining query weight, key weight and value weight corresponding to the short feature sequence;

calculating a query vector corresponding to the short feature sequence according to the query weight, calculating a key vector corresponding to the short feature sequence according to the key weight, and calculating a value vector corresponding to the short feature sequence according to the value weight;

determining an attention parameter between the respective historical time nodes in the short feature sequence based on the query vector and the key vector;

calculating the product between the attention parameter and the value vector, and summing the product result to obtain an attention feature sequence;

performing first aggregation processing on the attention feature sequence to obtain a first aggregation feature sequence, and splicing and transforming the first aggregation feature sequence into a second aggregation feature sequence;

and performing second aggregation processing on the second aggregation feature sequence to obtain the intermediate feature sequence.
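
The query, key and value steps above can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the scaled dot product and softmax used to derive the attention parameters are assumptions (the text only states that they are determined from the query and key vectors), the names time_series_attention, W_q, W_k and W_v are hypothetical, and the two aggregation steps are not shown.

```python
import numpy as np

def time_series_attention(H, W_q, W_k, W_v):
    """Self-attention over the time axis of a short feature sequence H (T' x d)."""
    Q = H @ W_q            # query vectors, one per time node
    K = H @ W_k            # key vectors
    V = H @ W_v            # value vectors
    # Assumption: scaled dot product + softmax yields the attention parameters.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A = A / A.sum(axis=-1, keepdims=True)                  # each row sums to 1
    # Product of attention parameters and value vectors, summed over time nodes.
    return A @ V, A

rng = np.random.default_rng(0)
T_short, d = 6, 4                       # hypothetical sizes
H = rng.normal(size=(T_short, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
attn_seq, attn = time_series_attention(H, W_q, W_k, W_v)
print(attn_seq.shape, attn.shape)
```

Each row of the returned attention matrix weights every historical time node against one query node, which is how a dependency along the time dimension is expressed in this sketch.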

Optionally, the performing, by the decoder, cross attention calculation on the intermediate feature sequence, and performing time-series attention calculation on a cross attention calculation result to obtain the target feature sequence includes:

initializing the intermediate characteristic sequence to obtain an initial characteristic sequence corresponding to the prediction step length of each target time node in the target time interval;

and performing the cross attention calculation on the initial feature sequence to obtain a cross feature sequence, and performing the time sequence attention calculation on the cross feature sequence to obtain the target feature sequence.

Optionally, the obtaining a cross feature sequence by performing the cross attention calculation on the initial feature sequence includes:

determining an initial weight corresponding to the initial feature sequence, and calculating a product of the initial weight and the initial feature sequence;

determining a first prediction feature sequence according to the product result and the prediction step feature obtained by encoding the prediction step length;

splicing and transforming the first prediction feature sequence into a second prediction feature sequence;

calculating a prediction query vector corresponding to the second prediction feature sequence according to the query weight, calculating a prediction key vector corresponding to the second prediction feature sequence according to the key weight, and calculating a prediction value vector corresponding to the second prediction feature sequence according to the value weight;

determining the cross feature sequence based on the prediction query vector, the prediction key vector, and the prediction value vector.

Optionally, the performing the time-series attention calculation on the cross feature sequence to obtain the target feature sequence includes:

performing the time-series attention calculation on the cross feature sequence to obtain a time-series feature sequence, and determining a time-series weight corresponding to the time-series feature sequence;

and performing linear transformation on the time-series feature sequence based on the time-series weight to obtain the target feature sequence.

Optionally, the determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence includes:

determining target feature data corresponding to each service dimension based on the target feature sequence;

and analyzing the target characteristic data according to each target time node in the target time interval, and obtaining the target prediction data according to an analysis result.
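
As a hedged illustration of the two steps above, the sketch below splits a target feature sequence into values per target time node and per service dimension. The function name to_prediction_data, the dimension names "price" and "volume", the dates, and the numbers are all made up for illustration; the specification does not prescribe this data layout.

```python
from datetime import date, timedelta

def to_prediction_data(target_seq, dimensions, first_day):
    """Map an L x d_x target feature sequence to per-day, per-dimension values."""
    result = {}
    for step, row in enumerate(target_seq):
        day = first_day + timedelta(days=step)      # target time node for this step
        result[day] = dict(zip(dimensions, row))    # one value per service dimension
    return result

# Hypothetical 3-step target feature sequence with two service dimensions.
target_seq = [[10.5, 1200.0], [10.7, 1150.0], [10.6, 1300.0]]
pred = to_prediction_data(target_seq, ["price", "volume"], date(2020, 4, 1))
print(pred[date(2020, 4, 2)])
```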

Optionally, after the step of determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence is executed, the method further includes:

clearing the used resources of each target time node in the target time interval based on the target prediction data;

and calling a prepared resource which is the same as the clearing result of the used resource in a resource library, wherein the prepared resource is used by each target time node in the target time interval.

Optionally, the preprocessing the historical service data according to each service dimension of the service item to obtain feature data corresponding to each service dimension of each historical time node includes:

extracting dimension business data corresponding to each business dimension from the historical business data based on each business dimension;

and carrying out standardization processing on the dimension service data according to each historical time node to obtain the feature data.

Optionally, the time series prediction model is trained as follows:

collecting sample service data, and constructing a sample characteristic sequence corresponding to the sample service data;

determining a sample target characteristic sequence corresponding to the sample characteristic sequence;

adding labels to the sample target characteristic sequences based on all sample time nodes in a sample time interval, and taking the sample target characteristic sequences with the labels and the sample characteristic sequences as training samples;

and inputting the training sample into a time series prediction model constructed based on the incidence relation between the sample characteristic sequence and the sample target characteristic sequence for training to obtain the time series prediction model.
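
One possible way to build such training samples is a sliding window over the sample service data, pairing each T-step sample feature sequence with the L-step sample target feature sequence that follows it. The windowing scheme and the name make_training_samples are assumptions for illustration; the specification does not state how the pairs are cut.

```python
import numpy as np

def make_training_samples(series, T, L):
    """Slice an (N x d_x) series into (history, target) pairs of lengths T and L."""
    samples = []
    for start in range(len(series) - T - L + 1):
        history = series[start:start + T]            # sample feature sequence
        target = series[start + T:start + T + L]     # labelled sample target sequence
        samples.append((history, target))
    return samples

# Hypothetical data: 10 time nodes, 2 service dimensions.
series = np.arange(20, dtype=float).reshape(10, 2)
samples = make_training_samples(series, T=5, L=2)
print(len(samples), samples[0][0].shape, samples[0][1].shape)
```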

According to a second aspect of embodiments herein, there is provided a service prediction apparatus, including:

the acquisition module is configured to acquire historical service data corresponding to each historical time node of the service item in a historical time interval;

the preprocessing module is configured to preprocess the historical service data according to each service dimension of the service project to obtain feature data corresponding to each service dimension of each historical time node;

the construction module is configured to construct a characteristic sequence corresponding to the historical time interval according to the characteristic data;

the prediction module is configured to input the characteristic sequence into a time sequence prediction model for prediction to obtain a target characteristic sequence corresponding to a target time interval;

a determination module configured to determine target prediction data corresponding to respective target time nodes in the target time interval based on the target feature sequence;

Optionally, the determining module includes:

a target feature data determining unit configured to determine target feature data corresponding to the service dimensions based on the target feature sequence;

and the target characteristic data analyzing unit is configured to analyze the target characteristic data according to each target time node in the target time interval and obtain the target prediction data according to an analysis result.

Optionally, the service prediction apparatus further includes:

a clearing module configured to clear usage resources of each target time node in the target time interval based on the target prediction data;

a calling module configured to call, in a resource library, a prepared resource that is the same as a clearing result of the used resource, where the prepared resource is used by each target time node in the target time interval.

According to a third aspect of embodiments herein, there is provided a computing device comprising:

a memory and a processor;

the memory is to store computer-executable instructions, and the processor is to execute the computer-executable instructions to:

According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the service prediction method.

In the service prediction method provided in an embodiment of this specification, historical service data of a service item is acquired for each time node in a historical time interval and preprocessed according to each service dimension of the service item, yielding feature data corresponding to each service dimension of each historical time node. A feature sequence corresponding to the historical time interval is constructed from the feature data and input into a time series prediction model, which outputs a target feature sequence corresponding to the target time interval; the target prediction data corresponding to each target time node in the target time interval is then determined from the target feature sequence. Because the time series prediction model includes a decoder and an encoder, it can predict the feature sequences of multiple time nodes in the target time interval while improving the efficiency of feature sequence prediction, and the attention mechanism establishes a dependency relationship along the time dimension, which improves the accuracy of the predicted target feature sequence and, in turn, the accuracy of the target prediction data.

Drawings

Fig. 1 is a flowchart of a service prediction method provided in an embodiment of the present specification;

Fig. 2 is a schematic diagram of the processing procedure of the time series prediction model in a service prediction method according to an embodiment of the present specification;

Fig. 3 is a schematic diagram of the convolution process in a service prediction method according to an embodiment of the present specification;

Fig. 4 is a schematic diagram of the encoding process in a service prediction method according to an embodiment of the present specification;

Fig. 5 is a schematic diagram of the process of initializing an intermediate feature sequence in a service prediction method according to an embodiment of the present specification;

Fig. 6 is a schematic diagram of the decoding process in a service prediction method according to an embodiment of the present specification;

Fig. 7 is a schematic structural diagram of a service prediction apparatus according to an embodiment of the present specification;

Fig. 8 is a block diagram of a computing device according to an embodiment of the present specification.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.

The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.

First, the terms used in one or more embodiments of the present specification are explained.

Time series: a sequence of values of the same statistical indicator arranged in order of their time of occurrence.

The present specification provides a service prediction method and also relates to a service prediction apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.

Fig. 1 shows a flowchart of a service prediction method according to an embodiment of the present specification, which specifically includes the following steps:

Step 102: Acquire historical service data corresponding to each historical time node of the service item in the historical time interval.

In practical applications, when service data for a target time is predicted from historical service data, each time series is usually considered independently, a specific model is trained on each time series, and the service data for the target time is predicted with the trained model. However, the time series used to train such a model has a fixed correspondence with the target time series, so the model can predict only the service data corresponding to a single target time series and cannot predict the service data corresponding to multiple target time series. The resulting low efficiency and low accuracy hinder the reasonable allocation of resources for service items.

To predict a multi-step time series and improve the output accuracy of the time series prediction model, the service prediction method provided in this specification acquires historical service data of a service item for each time node in a historical time interval and preprocesses it according to each service dimension of the service item, yielding feature data corresponding to each service dimension of each historical time node. A feature sequence corresponding to the historical time interval is constructed from the feature data and input into the time series prediction model, which outputs a target feature sequence corresponding to the target time interval; the target prediction data corresponding to each target time node in the target time interval is then determined from the target feature sequence. Because the time series prediction model includes a decoder and an encoder, the feature sequences of multiple time nodes in the target time interval can be predicted while the efficiency of feature sequence prediction is improved, and the attention mechanism establishes a dependency relationship along the time dimension, which improves the accuracy of the predicted target feature sequence and of the target prediction data determined from it.

In a specific implementation, the encoder comprises a convolution module and an encoding module, and the decoder comprises a decoding module and an output module. The convolution module performs convolution processing on the input feature sequence; the encoding module encodes the convolution result and passes the encoding result to the decoder; the decoder decodes the encoding result; and finally the output module outputs the target feature sequence corresponding to the target time interval.

On this basis, the service item refers to an item for which service data in a target time interval needs to be predicted, and the historical service data refers to the data of that service item in a historical time interval. For example, if the price of a stock over the next seven days needs to be predicted, the service item is a stock trading item; if an online store needs to predict its sales over the next seven days, the service item is an online store sales item. When the service item is a stock item, the corresponding historical service data is the trading price of a given stock on the stock exchange during the year before the current time (day), or the reference price of the stock. The year before the current time is the historical time interval of the stock item, each day in that interval is a historical time node, and the price or reference price of the stock on each day is the historical service data.

When the price of the stock for the next 7 days needs to be predicted, it can be predicted by the time series prediction model from the price or reference price of the stock on each of the past 365 days.

Step 104: Preprocess the historical service data according to each service dimension of the service item to obtain feature data corresponding to each service dimension of each historical time node.

Specifically, after the historical service data of the service item is obtained, it is further determined that target prediction data corresponding to a target time interval needs to be predicted for the service item. The historical service data needs to be preprocessed before prediction to improve the prediction accuracy of the time series prediction model; accordingly, the historical service data is preprocessed according to each service dimension of the service item to obtain the feature data corresponding to each service dimension of each historical time node.

In practical applications, the business dimension specifically refers to a dimension related to the business item, for example, in a stock trading item, the business dimension may include a trading dimension and a price dimension; in a network sales project, business dimensions can be sales dimensions and sales number dimensions; correspondingly, the feature data specifically refers to data composed of features corresponding to the dimensions.

Following the above example, preprocessing the data corresponding to the stock price on each of the past 365 days yields, for each day, feature data in the trading dimension (the trading volume of the stock) and feature data in the price dimension (the trading price of the stock).

Further, to improve the accuracy of the target feature sequence subsequently predicted for the target time interval by the time series prediction model, the historical service data may be preprocessed at this point. In one or more implementations of this embodiment, the preprocessing of the historical service data is specifically implemented as follows:

In practical application, the dimension service data specifically refers to data extracted from historical service data according to each service dimension and corresponding to each service dimension, and on the basis, the dimension service data is standardized according to each historical time node in the historical time interval, so that the feature data can be obtained.

The standardized processing of the dimension service data can be understood as dividing the dimension service data according to each historical time node, and sorting the divided results, specifically, adjusting the dimension service data into data in a uniform format.

Continuing the above example, the trading price and trading volume of the stock over the past 365 days are the historical service data of the stock. The dimension service data in the trading dimension is the trading volume of the stock, and the dimension service data in the price dimension is the trading price of the stock. Each trading volume and trading price corresponds to one day in the past 365 days, so standardization is performed per historical time node (each of the 365 days): the data of each service dimension is arranged in order of increasing time to obtain the feature data. When the current time is t, the feature data consists of the trading volume in the trading dimension and the trading price in the price dimension at historical time node t-365, the same two values at historical time node t-364, the same two values at historical time node t-363, and so on.

In addition, data filling or data deletion can be performed while the dimension service data is standardized. When part of the dimension service data is missing, the data is incomplete and must be filled or deleted so that the resulting feature data is complete and can be used in the subsequent prediction process.
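
The standardization described above (ordering by time node, then filling or deleting missing data) might look like the following sketch for a single service dimension. The forward-fill rule, the choice to drop days that have no earlier value, and the name standardize are assumptions; the specification only says that data filling or data deletion can be performed.

```python
from datetime import date, timedelta

def standardize(records, start, days):
    """Order one dimension's values by day and fill gaps with the previous value."""
    by_day = dict(records)
    out, last = [], None
    for i in range(days):
        day = start + timedelta(days=i)
        value = by_day.get(day, last)   # assumption: forward-fill missing days
        if value is None:
            continue                    # nothing to fill from yet: delete this day
        out.append((day, value))
        last = value
    return out

# Hypothetical unordered price records with 2020-01-02 missing.
raw = [(date(2020, 1, 3), 12.0), (date(2020, 1, 1), 10.0)]
clean = standardize(raw, date(2020, 1, 1), 3)
print(clean)
```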

In conclusion, before the characteristic sequence corresponding to the target time interval is predicted, the characteristic data is obtained by preprocessing the historical service data, and the accuracy of the prediction result of the time sequence prediction model is improved.

Step 106: Construct a feature sequence corresponding to the historical time interval according to the feature data.

Specifically, after the feature data is obtained by preprocessing the historical service data, the feature data needs to be constructed into a feature sequence that can be input into the time series prediction model; that is, a feature sequence corresponding to the historical time interval is constructed based on the feature data.

In practical applications, the feature sequence corresponding to the historical time interval refers to a sequence constructed from the feature data of all historical time nodes. Following the above example, the feature sequence constructed from the feature data may be $X_{1:T} = (x_1, x_2, \ldots, x_T)$, where $X_{1:T}$ denotes the feature sequence, $T$ denotes the time information corresponding to the 365-day historical time interval, and $x_T$ denotes an element of the feature sequence; each element consists of a trading price and a trading volume, representing the trading price and trading volume of the stock on the corresponding day of the 365 days.
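
A minimal sketch of this construction, assuming the feature data for the two service dimensions is already ordered by time node: the per-day values are stacked into an array with one row per time node and one column per service dimension. The name build_feature_sequence and the sample numbers are hypothetical.

```python
import numpy as np

def build_feature_sequence(prices, volumes):
    """Stack per-day features into a (T, d_x) feature sequence, here d_x = 2."""
    if len(prices) != len(volumes):
        raise ValueError("every dimension needs one value per time node")
    # Row t holds the element for time node t: (trading price, trading volume).
    return np.column_stack([prices, volumes]).astype(float)

# Hypothetical 3-day history; the stock example would use T = 365 rows.
X = build_feature_sequence([10.0, 10.2, 10.1], [1000, 1100, 900])
print(X.shape)
```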

Step 108: inputting the characteristic sequence into a time sequence prediction model for prediction to obtain a target characteristic sequence corresponding to a target time interval.

Specifically, on the basis of the feature sequence corresponding to the historical time interval constructed based on the feature data, the feature sequence needs to be input into the time sequence prediction model at this time to predict the feature sequence corresponding to the target time interval, where the target time interval is the time required to be subjected to feature sequence prediction, and finally the target feature sequence corresponding to the target time interval output by the time sequence prediction model is obtained.

In practical application, in the process of predicting the target feature sequences corresponding to the target time intervals, in order to predict the feature sequences of a plurality of target time nodes in the target time intervals and improve the prediction accuracy of the time sequence prediction model, the model may be trained from long-term historical service data corresponding to a plurality of historical time intervals, and the feature sequences of the plurality of time nodes are output in parallel, so that the feature sequences of the plurality of target time nodes are predicted.

Referring to Fig. 2, to predict an L-step target feature sequence from a T-step historical feature sequence, the historical feature sequence $X_{1:T} \in \mathbb{R}^{T \times d_x}$ corresponding to the T steps is first input into the convolution module (Convolutional Block) of the time series prediction model for convolution processing. The convolution result output by the convolution module is then passed through the encoding modules (Encoding Block) in sequence until the intermediate feature sequences $h_A \in \mathbb{R}^{d}$ and $H_{1:T'} \in \mathbb{R}^{T \times d}$ output by the last encoding module are obtained. Next, $h_A$ is initialized into a feature sequence $H_{T+1:T+L} \in \mathbb{R}^{L \times d}$ corresponding to the prediction steps and input into the decoding modules (Decoding Block) for decoding in sequence; at the same time, $H_{1:T'} \in \mathbb{R}^{T \times d}$ from the intermediate feature sequence is also input into the decoding modules for decoding. Finally, the decoding result output by the last decoding module is input into the output module (Output Block) for output processing, giving the L-step predicted target feature sequence $X_{T+1:T+L} \in \mathbb{R}^{L \times d_x}$. Here, the T-step historical feature sequence represents the values of a time series at T past moments; $d_x$ denotes the number of variables, where each moment takes the value of a single variable or of multiple variables; and the L-step target feature sequence represents the prediction for the next L steps.
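
The data flow of Fig. 2 can be traced with a shape-only sketch. Every operation below is a simplified stand-in chosen for illustration (random projections, subsampling, mean pooling, one attention step), not the model's actual blocks, and the sizes T, L, d_x, d and stride are hypothetical; in this sketch the encoded history simply has T / stride rows.

```python
import numpy as np

rng = np.random.default_rng(0)
T, L, d_x, d, stride = 12, 3, 2, 4, 2          # hypothetical sizes

X_hist = rng.normal(size=(T, d_x))             # historical feature sequence

# Convolution block stand-in: project to d features, keep every stride-th step.
H = (X_hist @ rng.normal(size=(d_x, d)))[::stride]

# Encoding block stand-in: summarize the encoded history into h_A.
h_A = H.mean(axis=0)

# Initialize one decoder state per prediction step from h_A.
H_future = np.tile(h_A, (L, 1))

# Decoding block stand-in: each future step attends over the encoded history.
scores = H_future @ H.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights = weights / weights.sum(axis=1, keepdims=True)
decoded = weights @ H

# Output block stand-in: linear map back to the d_x service dimensions.
X_future = decoded @ rng.normal(size=(d, d_x))

print(H.shape, h_A.shape, H_future.shape, X_future.shape)
```

The point of the sketch is only that a (T, d_x) input yields an (L, d_x) output in a single pass, so all L target time nodes are produced in parallel.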

In one or more implementations of this embodiment, the time series prediction model is trained by:

In addition, if the service data corresponding to the target time interval were output directly by the time series prediction model, the model would need to apply a further linear transformation to the target feature sequence output by the decoder, which may reduce the prediction accuracy of that service data. The target prediction data may therefore be determined outside the time series prediction model, from the target feature sequence that the model outputs, so that prediction accuracy is improved.

Furthermore, due to the structural particularity of the time series prediction model, a dependency relationship can be established on historical service data of a time dimension, and the influence of multiple factors on a prediction result can be combined, so that the accuracy of the time series prediction model is effectively improved.

Determining a convolution kernel and a step length preset in the encoder;

Specifically, the time series prediction model is configured with an encoder and a decoder, wherein the encoder is composed of a multilayer convolution module and a multilayer encoding module, the first layer in the encoder is the convolution module, and in order to improve the efficiency of the time series prediction model, the convolution module can perform convolution processing on the characteristic sequence and convert the characteristic sequence into a sequence with a shorter length;

based on the above, the convolution kernel and step length (stride) preset for the convolution module are determined first, and the feature sequence is convolved with that kernel and step length to obtain the short feature sequence. In practical application, the time series prediction model can use multiple convolution layers to reduce the length of the feature sequence and thereby improve its processing efficiency. After the short feature sequence is obtained, time-series attention calculation is performed on it to obtain the intermediate feature sequence output by the encoder, that is, the intermediate parameter that the encoder outputs after its encoding processing.

Referring to FIG. 3, the convolution processing performed by the convolution module in the time series prediction model is shown. The feature sequence is $X_1, \dots, X_{12}$, where $X_{1:12} \in \mathbb{R}^{12 \times d_x}$. Applying a convolution kernel of size 4, $W_{1:4} \in \mathbb{R}^{4 \times d \times d_x}$, to $X_{1:12}$ with a step length of 2 yields the short feature sequence $H_1, \dots, H_5$, where $H_{1:5} \in \mathbb{R}^{5 \times d}$, since $(12 - 4)/2 + 1 = 5$. The feature sequence after convolution is clearly shorter than the feature sequence before convolution, which effectively improves prediction efficiency in the subsequent processing of the time series prediction model. Time-series attention calculation is then performed on the short feature sequence $H_{1:5}$ to obtain the intermediate feature sequence output by the encoder.

In conclusion, the length of the characteristic sequence can be effectively reduced by performing convolution processing on the characteristic sequence, so that the prediction efficiency of the time sequence prediction model is improved, and the prediction rate of the predicted target characteristic sequence is further accelerated.
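The length reduction in the FIG. 3 example can be checked with a minimal sketch, assuming a plain "valid" (unpadded) one-dimensional convolution over the time axis. The function name `conv1d_strided`, the toy input values and the constant kernel weights are illustrative assumptions and are not taken from the specification.

```python
def conv1d_strided(x, kernel, stride):
    """Valid (no padding) 1-D convolution over the time axis.

    x:      list of T rows, each with d_x input variables
    kernel: list of k slices, each a d x d_x weight matrix (k = kernel size)
    Returns floor((T - k) / stride) + 1 rows, each with d features.
    """
    k = len(kernel)
    d = len(kernel[0])
    out = []
    for start in range(0, len(x) - k + 1, stride):
        row = [0.0] * d
        for j in range(k):              # position inside the window
            for o in range(d):          # output feature index
                row[o] += sum(w * v for w, v in zip(kernel[j][o], x[start + j]))
        out.append(row)
    return out


# Toy input: T = 12 time nodes, d_x = 2 variables per node.
X = [[float(t), float(t) * 0.5] for t in range(1, 13)]
# Toy kernel: size 4, d = 3 output features, every weight 0.25 (illustrative only).
W = [[[0.25, 0.25] for _ in range(3)] for _ in range(4)]

H = conv1d_strided(X, W, stride=2)
print(len(H), len(H[0]))  # 5 3
```

With a kernel of size 4 and a step length of 2, the 12 input nodes collapse to 5 output nodes, so every later attention layer works on a shorter sequence.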

Furthermore, after the feature sequence has been convolved, time-series attention calculation needs to be performed on the resulting short feature sequence. The time-series attention calculation introduces dependency relationships between feature sequences along the time dimension and thereby improves the accuracy of the time series prediction model. In one or more implementations of this embodiment, the time-series attention calculation is specifically implemented as follows:

In practical application, in order to improve the prediction accuracy of a time series prediction model, a plurality of layers of coding modules are configured in a coder contained in the time series prediction model, each layer of coding module uses a Self-Attention mechanism (Self-Attention) to transform the characteristics of each historical time node, and meanwhile, the characteristics of each historical time node are converged into global characteristics, and the input of each layer of coding module is the integration of the output of the previous layer of coding module and the global characteristics;

therefore, in this embodiment, a layer of coding module in a multi-layer coding module is described, and the time sequence attention calculation process of other layer of coding modules may refer to the content described in this embodiment, which is not described in detail herein.

Specifically, for the purpose of time-series attention calculation, the coding module may be decomposed into a self-attention unit, a first aggregation unit, a splicing transformation unit and a second aggregation unit. The self-attention unit calculates the attention feature sequence, and the first aggregation unit, the splicing transformation unit and the second aggregation unit then perform aggregation and transformation to obtain the intermediate feature sequence.

In specific implementation, the self-attention unit determines the attention feature sequence as follows. First, the query weight ($W_Q$), key weight ($W_K$) and value weight ($W_V$) corresponding to the short feature sequence are determined, and the query vector (Query Vector), key vector (Key Vector) and value vector (Value Vector) corresponding to the short feature sequence are calculated from the query weight, the key weight and the value weight respectively. A first attention degree between the historical time nodes in the historical time interval is then determined based on the query vector and the key vector, and a second attention degree is determined based on the first attention degree and the square root of the key vector dimension. Softmax processing of the second attention degree gives the probability values corresponding to the feature sequence; multiplying the probability values by the value vector gives the individual attention feature for each historical time node; and summing these products gives the attention feature sequence.

The first aggregation unit determines the first aggregation feature sequence by performing first aggregation processing on the attention feature sequence. The splicing transformation unit determines the second aggregation feature sequence by splicing and transforming the first aggregation feature sequence. The second aggregation unit determines the intermediate feature sequence by performing second aggregation processing on the second aggregation feature sequence.
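The self-attention steps described above can be sketched as follows. This is a minimal illustration that assumes the standard scaled dot-product form, in which the first attention degree is divided by the square root of the key vector dimension; the helper names and the identity weights in the usage lines are hypothetical, and the aggregation and splicing transformation units are not modelled.

```python
import math


def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def softmax_rows(m):
    """Normalize each row of m so that it sums to 1."""
    out = []
    for row in m:
        peak = max(row)                       # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(h, w_q, w_k, w_v):
    """Return (attention feature sequence, attention probabilities)."""
    q, k, v = matmul(h, w_q), matmul(h, w_k), matmul(h, w_v)
    k_t = [list(col) for col in zip(*k)]      # transpose of K
    scale = math.sqrt(len(k[0]))              # assumed: sqrt of the key dimension
    first = matmul(q, k_t)                    # first attention degree
    second = [[s / scale for s in row] for row in first]  # second attention degree
    probs = softmax_rows(second)
    return matmul(probs, v), probs            # weighted sum over time nodes


# Three time nodes with d = 2; identity weights are for illustration only.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
out, probs = self_attention(H, I2, I2, I2)
```

Each row of `probs` says how strongly one time node attends to every other node, which is how the dependency between time nodes enters the encoded features.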

Referring to fig. 4, the time-series attention calculation performed on the short feature sequence by one layer of coding module is shown, where l+1 denotes the coding module of layer l+1. The output of each layer of coding module is an intermediate feature sequence composed of a feature sequence and a global feature. The coding module of layer l+1 takes the feature sequence and the global feature output by layer l and, through time-series attention calculation, outputs an intermediate feature sequence composed of the layer l+1 feature sequence and the layer l+1 global feature.

The intermediate feature sequence is determined by the self-attention unit, the first aggregation unit, the splicing transformation unit and the second aggregation unit in the coding module. The process by which the self-attention unit (Self-Attention) determines the attention feature sequence can be represented by formula (1), formula (2), formula (3) and formula (4), wherein Q represents the query vector (Query Vector), K represents the key vector (Key Vector), and V represents the value vector (Value Vector); $W_Q$ represents the query weight, $W_K$ represents the key weight, and $W_V$ represents the value weight; the softmax function (normalized exponential function) normalizes each row of its input matrix, i.e. Y = softmax(X); the input is the feature sequence output by the coding module of the previous layer; T represents the historical time interval; and the output is the feature sequence in the intermediate feature sequence.

The global feature also needs to be determined. The first aggregation unit (Aggregate) determines the global feature of layer l (the first aggregation feature sequence) according to formula (5). The splicing transformation unit (Concat & MLP & Skip Conn) then transforms the first aggregation feature sequence into the second aggregation feature sequence according to formula (6), wherein the square brackets indicate that two vectors are spliced into one vector and MLP denotes a multilayer perceptron. Finally, the second aggregation unit (Aggregate) determines the global feature according to formula (7).

The feature sequence and the global feature obtained in this way compose the intermediate feature sequence.

In summary, in order to improve the prediction accuracy of the time series prediction model, a self-attention mechanism is introduced into an encoder to construct a dependency relationship between feature sequences in a time dimension, so that the influence of various factors on a target feature sequence is considered, and the prediction accuracy of the target feature sequence is effectively improved.

In one or more implementations of this embodiment, once the intermediate feature sequence output by the encoder has been determined, the target feature sequence can be output only after the decoder decodes the intermediate feature sequence. To further improve the prediction accuracy of the time series prediction model during decoding, cross attention calculation is performed together with time-series attention calculation, which is specifically implemented as follows:

In practical applications, in order to improve the prediction accuracy of the time series prediction model, the decoder contained in the model is configured with multiple layers of decoding modules and an output module. Before the intermediate feature sequence is input into the decoding modules for decoding, the global feature in the intermediate feature sequence needs to be initialized.

Referring to FIG. 5, the process of initializing the global feature is shown, where $h_A \in \mathbb{R}^{d}$ and $p_1, \dots, p_L$ are the encodings of the prediction steps 1 to L, with L representing the target time interval. Prediction step t is encoded into a d-dimensional vector $p_t$, whose i-th dimension is determined by formula (8). The role of the vector $p_t$ is to let the time series prediction model distinguish different prediction steps, that is, to predict the feature sequence of each target time node in the target time interval. As shown in fig. 5, the initial feature sequence contains one feature for each prediction step t = 1, …, L, and each of these features is jointly determined by $p_t$ and $h_A$, which can be represented by formula (9), wherein W represents a learnable parameter (the weight corresponding to $h_A$).

After the initial feature sequence is determined, cross attention calculation and time sequence attention calculation are required to be performed on the initial feature sequence so as to obtain the target feature sequence.

Before decoding processing, the global features in the intermediate feature sequence are initialized, so that the time series prediction model can clearly distinguish the feature sequence corresponding to each time node, and the accuracy of the time series prediction model can be effectively improved.
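The initialization can be illustrated with a rough sketch. Formulas (8) and (9) are not reproduced in this text, so two assumptions are made here and should be read as such: the step encoding is taken to be the common sinusoidal form, and the global feature is combined with the step encoding by addition after multiplication by W. All function names and numeric values are hypothetical.

```python
import math


def step_encoding(t, d):
    """Encode prediction step t as a d-dimensional vector (sinusoidal form assumed)."""
    vec = []
    for i in range(d):
        angle = t / (10000 ** (2 * (i // 2) / d))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def init_decoder_sequence(h_a, w, steps):
    """Build one initial feature per prediction step from the global feature h_a.

    Assumed combination: initial_t = W h_a + p_t. The source only states that
    each feature is jointly determined by the step encoding and h_a.
    """
    d = len(h_a)
    projected = [sum(w[r][c] * h_a[c] for c in range(d)) for r in range(d)]
    return [[projected[i] + p for i, p in enumerate(step_encoding(t, d))]
            for t in range(1, steps + 1)]


h_A = [0.5, -0.2, 0.1, 0.3]                      # toy global feature, d = 4
W = [[1.0 if r == c else 0.0 for c in range(4)]  # identity weight, illustration only
     for r in range(4)]
H_init = init_decoder_sequence(h_A, W, steps=7)
print(len(H_init), len(H_init[0]))  # 7 4
```

Because every step shares the same projected global feature but receives a different encoding, the seven initial features differ from one another, which is what lets the decoder tell the prediction steps apart.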

Further, in the process of performing cross attention calculation on the initial feature sequence, a cross attention mechanism is introduced to improve the accuracy of the time series prediction model, and in one or more embodiments of this embodiment, a specific implementation manner of the process of performing cross attention calculation on the initial feature sequence is as follows:

In practical application, in the process of performing cross attention calculation, the cross attention calculation can be realized through a plurality of layers of decoding modules configured in a decoder, and the input of each layer of decoding module is the output of the previous layer of decoding module; therefore, in this embodiment, a layer decoding module in a multi-layer decoding module is described, and the cross attention calculation process of other layer decoding modules may refer to the content described in this embodiment, which is not described in detail herein.

Specifically, first, an initial weight corresponding to the initial feature sequence is determined, and a product of the initial weight and the initial feature sequence is calculated; then, a first prediction characteristic sequence is determined according to the product result and the prediction step length characteristic obtained after the coding processing is carried out on the prediction step length; further, the first predicted feature sequence is spliced and transformed into the second predicted feature sequence, wherein the splicing and transformation of the first predicted feature sequence can be realized by formula (10) of a splicing and transformation unit (MLP & Skip Conn) in a decoding module:

wherein the result represents the global features in the target feature sequence corresponding to the target time interval, and MLP represents the multilayer perceptron.

Further, after the second predicted feature sequence is determined, a cross-attention unit (Cross-Attention) in the decoding module performs the cross attention calculation. The feature sequence output by the encoder is first extracted, and the query weight, the key weight and the value weight in the encoder are determined. A predicted query vector, a predicted key vector and a predicted value vector corresponding to the second predicted feature sequence are then calculated from the query weight, the key weight and the value weight respectively. Finally, the cross feature sequence is determined based on the predicted query vector, the predicted key vector and the predicted value vector.

In addition, on the basis of determining the cross feature sequence, time-series attention calculation needs to be performed on the cross feature sequence, and in one or more embodiments of this embodiment, a specific implementation manner is as follows:

In practical applications, the process of performing the timing attention calculation in the decoder is similar to the process of performing the timing attention calculation in the encoder, and reference may be made to the description related to the process of performing the timing attention calculation in the encoder, which is not described herein in detail.

Referring to fig. 6, the decoding processing performed by the decoder is shown. The process by which the decoder obtains the target feature sequence corresponding to the target time interval can be represented by formula (11), formula (12), formula (13) and formula (14), wherein Q' denotes the predicted query vector, K' denotes the predicted key vector, and V' denotes the predicted value vector; the softmax function (normalized exponential function) normalizes each row of its input matrix, i.e. Y = softmax(X); and the encoder-side input is the feature sequence output by the last layer of coding module in the encoder.

Furthermore, after determining the output of the last layer decoding module, the decoder may transform the feature sequence of the output of the last layer decoding module into the target feature sequence by using a linear transformation, which may be determined by equation (15):

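$X_{T+1:T+L} = H_{T+1:T+L} W_x$ (15);

wherein $W_x$ represents a learnable parameter, $H_{T+1:T+L}$ is the feature sequence output by the last layer of decoding module, and $X_{T+1:T+L}$ represents the feature sequence corresponding to the target time nodes.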
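Formula (15) is a single matrix product that maps each d-dimensional decoder feature to the $d_x$ output variables. A minimal sketch follows; the function name `output_transform` and all numeric values are illustrative assumptions.

```python
def output_transform(h, w_x):
    """Apply X = H W_x: map each d-dimensional decoder feature to d_x variables."""
    return [[sum(row[k] * w_x[k][j] for k in range(len(w_x)))
             for j in range(len(w_x[0]))] for row in h]


H_dec = [[1.0, 2.0, 3.0],    # L = 2 prediction steps, d = 3
         [0.0, 1.0, 0.0]]
W_x = [[1.0, 0.0],           # d = 3 rows, d_x = 2 columns
       [0.0, 1.0],
       [1.0, 1.0]]

X_pred = output_transform(H_dec, W_x)
print(X_pred)  # [[4.0, 5.0], [0.0, 1.0]]
```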

In addition, consecutive target time nodes in the target time interval may be predicted one after another; for example, the feature sequence for day T+1 is determined first, and the feature sequence for day T+2 is then predicted from the feature sequence for day T+1.
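The one-after-another variant can be sketched as a simple loop that feeds each prediction back in as input. The one-step predictor `mean_of_last_three` is a hypothetical stand-in chosen only to make the example runnable; it is not the time series prediction model of this specification.

```python
def rolling_forecast(history, predict_next, steps):
    """Predict `steps` future values one at a time, feeding each prediction back in."""
    window = list(history)
    forecasts = []
    for _ in range(steps):
        nxt = predict_next(window)
        forecasts.append(nxt)
        window.append(nxt)        # the T+1 prediction becomes input for T+2
    return forecasts


def mean_of_last_three(window):
    """Stand-in one-step predictor (hypothetical)."""
    return sum(window[-3:]) / 3


forecast = rolling_forecast([1.0, 2.0, 3.0], mean_of_last_three, steps=2)
```

Here the first forecast is the mean of 1, 2 and 3, and the second forecast already depends on the first, which mirrors the T+1 then T+2 order described above.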

In conclusion, in the process that the time series prediction model predicts the target feature sequences corresponding to the target time interval, the dependency relationship between the feature sequences of the time dimension is constructed by combining the attention mechanism, so that the accuracy of the time series prediction model is improved.

Step 110: and determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence.

Specifically, once the target feature sequence output by the time series prediction model has been obtained, the service data corresponding to the target time interval cannot yet be read directly from it. The target feature sequence needs to be converted into target prediction data, so that the target prediction data corresponding to each target time node in the target time interval can be determined from the target feature sequence.

Further, in the process of determining the target prediction data, in order to be able to correctly determine the target prediction data through the target feature sequence, the target feature sequence needs to be analyzed, and in one or more embodiments of this embodiment, a specific process of determining the target prediction data is as follows:

In practical application, the target feature sequence is converted according to each service dimension to obtain the target feature data, and the target feature data is then analyzed according to each target time node in the target time interval, so that the target prediction data can be obtained from the analysis result.

Continuing the above example, the business data of a stock for the next 1 to 7 days is predicted from the stock's historical business data for the past 365 days. The target feature sequence of the stock for the next 7 days is determined to be H. From H, the trading volume S and the trading price N of the trading dimensions over the 7-day target time interval are determined. Analyzing S and N by target time node (each day) in the 7-day target time interval gives the predicted trading volume S1 and trading price N1 on day 1 of L, the trading volume S2 and trading price N2 on day 2 of L, and so on up to the trading volume S7 and trading price N7 on day 7 of L.

Once the target feature sequence has been determined, the target feature data corresponding to each service dimension of the target feature sequence is determined first, and the target feature data is then processed according to the target time interval to determine the target prediction data of each target time node in that interval. Determining the target prediction data outside the time series prediction model reduces the workload of the model and improves the accuracy of the target prediction data.

Further, on the basis of determining the target feature data, the resource usage amount of each target time node in the target time interval may be evaluated according to the target feature data, and then resources are prepared according to the evaluation result, so as to avoid a situation that a service project cannot be normally operated due to too small resource amount after reaching the target time node, or a situation that resources are wasted due to too much resources, in one or more embodiments of this embodiment, the specific implementation manner is as follows:

In practical applications, clearing the used resources of each target time node in the target time interval according to the target prediction data specifically means calculating the amount of resources that each target time node will use, and then retrieving the same amount of prepared resources from the resource library according to the clearing result, for use by each target time node in the target time interval.

For example, it is predicted that stock A trades a daily volume S at a trading price N over the seven days beginning 1/7 of year 20××. The funds to be used over these seven days are cleared, and the daily fund flow is determined to be 2 yen; based on this daily fund flow, funds of 14 yen (7 × 2) are prepared for use over the seven days.

In summary, by preparing the resources needed to be used in the target time interval, it is achieved that sufficient preparation can be made for the business project, and the situations that the resource is wasted due to too much preparation of the prepared resources and the optimal transaction time is missed due to too little preparation of the prepared resources are avoided.
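The arithmetic of the fund example can be sketched as follows. The sketch assumes that the daily fund flow equals predicted volume multiplied by predicted price, which the text does not state explicitly; the per-day volumes and prices are hypothetical values chosen only so that each day's flow is 2, matching the example.

```python
def clear_resources(volumes, prices):
    """Return (per-node fund flow, total funds to prepare for the interval).

    Assumption: the flow at each target time node is volume * price.
    """
    flows = [s * n for s, n in zip(volumes, prices)]
    return flows, sum(flows)


predicted_volumes = [1] * 7   # hypothetical S1..S7
predicted_prices = [2] * 7    # hypothetical N1..N7

daily_flows, total_funds = clear_resources(predicted_volumes, predicted_prices)
print(daily_flows[0], total_funds)  # 2 14
```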

The service prediction method provided by this specification predicts the feature sequence corresponding to the target time interval through a time series prediction model comprising an encoder and a decoder, which improves the efficiency of feature sequence prediction. The method can predict the feature sequences of a plurality of time nodes in the target time interval, and by introducing an attention mechanism it establishes dependency relationships along the time dimension, which greatly improves the accuracy of predicting the target feature sequence and, in turn, the accuracy of determining the target prediction data.

Corresponding to the above method embodiment, the present specification further provides an embodiment of a service prediction apparatus, and fig. 7 shows a schematic structural diagram of a service prediction apparatus provided in an embodiment of the present specification. As shown in fig. 7, the apparatus includes:

an obtaining module 702, configured to obtain historical service data corresponding to each historical time node of the service item in the historical time interval;

a preprocessing module 704 configured to preprocess the historical service data according to each service dimension of the service project, and obtain feature data corresponding to each service dimension of each historical time node;

a constructing module 706 configured to construct a feature sequence corresponding to the historical time interval according to the feature data;

a prediction module 708, configured to input the feature sequence into a time sequence prediction model for prediction, and obtain a target feature sequence corresponding to a target time interval;

a determining module 710 configured to determine target prediction data corresponding to respective target time nodes in the target time interval based on the target feature sequence;

In an alternative embodiment, the time-series attention calculation performed by the encoder to obtain the intermediate feature sequence includes:

determining a convolution kernel and a step length preset in the encoder;

In an optional embodiment, the obtaining the intermediate feature sequence by performing the time-series attention calculation on the short feature sequence includes:

In an optional embodiment, the performing, by the decoder, cross attention calculation on the intermediate feature sequence and performing time-series attention calculation on a cross attention calculation result to obtain the target feature sequence includes:

In an optional embodiment, the obtaining a cross feature sequence by performing the cross attention calculation on the initial feature sequence includes:

In an optional embodiment, the performing the time-series attention calculation on the cross feature sequence to obtain the target feature sequence includes:

In an optional embodiment, the determining module 710 includes:

In an optional embodiment, the service prediction apparatus further includes:

In an alternative embodiment, the preprocessing module 704 includes:

an extracting unit configured to extract dimension business data corresponding to each business dimension from the historical business data based on the each business dimension;

and the processing unit is configured to standardize the dimensional service data according to the historical time nodes to obtain the feature data.
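The extraction and standardization performed by these two units can be sketched as below. The specification does not say which standardization is used, so z-score standardization per business dimension is an assumption here, and the record layout and dimension names are hypothetical.

```python
import statistics


def build_feature_sequence(records, dimensions):
    """Extract each business dimension and z-score it across the time nodes.

    records:    one dict per historical time node, in time order
    dimensions: names of the business dimensions to keep
    Returns one feature vector per time node (the feature sequence).
    """
    columns = []
    for dim in dimensions:
        values = [float(r[dim]) for r in records]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values) or 1.0   # avoid dividing by zero for constant data
        columns.append([(v - mean) / std for v in values])
    return [list(node) for node in zip(*columns)]


history = [
    {"volume": 100, "price": 10.0},
    {"volume": 120, "price": 11.0},
    {"volume": 140, "price": 12.0},
]
features = build_feature_sequence(history, ["volume", "price"])
```

Standardizing each dimension separately puts volume and price on a comparable scale before they are stacked into one feature vector per time node.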

In an alternative embodiment, the time series prediction model is trained by:

a sample acquisition module configured to acquire sample business data and construct a sample feature sequence corresponding to the sample business data;

a sequence determining module configured to determine a sample target feature sequence corresponding to the sample feature sequence;

a labeling module configured to label the sample target feature sequence based on each sample time node in a sample time interval, and take the labeled sample target feature sequence and the sample feature sequence as training samples;

and the training model module is configured to input the training samples into a time series prediction model constructed based on the incidence relation between the sample feature sequence and the sample target feature sequence for training, so as to obtain the time series prediction model.
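One way to pair sample feature sequences with their labelled sample target feature sequences is a sliding window over a long history. This is a sketch under that assumption; the specification does not prescribe how the pairs are cut, and the function name and window lengths are hypothetical.

```python
def make_training_samples(sequence, history_len, target_len):
    """Slide a window over the sequence to form (sample, labelled target) pairs."""
    samples = []
    last_start = len(sequence) - history_len - target_len
    for start in range(last_start + 1):
        past = sequence[start:start + history_len]
        future = sequence[start + history_len:start + history_len + target_len]
        samples.append((past, future))
    return samples


series = list(range(10))   # stand-in for ten time nodes of feature data
pairs = make_training_samples(series, history_len=5, target_len=2)
print(len(pairs))  # 4
```

Each pair holds five past nodes as the sample feature sequence and the two nodes that follow as its target, so one long history yields several training samples.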

The service prediction apparatus provided by this specification obtains the historical service data of each historical time node of a service item in a historical time interval, preprocesses the historical service data according to each service dimension of the service item to obtain the feature data corresponding to each service dimension of each historical time node, and constructs the feature sequence corresponding to the historical time interval from the feature data. It then inputs the feature sequence into the time series prediction model for prediction to obtain the target feature sequence corresponding to the target time interval, and finally determines the target prediction data corresponding to each target time node in the target time interval according to the target feature sequence. The feature sequence corresponding to the target time interval is thus predicted by a time series prediction model comprising an encoder and a decoder, which improves the efficiency of feature sequence prediction. The apparatus can predict the feature sequences of a plurality of time nodes in the target time interval, and by introducing an attention mechanism it establishes dependency relationships along the time dimension, which greatly improves the accuracy of predicting the target feature sequence and, in turn, the accuracy of determining the target prediction data.

The foregoing is a schematic scheme of a service prediction apparatus according to this embodiment. It should be noted that the technical solution of the service prediction apparatus and the technical solution of the service prediction method belong to the same concept; for details not described in the technical solution of the service prediction apparatus, refer to the description of the technical solution of the service prediction method.

Fig. 8 illustrates a block diagram of a computing device 800 provided in accordance with an embodiment of the present description. The components of the computing device 800 include, but are not limited to, memory 810 and a processor 820. The processor 820 is coupled to the memory 810 via a bus 830, and the database 850 is used to store data.

Computing device 800 also includes access device 840, access device 840 enabling computing device 800 to communicate via one or more networks 860. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 840 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)) whether wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the present description, the above-described components of computing device 800, as well as other components not shown in FIG. 8, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 8 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.

Computing device 800 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 800 may also be a mobile or stationary server.

Wherein, the processor 820 is configured to execute the following computer-executable instructions:

determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence; the time series prediction model comprises a decoder and an encoder, wherein the encoder performs time-series attention calculation on the feature sequence to obtain an intermediate feature sequence, and the decoder performs cross attention calculation on the intermediate feature sequence and performs time-series attention calculation on the cross attention calculation result to obtain the target feature sequence.

The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the service prediction method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the service prediction method.

An embodiment of the present specification also provides a computer readable storage medium storing computer instructions that, when executed by a processor, are operable to:

The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the service prediction method belong to the same concept; for details not described in the technical solution of the storage medium, reference may be made to the description of the technical solution of the service prediction method.

The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of combined acts, but those skilled in the art should understand that the present specification is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present specification. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments, and that the acts and modules referred to are not necessarily required by this specification.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. The alternative embodiments do not describe every detail exhaustively, nor do they limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the specification and its practical application, and thereby to enable others skilled in the art to best understand and use the specification. The specification is limited only by the claims and their full scope and equivalents.

Claims

1. A service prediction method, comprising:

2. The service prediction method according to claim 1, wherein the encoder performing time series attention calculation on the feature sequence to obtain an intermediate feature sequence comprises:

determining a convolution kernel and a step length preset in the encoder;

3. The service prediction method according to claim 2, wherein the performing the time series attention calculation on the short feature sequence to obtain the intermediate feature sequence comprises:

4. The service prediction method according to claim 3, wherein the decoder performing cross attention calculation on the intermediate feature sequence and performing time series attention calculation on the cross attention calculation result to obtain the target feature sequence comprises:

5. The service prediction method according to claim 4, wherein the performing the cross attention calculation on the initial feature sequence to obtain a cross feature sequence comprises:

6. The service prediction method according to claim 5, wherein the performing the time series attention calculation on the cross feature sequence to obtain the target feature sequence comprises:

7. The service prediction method according to claim 1, wherein the determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence comprises:

8. The service prediction method according to claim 1, further comprising, after the determining target prediction data corresponding to each target time node in the target time interval based on the target feature sequence:

9. The service prediction method according to claim 1, wherein the preprocessing the historical service data according to each service dimension of the service item to obtain feature data corresponding to each service dimension of each historical time node comprises:

10. The service prediction method according to claim 1, wherein the time series prediction model is trained by:

11. A service prediction apparatus, comprising:

12. The service prediction apparatus according to claim 11, wherein the determining module comprises:

13. The service prediction apparatus according to claim 11, further comprising:

14. A computing device, comprising:

a memory and a processor;

15. A computer-readable storage medium storing computer instructions which, when executed by a processor, carry out the steps of the service prediction method according to any one of claims 1 to 10.