CN110059747A - A network traffic classification method - Google Patents
A network traffic classification method
- Publication number
- CN110059747A CN110059747A CN201910314300.XA CN201910314300A CN110059747A CN 110059747 A CN110059747 A CN 110059747A CN 201910314300 A CN201910314300 A CN 201910314300A CN 110059747 A CN110059747 A CN 110059747A
- Authority
- CN
- China
- Prior art keywords
- network flow
- model
- network
- sample
- lightweight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/214 — Physics; Computing; Electric digital data processing; Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24 — Physics; Computing; Electric digital data processing; Pattern recognition; Classification techniques
- G06N3/084 — Physics; Computing arrangements based on biological models; Neural networks; Learning methods; Backpropagation, e.g. using gradient descent
Abstract
The present invention provides a network traffic classification method that includes building a lightweight classification model. Building the lightweight classification model comprises the following steps: S1: training a network traffic classification model with a deep-neural-network traffic denoising algorithm based on self-paced learning; S2: compressing the network traffic classification model into a lightweight network traffic classification model by a model compression technique based on knowledge distillation with a regularized loss. Through a new deep-neural-network traffic denoising algorithm based on self-paced learning, combined with knowledge distillation to compress the network traffic classification model, a final lightweight network traffic classification model is obtained, effectively improving the robustness, classification accuracy, and classification speed of the network traffic classification method.
Description
Technical field
The present invention relates to the field of network technology, and in particular to a network traffic classification method.
Background
The Internet and information technology are among the fastest-developing areas of scientific and technological research in today's world. In the short span of just over twenty years since China first connected to the Internet, through the unremitting efforts of domestic Internet enterprises, the Chinese Internet industry has gone from catching up with the world, to keeping pace with it, to taking a leading position, and its achievements are obvious to all. By June 2018 the number of Chinese netizens had reached 802 million, with an Internet penetration rate of 57.7%, an astonishing speed of development. Because it is convenient, highly mobile, and inexpensive, Internet technology is changing the way people live: from everyday chat, shopping, and entertainment to aerospace and guided weapons, nothing is independent of the Internet. The rapid development of the Internet has driven globalized production and a profound transformation of lifestyles. With the development of network theory and technology and the continuous improvement of network hardware performance, the traffic volume of the entire Internet keeps growing. Rising living standards have further driven the development of network applications and pushed network providers to keep improving their services. Today, the diversity and complexity of Internet traffic far exceed what the designers of the underlying Internet architecture originally imagined.
Network traffic is the essential carrier that records and reflects network activity and operating conditions. With the rapid development of the Internet and the need to meet the diverse demands of Internet users, new web services emerge endlessly, so network traffic keeps increasing in both volume and variety. These new applications expand the scale of the Internet and provide richer network services, but their application-protocol characteristics differ from those of traditional application types and have become more complex and diverse, greatly complicating the management and planning of network traffic.
Network traffic classification refers to classifying the TCP or UDP traffic generated by network communication based on the TCP/IP protocol suite according to its application type (e.g., FTP, HTTP, SMTP, 360, QQ). Traffic classification is the most basic function in key network-security technologies and in modern network management and security systems. It also plays a major role in QoS (quality of service) control and in the analysis of network application trends, and carries great practical value, specifically embodied in the following:
1. Through traffic identification, the distribution of resources inside the network can be controlled. Network operators and Internet service providers (ISPs) can apply it in quality-of-service (QoS) control mechanisms to guarantee the reasonable allocation of network resources such as bandwidth, steering the network toward more rational development. If traffic is classified at each critical point of the network and the different application protocols are matched adaptively, network administrators can apply effective, differentiated, fine-grained management to network traffic. This also helps solve various problems in network supervision and builds a healthier and more efficient network environment for users.
2. Identifying and classifying network traffic makes it possible to manage the service traffic of an enterprise or user, to adapt network resources dynamically at the macro level, and to tailor reasonable network operation schemes for users, realizing more efficient network applications. By identifying the traffic of different applications, a company can prohibit entertainment-related traffic during working hours, and a government department can prohibit unauthorized use of encrypted transmission services such as P2P. The recently emerged application-specific traffic discounts (such as the Tencent King Card) likewise take traffic classification technology as their core.
3. Traffic identification and classification play a significant role in safeguarding cyberspace security. For example, an intrusion detection system (IDS) can use traffic classification to identify malicious traffic and take measures such as isolation; by accurately identifying malicious attack traffic such as Trojans and Web injections, it can warn of or block possible attacks, protect network equipment, and keep network systems running safely and reliably. In sensitive networks such as those of state enterprises, traffic classification can also be used to accurately identify and supervise inbound traffic, effectively monitoring and managing it and preventing the leakage of confidential or sensitive information that could cause serious information-security incidents. In addition, in cloud-computing environments, traffic classification plays an extremely important role in guaranteeing the quality of cloud services.
In the prior art, however, real network environments contain a large amount of noise traffic that interferes with traffic classification.
Summary of the invention
To solve the problem that real network environments contain a large amount of noise traffic, the present invention provides a network traffic classification method.
To solve the above problems, the technical solution adopted by the present invention is as follows:
A network traffic classification method includes building a lightweight classification model, which comprises the following steps: S1: training a network traffic classification model with a deep-neural-network traffic denoising algorithm based on self-paced learning; S2: compressing the network traffic classification model into a lightweight network traffic classification model by a model compression technique based on knowledge distillation with a regularized loss.
In an embodiment of the present invention, step S1 comprises the following steps. S11: a sample of network traffic data passes through the deep neural network to obtain an output value. S12: a loss value is computed from the output value and the sample's true label. S13: the loss value is compared with a threshold; if the loss value is greater than the threshold, the sample of network traffic data is ignored; if the loss value is less than the threshold, the loss is back-propagated to optimize the parameters of the network traffic classification model. The parameters are optimized with a mini-batch gradient descent algorithm: in each mini-batch, the average learning complexity of the training samples of network traffic data is computed, and the parameters are updated with a weight of corresponding size during training of the network traffic classification model. An optimal initial learning rate is obtained, the network traffic classification model is trained with it, and the learning rate is adjusted periodically. The method for obtaining the optimal initial learning rate comprises the following steps. T1: an initial learning rate is set and the network traffic classification model is trained; the model is updated on the samples of network traffic data after each batch, and the learning rate is increased with each update. T2: the loss of each batch of samples of network traffic data is computed, yielding the curve of learning rate versus loss. T3: the optimal initial learning rate is obtained from the curve.
In another embodiment of the present invention, step S2 comprises the following steps. S21: the concept of "temperature" is introduced into the softmax layer of the network traffic classification model, so that the softmax output becomes:

q_i = exp(z_i / T) / Σ_j exp(z_j / T)

where T is the temperature and z_i are the logits, and a new network traffic classification model is trained at a temperature T > 1. S22: each sample of network traffic data is input to the new network traffic classification model to obtain its output at the softmax layer, i.e., its soft label. S23: an L1 regularization method is then introduced, and the lightweight network traffic classification model is trained at the same temperature T using the soft labels together with the original true labels. The proposed regularized loss function is:

Loss = Loss1 + Loss2 + λ‖w‖₁

where λ is a hyperparameter and ‖w‖₁ is the L1 norm of the lightweight model's parameters w. Loss1 is the cross entropy between the soft labels obtained by the new network traffic classification model and by the lightweight network traffic classification model at the same temperature T (T > 1); Loss2 is the cross entropy between the lightweight model's softmax output at temperature T = 1 and the true label y_true; α is a hyperparameter that adjusts the ratio of Loss1 to Loss2, and CrossEntropy denotes the cross-entropy loss function. Passing network traffic data through the lightweight network traffic classification model, whose softmax temperature is set to 1, yields the predicted label.
The present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of any of the methods described above.
The beneficial effects of the invention are: a network traffic classification method is provided that, through a new deep-neural-network traffic denoising algorithm based on self-paced learning combined with knowledge distillation, compresses the network traffic classification model into a final lightweight network traffic classification model, effectively improving the robustness, classification accuracy, and classification speed of the method.
Brief description of the drawings
Fig. 1 is a schematic diagram of the network traffic classification method in an embodiment of the present invention.
Fig. 2 is a schematic diagram of the method of building the lightweight classification model in an embodiment of the present invention.
Fig. 3 is a schematic diagram of the structure and flow of the deep neural network based on self-paced learning in an embodiment of the present invention.
Fig. 4 is a schematic diagram of the method of training the network traffic classification model in an embodiment of the present invention.
Fig. 5 is a schematic diagram of the method of obtaining the optimal initial learning rate in an embodiment of the present invention.
Fig. 6 is a flow diagram of building the lightweight network traffic classification model with the model compression technique based on knowledge distillation with a regularized loss, in an embodiment of the present invention.
Fig. 7 is a schematic diagram of the method of building the lightweight network traffic classification model with the model compression technique based on knowledge distillation with a regularized loss, in an embodiment of the present invention.
Fig. 8 is a flow diagram of using the lightweight network traffic classification model in an embodiment of the present invention.
Fig. 9 is a schematic diagram of a network traffic classification terminal device in an embodiment of the present invention.
Detailed description of the embodiments
To make the technical problems to be solved, the technical solutions, and the beneficial effects of the embodiments of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention, not to limit it.
It should be noted that when an element is described as being "fixed to" or "disposed on" another element, it may be directly or indirectly on that other element. When an element is described as being "connected to" another element, the connection may be direct or indirect; it may serve a fixing function or a circuit-communication function.
It should be understood that terms such as "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on the drawings; they are used only for convenience and simplicity of description and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore must not be construed as limiting the invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and must not be understood as indicating or implying relative importance or implicitly indicating the number of technical features. A feature defined by "first" or "second" may thus explicitly or implicitly include one or more such features. In the description of the embodiments of the present invention, "plurality" means two or more, unless specifically defined otherwise.
Embodiment 1
As shown in Fig. 1, the method consists of two main parts. The first is a deep-neural-network traffic denoising algorithm based on self-paced learning. The idea of the algorithm is: after the network traffic data passes through the deep neural network and the loss value is computed, back-propagation is not performed immediately to optimize the model parameters; instead, the loss value is compared with a decision threshold. If the loss is greater than the threshold, the sample is ignored in this round of training. Meanwhile, the decision threshold keeps increasing with the number of training rounds, so that the model gradually learns more complex samples. Cycling in this way, the deep learning model selects only the easier samples for learning in each round and finally rejects the outliers. Since the resulting model is never trained on noise samples, it will have stronger noise resistance in real network environments. The second part is a model compression technique based on knowledge distillation with a regularized loss: first, the concept of "temperature" is added to the softmax layer, and a complex traffic classification model is trained at a higher temperature T. Then the output values of the complex model's softmax layer for the traffic (the soft labels), together with the original labels and L1 regularization, are used to train a new lightweight model at the same temperature T. In this way the lightweight model not only retains the knowledge acquired by the original complex model, but also gains stronger generalization ability. When the lightweight model is actually used for traffic classification, the softmax temperature is first reset to 1, and new traffic passed through the trained lightweight model yields the predicted label. The compressed lightweight model detects faster and can meet the classification-speed requirements of real network environments.
As shown in Fig. 2, a network traffic classification method is characterized by building a lightweight classification model, which comprises the following steps:
S1: training a network traffic classification model with a deep-neural-network traffic denoising algorithm based on self-paced learning;
S2: compressing the network traffic classification model into a lightweight network traffic classification model by a model compression technique based on knowledge distillation with a regularized loss.
The two steps of building the lightweight classification model are described in detail below.
Self-paced learning imitates the human cognitive mechanism: it starts with simple, general knowledge and then gradually increases the difficulty, transitioning to more complex and specialized knowledge. For a machine learning model, this means first selecting the better-learned samples (those with lower loss under the current model parameters) for training, and then gradually enlarging the tolerable loss value to select further samples until the model parameters stabilize. The samples that are never selected are exactly the noise samples (outliers) to be removed. Because the learning process starts from the simple part, the final model parameters are more likely to reach the global optimum; moreover, the step-by-step screening automatically rejects noise samples and improves the robustness of the algorithm.
Combining self-paced learning with deep learning, a deep-neural-network traffic denoising algorithm based on self-paced learning is proposed to further remove noisy network traffic data. Specifically, after the network traffic data passes through the deep neural network and the loss value is computed, back-propagation is not performed immediately; instead, the loss value is compared with a decision threshold, and samples whose loss exceeds the threshold are ignored in this round of training. The decision threshold keeps increasing with the number of training rounds, so that the model gradually learns more complex samples. Cycling in this way, the deep learning model selects only the easier samples in each round and finally weeds out the outliers. The initial threshold is set manually and is typically small; it is then increased gradually, and the frequency of increase is also set manually.
As shown in Fig. 3, since the resulting model is never trained on noise samples, it possesses stronger noise resistance in real network environments.
As shown in Fig. 4, step S1 comprises the following steps:
S11: a sample of network traffic data passes through the deep neural network to obtain an output value;
S12: a loss value is computed from the output value and the sample's true label;
S13: the loss value is compared with a threshold; if the loss value is greater than the threshold, the sample of network traffic data is ignored; if the loss value is less than the threshold, the loss is back-propagated to optimize the parameters of the network traffic classification model.
The most common optimization method in deep learning today is gradient descent, which can be divided into batch gradient descent, stochastic gradient descent, and mini-batch gradient descent. In batch gradient descent, the objective function is computed over the entire training set; if the dataset is large, this may exhaust memory, and convergence is generally slow. Stochastic gradient descent is the other extreme: the objective function is computed for a single training sample (also known as online learning), so a gradient can be computed and back-propagated immediately after each sample's forward pass and loss computation, without waiting for the loss of all samples; each sample obtained allows one parameter update. Compared with traditional gradient descent it converges somewhat faster and greatly speeds up training, but the objective value may oscillate, because high-frequency parameter updates cause high variance.
Once self-paced learning is introduced into this model, each training round selects only the easier samples. If back-propagation changed the model parameters before all the samples to be learned in the round had been determined, the subsequent sample selection would change. This means the samples selected in a round would depend on the order in which samples enter the model, which is not what we want. Mini-batch gradient descent is a compromise: a small batch of samples from the training set (usually a power of 2, such as 32, 64, or 128) is used for each computation. This keeps the training process more stable while retaining the matrix-computation advantages of batch training. It is the most common gradient descent algorithm, but in this model it still suffers from the same problem as stochastic gradient descent. To overcome this, a weight parameter is introduced here and combined with mini-batch gradient descent: for each mini-batch, the average learning complexity of the training samples of network traffic data is computed, and the parameters are updated with a weight of corresponding size, so that each mini-batch update has a different degree of update relative to the complexity of its samples.
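A minimal numerical sketch of this weighted mini-batch update on a toy linear model follows. The patent does not state the exact weighting rule, so scaling the step by 1/(1 + average batch loss) is an assumed choice that gives easier (lower-loss) batches fuller updates, which matches the described intent.

```python
import numpy as np

def weighted_minibatch_step(w, X, y, lr):
    """One mini-batch step for a linear model y ~ X @ w, with the update
    scaled by a weight derived from the batch's average loss (a proxy for
    its "learning complexity"). The weighting rule 1/(1 + avg_loss) is an
    illustrative assumption, not the patent's formula."""
    residual = X @ w - y
    avg_loss = float(np.mean(residual ** 2))   # batch learning complexity
    weight = 1.0 / (1.0 + avg_loss)            # easier batch -> larger step
    grad = 2.0 * X.T @ residual / len(y)       # MSE gradient
    return w - lr * weight * grad, avg_loss
```

Repeated over batches, harder (higher-loss) batches move the parameters less, damping the order-dependence that plain stochastic updates would introduce into the self-paced sample selection.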
Because the easier samples inherently have smaller losses, a small learning rate would make convergence very slow or even impossible; but blindly using a large learning rate may cause the model parameters to oscillate within a region and fail to converge to a better value. To solve this problem, the learning rate is adjusted periodically in this model. Starting from a very small initial learning rate, such as 0.00001, the network is updated after each batch while the learning rate is increased, and the loss computed for each batch is recorded. Finally, the curve of learning rate versus loss can be plotted, from which the best initial learning rate can be found. Training then starts from this initial learning rate, and during training the global learning rate is randomized within a certain range rather than fixed, allowing the model to converge to the optimum faster. The optimal initial learning rate is thus obtained, the network traffic classification model is trained, and the learning rate is adjusted periodically.
As shown in Fig. 5, the method of obtaining the optimal initial learning rate comprises the following steps:
T1: an initial learning rate is set and the network traffic classification model is trained; the model is updated on the samples of network traffic data after each batch, and the learning rate is increased with each update;
T2: the loss of each batch of samples of network traffic data is computed, yielding the curve of learning rate versus loss;
T3: the optimal initial learning rate is obtained from the curve.
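Steps T1-T3 amount to a learning-rate range test. The sketch below assumes `train_step(lr)` performs one batch update at rate `lr` and returns the batch loss; picking the rate at the steepest loss drop is one common reading of "found from the curve", not the patent's exact rule.

```python
import numpy as np

def lr_range_test(train_step, lrs):
    """Learning-rate range test (steps T1-T3): train while the learning
    rate increases across `lrs`, record the per-batch loss, and pick the
    rate where the loss falls fastest. `train_step(lr)` is a stand-in for
    one batch update that returns the batch loss."""
    losses = [train_step(lr) for lr in lrs]   # T1/T2: loss per batch
    drops = np.diff(losses)                   # slope of the loss curve
    best = int(np.argmin(drops))              # T3: steepest decrease
    return lrs[best], losses
```

In practice the rates would be spaced exponentially from a very small start (e.g. 1e-5), and training would then begin from the selected rate.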
As shown in Fig. 6, a network traffic classification model in a real network environment must detect traffic quickly and with low latency. Since knowledge distillation can transfer the knowledge acquired by a complex machine learning model to another compact machine learning model, preserving good generalization while shrinking the model, this application builds a lightweight model by knowledge distillation and devises a model compression technique based on knowledge distillation with a regularized loss: the softmax-layer output values produced for the traffic by the original complex model are used together with the original labels to train a new lightweight model. Unlike the original label, which has a single position equal to 1 and all others equal to 0, the softmax output of the complex model may have some value at every position, so it is called a soft label; the original label is also called a hard label. The soft labels retain the knowledge acquired by the complex model and are of great help in training the new model.
As shown in Fig. 7, step S2 comprises the following steps:
S21: the concept of "temperature" is introduced into the softmax layer of the network traffic classification model, so that the softmax output becomes:

q_i = exp(z_i / T) / Σ_j exp(z_j / T)

where T is the temperature and z_i are the logits, and a new network traffic classification model is trained at a higher temperature T. Clearly, the larger the temperature, the smaller the differences between the probabilities output by the softmax layer: the originally small output probabilities are relatively amplified, which prevents the model from ignoring these small probability values and losing the information they carry. A complex but highly accurate new network traffic classification model is trained at a higher temperature T, each network traffic training sample is input to it, and each sample's output at the softmax layer serves as the soft label for the subsequent training of the lightweight network traffic classification model.
S22: each sample of network traffic data is input to the new network traffic classification model to obtain its output at the softmax layer, i.e., its soft label;
S23: an L1 regularization method is then introduced, and the lightweight network traffic classification model is trained at the same temperature T using the soft labels and the original true labels.
The proposed regularized loss function is:

Loss = Loss1 + Loss2 + λ‖w‖₁

where λ is a hyperparameter and ‖w‖₁ is the L1 norm of the lightweight model's parameters w. The L1 norm sparsifies the model parameters, making the model simpler; after the whole model is trained, connection edges whose weights are 0 are deleted to further optimize the model structure. Loss1 is the cross entropy between the soft labels obtained by the new network traffic classification model and by the lightweight network traffic classification model at the same temperature T (T > 1); Loss2 is the cross entropy between the lightweight model's softmax output at temperature T = 1 and the true label y_true; α is a hyperparameter that adjusts the ratio of Loss1 to Loss2, and CrossEntropy denotes the cross-entropy loss function. Loss1 lets the lightweight network traffic classification model learn the knowledge of the original complex model, while Loss2 lets it also learn directly from the original hard labels, i.e., the true labels. Since the magnitude of the gradient of the cross entropy computed on soft labels is 1/T² of that computed on hard labels, Loss1 is multiplied by T² to balance Loss1 and Loss2. In this way, the lightweight network traffic classification model not only retains the knowledge acquired by the new network traffic classification model, but also gains stronger generalization ability.
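Under the definitions above, the regularized distillation loss can be sketched as follows. The exact placement of α (scaling Loss1 by α and Loss2 by 1 − α) is an assumption consistent with "adjusting the ratio of Loss1 and Loss2"; the T² factor on Loss1 follows the gradient-balancing argument in the text.

```python
import numpy as np

def softmax_t(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p_target, p_pred, eps=1e-12):
    return float(-np.sum(p_target * np.log(p_pred + eps)))

def distill_loss(student_logits, teacher_logits, y_true, w,
                 T=4.0, alpha=0.7, lam=1e-4):
    """Regularized distillation loss Loss = Loss1 + Loss2 + lam * ||w||_1.
    Loss1: cross entropy between teacher and student soft labels at
    temperature T, scaled by T^2 to balance gradient magnitudes.
    Loss2: cross entropy against the hard (true) label at T = 1.
    The alpha split between the two terms is an illustrative assumption."""
    soft_teacher = softmax_t(teacher_logits, T)
    soft_student = softmax_t(student_logits, T)
    hard_student = softmax_t(student_logits, 1.0)
    loss1 = alpha * T * T * cross_entropy(soft_teacher, soft_student)
    loss2 = (1.0 - alpha) * cross_entropy(y_true, hard_student)
    return loss1 + loss2 + lam * float(np.abs(w).sum())
```

A student whose logits track the teacher's (and the true label) scores a lower loss than one that contradicts them, which is exactly what drives the lightweight model toward the teacher's knowledge.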
As shown in Fig. 8, when the lightweight network traffic classification model is actually used for traffic classification, new network traffic data passed through the trained lightweight model (with the softmax temperature set to 1) yields the predicted label. The temperature is set to 1 so that the probabilities of the softmax layer's output dimensions are not too close together, achieving a better classification effect.
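Inference with the compressed model, as Fig. 8 describes, reduces to resetting the temperature to 1 and taking the argmax of the softmax output. The class names below are illustrative stand-ins for the traffic application types.

```python
import numpy as np

def predict_label(logits, class_names):
    """Inference with the compressed model: the softmax temperature is
    reset to 1 so class probabilities are well separated, and the
    predicted label is the argmax. `class_names` is illustrative."""
    z = np.asarray(logits, dtype=float)
    e = np.exp(z - z.max())
    probs = e / e.sum()
    return class_names[int(np.argmax(probs))], probs
```

For example, logits from the lightweight model for a new flow would map straight to an application-type label such as "HTTP" or "FTP".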
Since real network environments contain a great deal of noisy traffic, the present invention proposes a new deep neural network traffic denoising algorithm based on self-paced learning, and combines it with the knowledge distillation technique to compress the network traffic classification model and obtain the final lightweight network traffic classification model. Specifically, a deep neural network traffic denoising algorithm based on self-paced learning is proposed by combining self-paced learning with deep learning, to achieve the purpose of further removing noisy network flow data. After the network flow data passes through the deep neural network and the loss value is computed, back-propagation is not performed directly to optimize the model parameters; instead, the loss value is compared with a decision threshold. If the loss exceeds the threshold, the sample is ignored in this round of training. Meanwhile, the decision threshold increases steadily with the number of training epochs, so that the model gradually learns more complex samples. Through such iteration, the deep learning model each time selects only the samples that are easier to learn, and outliers are finally weeded out. Because the final model is never trained on noisy samples, it possesses stronger noise resistance in real network environments, effectively improving the robustness and classification accuracy of the network traffic classification model.
After the trained traffic classification model is obtained, the model compression technique based on regularization-loss knowledge distillation proposed in the present invention is used to compress the model. The compressed lightweight model not only retains the knowledge acquired by the original complex model, but also possesses stronger generalization ability and a faster detection speed, satisfying the classification tasks of real network environments.
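A minimal sketch of the regularization-loss distillation objective follows. The patent specifies Loss = Loss1 + Loss2 + λ||w||₁ with α adjusting the ratio of Loss1 to Loss2; the α/(1−α) placement, the function names, and all numeric values below are illustrative assumptions.

```python
import numpy as np

def cross_entropy(target, pred, eps=1e-12):
    # Cross entropy between a target distribution and a predicted one.
    return -float(np.sum(np.asarray(target) * np.log(np.asarray(pred) + eps)))

def distillation_loss(teacher_soft, student_soft, student_hard,
                      y_true, weights, T=5.0, alpha=0.5, lam=1e-4):
    # Loss1: match the teacher's soft labels; multiplied by T^2 because the
    # soft-label gradient magnitude scales as 1/T^2 relative to hard labels.
    loss1 = alpha * (T ** 2) * cross_entropy(teacher_soft, student_soft)
    # Loss2: learn directly from the true (hard) labels at T = 1.
    loss2 = (1.0 - alpha) * cross_entropy(y_true, student_hard)
    # L1 regularization on the lightweight model's parameters w.
    l1 = lam * float(np.sum(np.abs(weights)))
    return loss1 + loss2 + l1

loss = distillation_loss(
    teacher_soft=[0.5, 0.3, 0.2],    # teacher soft labels at temperature T
    student_soft=[0.4, 0.35, 0.25],  # student softmax output at the same T
    student_hard=[0.8, 0.15, 0.05],  # student softmax output at T = 1
    y_true=[1.0, 0.0, 0.0],          # one-hot true label
    weights=np.array([0.5, -0.2, 0.1]))
```

The L1 term encourages sparse lightweight-model weights, complementing the size reduction that distillation itself provides.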
Embodiment 2
As shown in figure 9, a schematic diagram of the network traffic classification terminal device provided by an embodiment of the present invention. The network traffic classification terminal device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and runnable on the processor, such as a program for building the lightweight network traffic classification model. When the processor executes the computer program, the steps in each of the above network traffic classification method embodiments are realized, for example steps S1-S2 shown in figure 2. Alternatively, when the processor executes the computer program, the functions of each module/unit in each of the above apparatus embodiments are realized, for example the unit for training the network traffic classification model and the unit for building the lightweight network traffic classification model.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program in the network traffic classification terminal device. For example, the computer program may be divided into a unit for training the network traffic classification model and a unit for building the lightweight network traffic classification model, with the specific functions of each module as follows: training the network traffic classification model based on the deep neural network traffic denoising algorithm of self-paced learning; and compressing the network traffic classification model into the lightweight network traffic classification model by the model compression technique based on regularization-loss knowledge distillation.
The network traffic classification terminal device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The network traffic classification terminal device may include, but is not limited to, a processor and a memory. Those skilled in the art will appreciate that the schematic diagram is merely an example of the network traffic classification terminal device and does not constitute a limitation on it; the device may include more or fewer components than illustrated, combine certain components, or use different components. For example, the network traffic classification terminal device may also include input/output devices, network access devices, buses, and the like.
The processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the network traffic classification terminal device, and uses various interfaces and lines to connect the various parts of the entire network traffic classification terminal device.
The memory may be used to store the computer program and/or modules. The processor realizes the various functions of the network traffic classification terminal device by running or executing the computer program and/or modules stored in the memory and by calling the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the data storage area may store data created according to the use of the mobile phone (such as audio data, a phone book, etc.). In addition, the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the integrated modules/units of the network traffic classification terminal device are realized in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention realizes all or part of the processes in the above embodiment methods, which may also be completed by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the steps of each of the above method embodiments can be realized. The computer program includes computer program code, and the computer program code may be in source code form, object code form, an executable file, or certain intermediate forms, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content included in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above content is a further detailed description of the present invention in combination with specific preferred embodiments, and it cannot be considered that the specific implementation of the present invention is limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, several equivalent substitutions or obvious modifications with identical performance or use may also be made without departing from the concept of the present invention, all of which should be regarded as belonging to the protection scope of the present invention.
Claims (10)
1. A network traffic classification method, characterized by comprising building a lightweight classification model;
the building of the lightweight classification model comprises the following steps:
S1: training a network traffic classification model based on a deep neural network traffic denoising algorithm of self-paced learning;
S2: compressing the network traffic classification model into a lightweight network traffic classification model by a model compression technique based on regularization-loss knowledge distillation.
2. The network traffic classification method of claim 1, characterized in that step S1 comprises the following steps:
S11: a sample of the network flow data passes through the deep neural network to obtain an output value;
S12: a loss value is calculated from the output value and the true label of the sample of the network flow data;
S13: the loss value is compared with a threshold; if the loss value is greater than the threshold, the sample of the network flow data is ignored; if the loss value is less than the threshold, the loss value is back-propagated to optimize the parameters of the network traffic classification model.
3. The network traffic classification method of claim 2, characterized in that the method of optimizing the parameters of the network traffic classification model uses a mini-batch gradient descent algorithm.
4. The network traffic classification method of claim 3, characterized in that, by calculating each time the average learning complexity of the samples of the network flow data used for training in the mini-batch gradient descent algorithm, the parameters are updated with a weight of corresponding size when training the network traffic classification model.
5. The network traffic classification method of claim 2, characterized by obtaining an optimal initial learning rate, training the network traffic classification model, and periodically adjusting the learning rate.
6. The network traffic classification method of claim 5, characterized in that the method of obtaining the optimal initial learning rate comprises the following steps:
T1: setting an initial learning rate and training the network traffic classification model, updating on the samples of the network flow data after each batch, and increasing the learning rate at each such update;
T2: calculating the loss of each sample of the network flow data to obtain the change curve of learning rate versus loss;
T3: obtaining the optimal initial learning rate.
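Steps T1-T3 resemble a learning-rate range test; the following sketch only illustrates the mechanics. The loss formula here is simulated, and the growth factor and selection rule are illustrative assumptions — in practice the losses come from real training batches and the curve is inspected to pick the rate.

```python
def lr_range_test(initial_lr=1e-6, growth=1.5, steps=30):
    # T1: start from a tiny learning rate and increase it after every batch.
    # The per-batch loss below is SIMULATED; real use measures training loss.
    lr, curve = initial_lr, []
    for _ in range(steps):
        loss = 1.0 / (1.0 + 100.0 * lr) + (50.0 * lr) ** 2  # simulated loss
        curve.append((lr, loss))  # T2: record the (learning rate, loss) curve
        lr *= growth
    # T3: choose the learning rate where the loss drops most steeply,
    # before the curve turns upward (divergence at large learning rates).
    drops = [(curve[i][1] - curve[i + 1][1], curve[i][0])
             for i in range(len(curve) - 1)]
    _, best_lr = max(drops)
    return best_lr, curve

best_lr, curve = lr_range_test()
```

The curve typically falls, flattens, then explodes; the steepest-descent point gives a workable initial learning rate for the subsequent full training run.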
7. The network traffic classification method of claim 1, characterized in that step S2 comprises the following steps:
S21: the concept of "temperature" is introduced in the softmax layer of the network traffic classification model, so that the output of the softmax layer becomes:
q_i = exp(z_i / T) / Σ_j exp(z_j / T)
wherein T is the temperature, z_i are the logits, and a new network traffic classification model is trained at temperature T, T > 1;
S22: each sample of the network flow data is input to the new network traffic classification model to obtain the output of each sample at the softmax layer, i.e. the soft label;
S23: an L1 regularization method is further introduced, and at the same temperature T, the lightweight network traffic classification model is trained using the soft labels and the original true labels.
8. The network traffic classification method of claim 7, characterized in that the proposed regularization loss function is as follows:
Loss = Loss1 + Loss2 + λ||w||₁
wherein λ is a hyperparameter, ||w||₁ is the L₁ norm of the lightweight model parameters w; the soft labels are those obtained by the new network traffic classification model and the lightweight network traffic classification model respectively at the same temperature T, T > 1, and the output value of the softmax layer of the lightweight model at temperature T = 1 is also used; y_true is the true label; α is a hyperparameter for adjusting the ratio of Loss1 to Loss2; CrossEntropy denotes the cross-entropy loss function.
9. The network traffic classification method of claim 8, characterized in that the network flow data passes through the lightweight network traffic classification model to obtain the predicted label, the temperature of the softmax layer of the lightweight network traffic classification model being 1.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method of any one of claims 1-9 are realized.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910314300.XA CN110059747B (en) | 2019-04-18 | 2019-04-18 | Network traffic classification method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910314300.XA CN110059747B (en) | 2019-04-18 | 2019-04-18 | Network traffic classification method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN110059747A true CN110059747A (en) | 2019-07-26 |
| CN110059747B CN110059747B (en) | 2021-12-14 |
Family
ID=67319551
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910314300.XA Active CN110059747B (en) | 2019-04-18 | 2019-04-18 | Network traffic classification method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN110059747B (en) |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9104966B2 (en) * | 2011-11-23 | 2015-08-11 | Tata Consultancy Services Limited | Self configuring knowledge base representation |
| US20180039887A1 (en) * | 2016-08-08 | 2018-02-08 | EyeEm Mobile GmbH | Systems, methods, and computer program products for extending, augmenting and enhancing searching and sorting capabilities by learning and adding concepts on the fly |
| US20180268203A1 (en) * | 2017-03-17 | 2018-09-20 | Nec Laboratories America, Inc. | Face recognition system for face recognition in unlabeled videos with domain adversarial learning and knowledge distillation |
| CN108664893A (en) * | 2018-04-03 | 2018-10-16 | 福州海景科技开发有限公司 | A kind of method for detecting human face and storage medium |
| CN108733792A (en) * | 2018-05-14 | 2018-11-02 | 北京大学深圳研究生院 | A kind of entity relation extraction method |
| CN108875779A (en) * | 2018-05-07 | 2018-11-23 | 深圳市恒扬数据股份有限公司 | Training method, device and the terminal device of neural network |
| CN108960407A (en) * | 2018-06-05 | 2018-12-07 | 出门问问信息科技有限公司 | Recurrent neural network language model training method, device, equipment and medium |
| CN109218223A (en) * | 2018-08-08 | 2019-01-15 | 西安交通大学 | A kind of robustness net flow assorted method and system based on Active Learning |
| CN109639739A (en) * | 2019-01-30 | 2019-04-16 | 大连理工大学 | A kind of anomalous traffic detection method based on autocoder network |
2019
- 2019-04-18: application CN201910314300.XA granted as patent CN110059747B (status: active)
Non-Patent Citations (5)
| Title |
|---|
| GEOFFREY HINTON 等: "Distilling the Knowledge in a Neural Network", 《ARXIV:1503.02531V1》 * |
| GUOCONG SONG 等: "Collaborative Learning for Deep Neural Networks", 《32ND CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2018)》 * |
| WEI-CHUN CHEN 等: "Knowledge Distillation with Feature Maps for Image Classification", 《ACCV 2018》 * |
| ZHAO Shengwei et al.: "Traffic Sign Classification Based on Enhanced Supervised Knowledge Distillation", 《China Sciencepaper》 * |
| HAN Chunhao: "Research and Application of Network Traffic Classification", 《China Masters' Theses Full-text Database, Information Science and Technology (monthly)》 * |
Cited By (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110708318A (en) * | 2019-10-10 | 2020-01-17 | 国网湖北省电力有限公司电力科学研究院 | Network abnormal flow prediction method based on improved radial basis function neural network algorithm |
| CN111008693A (en) * | 2019-11-29 | 2020-04-14 | 深动科技(北京)有限公司 | Network model construction method, system and medium based on data compression |
| CN111008693B (en) * | 2019-11-29 | 2024-01-26 | 小米汽车科技有限公司 | Network model construction method, system and medium based on data compression |
| CN111738436B (en) * | 2020-06-28 | 2023-07-18 | 电子科技大学中山学院 | Model distillation method and device, electronic equipment and storage medium |
| CN111738436A (en) * | 2020-06-28 | 2020-10-02 | 电子科技大学中山学院 | Model distillation method and device, electronic equipment and storage medium |
| CN112367273A (en) * | 2020-10-30 | 2021-02-12 | 上海瀚讯信息技术股份有限公司 | Knowledge distillation-based flow classification method and device for deep neural network model |
| CN112367273B (en) * | 2020-10-30 | 2023-10-31 | 上海瀚讯信息技术股份有限公司 | Flow classification method and device of deep neural network model based on knowledge distillation |
| CN114550125A (en) * | 2020-11-25 | 2022-05-27 | 顺丰科技有限公司 | Road sign information identification method, device, computer equipment and storage medium |
| CN112651379A (en) * | 2021-01-12 | 2021-04-13 | 杭州像素元科技有限公司 | Single-lane congestion detection method based on deep learning |
| CN114095447B (en) * | 2021-11-22 | 2024-03-12 | 成都中科微信息技术研究院有限公司 | Communication network encryption flow classification method based on knowledge distillation and self-distillation |
| CN114095447A (en) * | 2021-11-22 | 2022-02-25 | 成都中科微信息技术研究院有限公司 | Communication network encrypted flow classification method based on knowledge distillation and self-distillation |
| CN114301850A (en) * | 2021-12-03 | 2022-04-08 | 成都中科微信息技术研究院有限公司 | Military communication encrypted flow identification method based on generation countermeasure network and model compression |
| CN114301850B (en) * | 2021-12-03 | 2024-03-15 | 成都中科微信息技术研究院有限公司 | Military communication encryption flow identification method based on generation of countermeasure network and model compression |
| CN115348551A (en) * | 2022-07-22 | 2022-11-15 | 南京邮电大学 | Lightweight service identification method and device, electronic equipment and storage medium |
| CN119172152A (en) * | 2024-09-26 | 2024-12-20 | 湖南大学 | Network traffic classification method, device, computer equipment, and storage medium |
| CN120301742A (en) * | 2025-06-11 | 2025-07-11 | 中国人民解放军陆军工程大学 | A signal modulation recognition method based on self-paced self-distillation adversarial training |
| CN120301742B (en) * | 2025-06-11 | 2025-09-16 | 中国人民解放军陆军工程大学 | Signal modulation mode identification method based on self-walking self-distillation countermeasure training |
Also Published As
| Publication number | Publication date |
|---|---|
| CN110059747B (en) | 2021-12-14 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110059747A (en) | A kind of net flow assorted method | |
| CN113064879B (en) | Database parameter adjusting method and device and computer readable storage medium | |
| CN112329948B (en) | Multi-agent strategy prediction method and device | |
| CN110417607B (en) | A method, device and equipment for traffic forecasting | |
| JP2023502860A (en) | Information processing method, device, computer program and electronic device | |
| CN111061959B (en) | A crowd-intelligent software task recommendation method based on developer characteristics | |
| CN111310918B (en) | Data processing method, device, computer equipment and storage medium | |
| CN110011932A (en) | A kind of the net flow assorted method and terminal device of recognizable unknown flow rate | |
| CN111957047A (en) | Checkpoint configuration data adjusting method, computer equipment and storage medium | |
| CN114662705A (en) | Federal learning method, device, electronic equipment and computer readable storage medium | |
| Singhal et al. | Greedy shapley client selection for communication-efficient federated learning | |
| Alawe et al. | An efficient and lightweight load forecasting for proactive scaling in 5G mobile networks | |
| CN118353797A (en) | Network traffic prediction method, device, electronic device and storage medium | |
| CN115499384B (en) | A Data Flow Identification Method Based on EHHO-SVR | |
| CN120218180A (en) | Distillation method, electronic device and storage medium based on sparse large language model | |
| Ni et al. | Minimizing the expected complete influence time of a social network | |
| Deng et al. | The choice-decision based on memory and payoff favors cooperation in stag hunt game on interdependent networks | |
| Wang et al. | EnMatch: Matchmaking for better player engagement via neural combinatorial optimization | |
| CN117076900B (en) | Data processing method, device, equipment and storage medium | |
| CN108833173B (en) | A deep network representation method with rich structural information | |
| Li et al. | Hidden phase space reconstruction: A novel chaotic time series prediction method for speech signals | |
| Macedo et al. | An evolutionary spatial game-based approach for the self-regulation of social exchanges in MAS | |
| CN116650945A (en) | Method, device and storage medium for predicting server merging result | |
| Zhang et al. | AdaDiff: Adaptive Step Selection for Fast Diffusion Models | |
| Yin et al. | FedSCS: Client selection for federated learning under system heterogeneity and client fairness with a Stackelberg game approach |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |