CN112235264A - Network traffic identification method and device based on deep migration learning - Google Patents
- Publication number
- CN112235264A CN202011042795.4A
- Authority
- CN
- China
- Prior art keywords
- network traffic
- protocol type
- sample
- target
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2483—Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/14—Session management
- H04L67/141—Setup of application sessions
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The embodiment of the invention provides a network traffic identification method and device based on deep migration learning, relates to the technical field of network security, and can identify novel network traffic. The technical scheme of the embodiment of the invention comprises the following steps: extracting message information and communication behavior information of a preset number of data packets from the network traffic to be identified; calculating the distance between the message information and communication behavior information of the network traffic to be identified and the clustering center of each class cluster, wherein each class cluster comprises the message information and communication behavior information of network traffic of one category; when the shortest of the calculated distances is smaller than a preset distance, obtaining the target category of the class cluster corresponding to the shortest distance; and inputting the message two-dimensional data matrix corresponding to the message information and the behavior two-dimensional data matrix corresponding to the communication behavior information into the network traffic identification model of the target category to determine whether the network traffic to be identified is malicious traffic.
Description
Technical Field
The invention relates to the technical field of network security, in particular to a network traffic identification method and device based on deep migration learning.
Background
With the rapid development of fifth-generation mobile communication (5G), the Internet of Things, the industrial Internet and other novel network technologies, and with increasingly diverse application scenarios, network terminals are taking more varied forms and their number is growing exponentially. Once network attacks such as remote control, information theft and denial of service initiated by malicious devices successfully intrude into a network, they pose a significant threat to the information security of network terminal users, so the network security risk faced by network terminals is increasingly prominent.
At present, most network attacks rely on network communication to achieve their malicious purpose. If the protocol type of the network traffic generated by attack behavior can be accurately identified, and whether the traffic is a network attack can be judged according to the protocol type, the attacked target systems and devices can be determined and effective countermeasures can be implemented.
However, existing network monitoring and analysis means such as port identification and deep packet inspection all need samples with classification labels in advance to train a classification network. In a novel network application scenario, novel network traffic of an unknown protocol type lacks samples with classification labels, so it is impossible to detect whether such traffic is malicious.
Disclosure of Invention
The embodiment of the invention aims to provide a network traffic identification method and device based on deep migration learning, so as to solve the problem that novel network traffic cannot be identified. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a network traffic identification method based on deep migration learning, where the method includes:
extracting message information and communication behavior information of a preset number of data packets from network traffic to be identified, wherein the network traffic to be identified comprises network traffic generated in a session establishment stage and network traffic transmitted based on the established session;
calculating the distance between the message information and communication behavior information of the network traffic to be identified and the clustering center of each class cluster, wherein each class cluster comprises the message information and communication behavior information of network traffic of one category;
when the shortest of the calculated distances is smaller than a preset distance, obtaining the target category of the class cluster corresponding to the shortest distance;
inputting a message two-dimensional data matrix corresponding to the message information and a behavior two-dimensional data matrix corresponding to the communication behavior information into the network traffic identification model of the target category, and determining whether the network traffic to be identified is malicious traffic;
the network traffic identification model of the target category is a model constructed by a deep migration learning method on the basis of a pre-training model corresponding to a target protocol type matched with the target category; the pre-training model corresponding to the target protocol type is a model obtained by training a deep learning model with a training sample set corresponding to the target protocol type; the training sample set corresponding to the target protocol type comprises: the sample two-dimensional data matrices of the sample network traffic of the target protocol type and the normal or malicious label corresponding to the sample network traffic of the target protocol type, where the sample two-dimensional data matrices of each sample network traffic include a sample message two-dimensional data matrix and a sample behavior two-dimensional data matrix constructed respectively from the message information and the communication behavior information of the preset number of data packets in the sample network traffic.
Optionally, before the message information and the communication behavior information of a preset number of data packets are extracted from the network traffic to be identified, the method further includes:
obtaining sample information sets of known protocol types, each sample information set of a known protocol type comprising: message information and communication behavior information of a preset number of data packets in sample network traffic of the known protocol type;
dividing the pre-collected network traffic of undetermined protocol types by taking a session as a unit to obtain a plurality of unidentified network traffic;
extracting message information and communication behavior information of a preset number of data packets from each unidentified network traffic;
clustering the message information and the communication behavior information of the plurality of unidentified network traffic to obtain a class cluster for each category;
for each category, calculating the maximum mean discrepancy (MMD) between the class cluster of the category and the sample information set of each known protocol type, determining the known protocol type whose sample information set has the smallest MMD with the class cluster of the category, and taking the determined known protocol type as the protocol type matched with the category;
and constructing a network traffic identification model of the category by a deep migration learning method on the basis of a pre-training model corresponding to the protocol type matched with the category.
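The protocol-matching step can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a Gaussian (RBF) kernel with a hypothetical bandwidth `gamma`, uses the biased empirical estimate of the squared MMD, and runs on made-up feature vectors with hypothetical protocol names.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an assumed bandwidth parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(cluster, samples, gamma=0.5):
    """Biased empirical estimate of the squared MMD between two sets of vectors."""
    return (mean_kernel(cluster, cluster, gamma)
            + mean_kernel(samples, samples, gamma)
            - 2.0 * mean_kernel(cluster, samples, gamma))

def match_protocol(cluster, known_sets, gamma=0.5):
    """Return the known protocol type whose sample set has the smallest MMD to the cluster."""
    return min(known_sets, key=lambda proto: mmd_squared(cluster, known_sets[proto], gamma))

# Toy feature vectors (hypothetical values, not real traffic).
cluster = [[0.9, 1.1], [1.0, 1.0], [1.1, 0.9]]
known_sets = {
    "protocol_a": [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9]],
    "protocol_b": [[5.0, 5.1], [4.9, 5.0], [5.2, 4.8]],
}
matched = match_protocol(cluster, known_sets)
print(matched)
```

The kernel form replaces the explicit mapping into the RKHS with pairwise kernel evaluations, which is the usual way to compute the MMD in practice.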
Optionally, the network traffic identification model of the target class is constructed through the following steps:
step one, inputting the sample two-dimensional data matrices of the sample network traffic of the target protocol type into the pre-training model corresponding to the target protocol type;
step two, obtaining the output result of the pre-training model corresponding to the target protocol type;
step three, calculating a loss value according to the output result, the normal or malicious label corresponding to the sample network traffic of the target protocol type, and the MMD between the sample information set of the target protocol type and the class cluster of the target category;
step four, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has converged, determining that the network traffic identification model of the target category is the pre-training model corresponding to the target protocol type;
and step five, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has not converged, adjusting the model parameters of the fully connected layer of the pre-training model corresponding to the target protocol type based on the loss value, and returning to step one.
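Steps one to five can be sketched as a small fine-tuning loop. The sketch below makes several assumptions that the text does not state: the convolution layers are frozen and their outputs are given as plain feature vectors, the fully connected layer is a single logistic unit, the classification loss is binary cross-entropy, the MMD term is a precomputed constant weighted by a hypothetical factor `lam`, and convergence means the loss changes by less than `tol` between iterations. In this simplified form the MMD term shifts the loss value used for the convergence check but does not contribute to the gradient.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune_fc(features, labels, weights, bias, mmd_value,
                 lam=1.0, lr=0.5, tol=1e-6, max_iters=5000):
    """Adjust only the fully connected layer until the loss stops changing.

    features:  outputs of the frozen convolution layers, one vector per sample
    labels:    1 for malicious, 0 for normal
    mmd_value: precomputed MMD between the sample set and the class cluster
    """
    n = len(features)
    eps = 1e-12
    prev_loss = None
    loss = None
    for _ in range(max_iters):
        # Steps one and two: forward pass and model output.
        probs = [sigmoid(sum(w * x for w, x in zip(weights, f)) + bias) for f in features]
        # Step three: classification loss plus the weighted MMD term.
        bce = -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                   for p, y in zip(probs, labels)) / n
        loss = bce + lam * mmd_value
        # Step four: stop once the loss has converged.
        if prev_loss is not None and abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        # Step five: gradient step on the fully connected layer only, then repeat.
        errors = [p - y for p, y in zip(probs, labels)]
        weights = [w - lr * sum(e * f[j] for e, f in zip(errors, features)) / n
                   for j, w in enumerate(weights)]
        bias -= lr * sum(errors) / n
    return weights, bias, loss

# Toy frozen features and labels (hypothetical values).
feats = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]
w, b, final_loss = fine_tune_fc(feats, labels, [0.0, 0.0], 0.0, mmd_value=0.1)
print(w, b, final_loss)
```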
Optionally, the MMD between a class cluster of a class and a sample information set of a known protocol type is calculated by the following formula:
wherein $D_{ti}$ is the class cluster of category $i$, $D_{sk}$ is the sample information set of known protocol type $k$, $n_{ti}$ is the number of unidentified network traffic corresponding to $D_{ti}$, $n_{sk}$ is the number of sample network traffic corresponding to $D_{sk}$, and $\|\cdot\|_H$ denotes the distance computed after $\Phi(\cdot)$ maps the data into the reproducing kernel Hilbert space (RKHS).
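These definitions are consistent with the standard empirical form of the MMD, the RKHS distance between the mean embeddings of the two sets. A sketch in that standard form is given below; the per-sample notation $x_j^{ti}$ and $x_j^{sk}$ is assumed here for the elements of the class cluster and of the sample information set, and whether the norm is squared is also an assumption.

```latex
\mathrm{MMD}(D_{ti}, D_{sk}) =
\left\| \frac{1}{n_{ti}} \sum_{j=1}^{n_{ti}} \Phi\!\left(x_j^{ti}\right)
      - \frac{1}{n_{sk}} \sum_{j=1}^{n_{sk}} \Phi\!\left(x_j^{sk}\right) \right\|_{H}
```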
Optionally, after the network traffic identification model of the category is constructed by a deep migration learning method on the basis of the pre-training model corresponding to the protocol type matched with the category, the method further includes:
for each unidentified network traffic of the category, constructing a message two-dimensional data matrix according to the message information of the unidentified network traffic, and constructing a behavior two-dimensional data matrix according to the communication behavior information of the unidentified network traffic;
and inputting the constructed message two-dimensional data matrix and behavior two-dimensional data matrix into the network traffic identification model of the category, and determining whether the unidentified network traffic is malicious traffic.
Optionally, the network traffic identification model of the target category includes: a first convolution layer, a second convolution layer, a fully connected layer and an output layer; the network traffic identification model of the target category identifies whether the network traffic to be identified is malicious traffic through the following steps:
the first convolution layer convolves the message two-dimensional data matrix with a two-dimensional convolution kernel to obtain a first feature map;
the second convolution layer convolves the behavior two-dimensional data matrix with a two-dimensional convolution kernel to obtain a second feature map;
the fully connected layer integrates the first feature map and the second feature map to obtain a third feature map;
and the output layer processes the third feature map with a preset classification algorithm to obtain and output whether the network traffic to be identified is malicious traffic.
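The two-branch structure can be illustrated with a small forward pass in plain Python. This is only a sketch under stated assumptions: one kernel per branch, "valid" convolution without padding, ReLU activations, and a sigmoid standing in for the unspecified preset classification algorithm. The parameter names in `params` and all numeric values are hypothetical.

```python
import math

def conv2d(matrix, kernel):
    """Valid 2D convolution (cross-correlation, as in most deep learning libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [[sum(matrix[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu_flatten(feature_map):
    """Apply ReLU and flatten a feature map into a vector."""
    return [max(0.0, v) for row in feature_map for v in row]

def identify(message_matrix, behavior_matrix, params):
    """Forward pass: two convolution branches, one fully connected layer, sigmoid output."""
    first_map = conv2d(message_matrix, params["message_kernel"])
    second_map = conv2d(behavior_matrix, params["behavior_kernel"])
    merged = relu_flatten(first_map) + relu_flatten(second_map)
    # The fully connected layer integrates both feature maps into a third feature vector.
    third = [max(0.0, sum(w * x for w, x in zip(row, merged)) + b)
             for row, b in zip(params["fc_weights"], params["fc_bias"])]
    # Output layer: a sigmoid is assumed in place of the preset classification algorithm.
    score = sum(w * x for w, x in zip(params["out_weights"], third)) + params["out_bias"]
    probability = 1.0 / (1.0 + math.exp(-score))
    return probability >= 0.5, probability

# Hypothetical inputs and parameters, chosen only to make the example easy to follow.
message_matrix = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
behavior_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
params = {
    "message_kernel": [[1, 0], [0, 1]],
    "behavior_kernel": [[0, 1], [1, 0]],
    "fc_weights": [[0.1] * 8, [-0.1] * 8],
    "fc_bias": [0.0, 0.0],
    "out_weights": [1.0, 1.0],
    "out_bias": -2.0,
}
is_malicious, probability = identify(message_matrix, behavior_matrix, params)
print(is_malicious, round(probability, 4))
```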
Optionally, after obtaining the target category of the class cluster corresponding to the shortest distance, the method further includes:
determining the protocol type of the network traffic to be identified as the target protocol type matched with the target category;
if the target protocol type is a protocol type in a preset whitelist, determining that the network traffic to be identified is trusted network traffic, wherein the preset whitelist comprises the protocol types of trusted network traffic;
and if the target protocol type is a protocol type in a preset blacklist, determining that the network traffic to be identified is untrusted network traffic, wherein the preset blacklist comprises the protocol type of the untrusted network traffic.
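The whitelist and blacklist check amounts to two set lookups. In the sketch below the protocol names are hypothetical, and the "undetermined" result for a protocol type that appears in neither list is an assumption, since the text does not describe that case.

```python
def trust_decision(target_protocol, whitelist, blacklist):
    """Classify traffic as trusted or untrusted by its matched protocol type."""
    if target_protocol in whitelist:
        return "trusted"
    if target_protocol in blacklist:
        return "untrusted"
    # Assumed fallback: the text does not cover protocol types in neither list.
    return "undetermined"

# Hypothetical preset lists.
whitelist = {"protocol_a", "protocol_b"}
blacklist = {"protocol_x"}
print(trust_decision("protocol_a", whitelist, blacklist))
print(trust_decision("protocol_x", whitelist, blacklist))
```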
In a second aspect, an embodiment of the present invention provides a network traffic identification device based on deep migration learning, where the device includes:
the data acquisition module is used for extracting message information and communication behavior information of a preset number of data packets from network traffic to be identified, wherein the network traffic to be identified comprises network traffic generated in a session establishment stage and network traffic transmitted based on the established session;
the distance calculation module is used for calculating the distance between the message information and communication behavior information of the network traffic to be identified and the clustering center of each class cluster, wherein each class cluster comprises the message information and communication behavior information of network traffic of one category;
the classification module is used for obtaining the target class of the class cluster corresponding to the shortest distance when the shortest distance in the calculated distances is smaller than a preset distance;
the traffic identification module is used for inputting the message two-dimensional data matrix corresponding to the message information and the behavior two-dimensional data matrix corresponding to the communication behavior information into the network traffic identification model of the target category, and determining whether the network traffic to be identified is malicious traffic;
the network traffic identification model of the target category is a model constructed by a deep migration learning method on the basis of a pre-training model corresponding to a target protocol type matched with the target category; the pre-training model corresponding to the target protocol type is a model obtained by training a deep learning model with a training sample set corresponding to the target protocol type; the training sample set corresponding to the target protocol type comprises: the sample two-dimensional data matrices of the sample network traffic of the target protocol type and the normal or malicious label corresponding to the sample network traffic of the target protocol type, where the sample two-dimensional data matrices of each sample network traffic include a sample message two-dimensional data matrix and a sample behavior two-dimensional data matrix constructed respectively from the message information and the communication behavior information of the preset number of data packets in the sample network traffic.
Optionally, the apparatus further comprises: a dividing module, an unknown protocol clustering module, a protocol type matching module and an identification model building module;
the data acquisition module is further configured to obtain sample information sets of known protocol types before extracting message information and communication behavior information of a preset number of data packets from the network traffic to be identified, where each sample information set of a known protocol type includes: message information and communication behavior information of a preset number of data packets in sample network traffic of the known protocol type;
the dividing module is used for dividing the pre-collected network traffic of undetermined protocol type by taking a session as a unit to obtain a plurality of unidentified network traffic;
the data acquisition module is further used for extracting message information and communication behavior information of a preset number of data packets from each unidentified network traffic;
the unknown protocol clustering module is used for clustering the message information and the communication behavior information of the plurality of unidentified network traffic to obtain a class cluster for each category;
the protocol type matching module is used for, for each category, calculating the maximum mean discrepancy (MMD) between the class cluster of the category and the sample information set of each known protocol type, determining the known protocol type whose sample information set has the smallest MMD with the class cluster of the category, and taking the determined known protocol type as the protocol type matched with the category;
the identification model construction module is used for constructing the network traffic identification model of the category by a deep migration learning method on the basis of a pre-training model corresponding to the protocol type matched with the category.
Optionally, the identification model building module is specifically configured to:
step one, inputting the sample two-dimensional data matrices of the sample network traffic of the target protocol type into the pre-training model corresponding to the target protocol type;
step two, obtaining the output result of the pre-training model corresponding to the target protocol type;
step three, calculating a loss value according to the output result, the normal or malicious label corresponding to the sample network traffic of the target protocol type, and the MMD between the sample information set of the target protocol type and the class cluster of the target category;
step four, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has converged, determining that the network traffic identification model of the target category is the pre-training model corresponding to the target protocol type;
and step five, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has not converged, adjusting the model parameters of the fully connected layer of the pre-training model corresponding to the target protocol type based on the loss value, and returning to step one.
Optionally, the MMD between a class cluster of a class and a sample information set of a known protocol type is calculated by the following formula:
wherein $D_{ti}$ is the class cluster of category $i$, $D_{sk}$ is the sample information set of known protocol type $k$, $n_{ti}$ is the number of unidentified network traffic corresponding to $D_{ti}$, $n_{sk}$ is the number of sample network traffic corresponding to $D_{sk}$, and $\|\cdot\|_H$ denotes the distance computed after $\Phi(\cdot)$ maps the data into the reproducing kernel Hilbert space (RKHS).
Optionally, the apparatus further comprises: a data matrix construction module;
the data matrix construction module is used for, after the network traffic identification model of the category is constructed by a deep migration learning method on the basis of the pre-training model corresponding to the protocol type matched with the category, constructing, for each unidentified network traffic of the category, a message two-dimensional data matrix according to the message information of the unidentified network traffic and a behavior two-dimensional data matrix according to the communication behavior information of the unidentified network traffic;
the traffic identification module is further configured to input the constructed message two-dimensional data matrix and behavior two-dimensional data matrix into the network traffic identification model of the category, and determine whether the unidentified network traffic is malicious traffic.
Optionally, the network traffic identification model of the target category includes: a first convolution layer, a second convolution layer, a fully connected layer and an output layer; the traffic identification module is specifically configured to identify the network traffic through the following steps:
the first convolution layer convolves the message two-dimensional data matrix with a two-dimensional convolution kernel to obtain a first feature map;
the second convolution layer convolves the behavior two-dimensional data matrix with a two-dimensional convolution kernel to obtain a second feature map;
the fully connected layer integrates the first feature map and the second feature map to obtain a third feature map;
and the output layer processes the third feature map with a preset classification algorithm to obtain and output whether the network traffic to be identified is malicious traffic.
Optionally, the apparatus further comprises a traffic determination module, which is used for:
after the target category of the class cluster corresponding to the shortest distance is obtained, determining the protocol type of the network traffic to be identified as the target protocol type matched with the target category;
if the target protocol type is a protocol type in a preset whitelist, determining that the network traffic to be identified is trusted network traffic, wherein the preset whitelist comprises the protocol types of trusted network traffic;
and if the target protocol type is a protocol type in a preset blacklist, determining that the network traffic to be identified is untrusted network traffic, wherein the preset blacklist comprises the protocol type of the untrusted network traffic.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of any network traffic identification method based on deep migration learning when executing the program stored in the memory.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of any of the network traffic identification methods based on deep migration learning described above.
In a fifth aspect, an embodiment of the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any of the above-mentioned network traffic identification methods based on deep migration learning.
The technical scheme of the embodiment of the invention can bring at least the following beneficial effects: because the network traffic identification model can automatically extract the features of the message two-dimensional data matrix and the behavior two-dimensional data matrix, traffic features do not need to be manually designed and extracted, which improves the efficiency of identifying the network traffic protocol type. The network traffic to be identified, whose protocol type is unknown, is assigned to the target category at the shortest distance from it, and is then identified by the network traffic identification model of that target category. The network traffic categories and sample distribution in the novel application scenario are obtained by an unsupervised clustering method, which can effectively overcome the bottleneck that a novel network lacks labeled samples of unknown network protocols and therefore cannot be learned with supervision; by comparing the distribution differences after network traffic clustering, the pre-training model of the known network protocol that matches each class cluster is found for transfer learning, so that the identification accuracy for network traffic of unknown protocols can be improved.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings without creative effort.
Fig. 1 is a flowchart of a network traffic identification method based on deep migration learning according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a network traffic identification model according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of another network traffic identification model according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of another network traffic identification model according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of another network traffic identification model according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a network traffic identification apparatus based on deep migration learning according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of another network traffic identification apparatus based on deep migration learning according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To identify novel network traffic of an unknown protocol type, the embodiment of the invention provides a network traffic identification method based on deep learning. The method can be applied to an electronic device with data processing capability, such as a mobile phone, a computer, or a tablet computer. As shown in fig. 1, the method includes the following steps.
Step 101, extracting message information and communication behavior information of a preset number of data packets from the network traffic to be identified.
In one embodiment, a preset number of data packets in the network traffic to be identified may be collected by a probe deployed in bypass mode at a preset network node, and then the message information and communication behavior information of each collected data packet may be obtained.
Optionally, the data acquisition probe may collect the network traffic in bypass mode by optical splitting or traffic diversion, divide the network traffic by session, obtain the message information of the data packets in the network traffic corresponding to each session, and monitor the communication behavior information of those data packets. The probe then stores the obtained message information in a database as pcap files, where one pcap file corresponds to the set of data packets of one session, and stores the obtained communication behavior information in the database as logs. The probe can collect network traffic without affecting network transmission or service applications, and it also has network-wide data acquisition capability.
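Dividing traffic by session can be sketched as grouping packets under a direction-independent key. The text does not say how a session is delimited; the sketch below assumes the common choice of the transport protocol plus the two endpoints (a bidirectional 5-tuple), and the packet dictionaries with their field names are hypothetical.

```python
from collections import defaultdict

def session_key(pkt):
    """Direction-independent key: both directions of a connection share one session."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return (pkt["proto"],) + tuple(sorted([a, b]))

def split_by_session(packets):
    """Group packets into sessions, preserving capture order inside each session."""
    sessions = defaultdict(list)
    for pkt in packets:
        sessions[session_key(pkt)].append(pkt)
    return dict(sessions)

# Hypothetical captured packets.
packets = [
    {"proto": "tcp", "src_ip": "10.0.0.1", "src_port": 50000,
     "dst_ip": "10.0.0.2", "dst_port": 443, "payload": b"\x16\x03\x01"},
    {"proto": "tcp", "src_ip": "10.0.0.2", "src_port": 443,
     "dst_ip": "10.0.0.1", "dst_port": 50000, "payload": b"\x16\x03\x03"},
    {"proto": "udp", "src_ip": "10.0.0.3", "src_port": 5353,
     "dst_ip": "10.0.0.4", "dst_port": 5353, "payload": b"\x00\x01"},
]
sessions = split_by_session(packets)
print(len(sessions))
```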
When the electronic device acquires the information, it can read a pcap file from the database, extract the payloads of a preset number of data packets to obtain the message information, and read the log corresponding to the session to which the network traffic to be identified belongs to obtain the communication behavior information. The message information is part of the payload of a data packet.
Specifically, the communication of a network session may be divided into two phases, where the first phase is a session establishment phase and the second phase is a data transmission phase. Optionally, the network session in the embodiment of the present invention may be an encrypted network session or a plaintext network session.
The communication of an encrypted network session can be divided into two phases: the first phase is a plaintext communication phase for establishing the connection, which may be called the session establishment phase and includes handshaking, authentication and key exchange, and a session key is generated in this phase; in the second phase, the transmitted data is encrypted with the key generated in the first phase.
Therefore, the network traffic in the embodiment of the present invention includes the network traffic generated in the session establishment stage and the network traffic transmitted based on the established session.
Illustratively, the preset number may be 6. When the number of data packets of the network traffic is smaller than the preset number, data packets whose values are all 0 can be appended until the number of data packets after padding equals the preset number.
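A minimal sketch of this padding step is given below. The function name `pad_packets` and the choice to represent an all-zero data packet as a zero-filled byte string of length `payload_len` are assumptions for illustration; sessions longer than the preset number are simply truncated.

```python
def pad_packets(packets, preset_number=6, payload_len=294):
    """Return exactly preset_number payloads, zero-padding short sessions."""
    kept = list(packets[:preset_number])
    # bytes(n) creates n zero bytes; this stands in for an all-zero packet.
    zero_packet = bytes(payload_len)
    kept.extend(zero_packet for _ in range(preset_number - len(kept)))
    return kept
```

For example, a session with two captured packets would yield those two payloads followed by four zero-filled placeholders.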
Step 102, calculating the distance between the message information and communication behavior information of the network traffic to be identified and the cluster center of each class cluster.
Each class cluster comprises the message information and communication behavior information of a preset number of data packets in one category of network traffic.
For example, the Euclidean distances between the message information and communication behavior information of the network traffic to be identified and the cluster centers of the class clusters may be calculated.
Step 103, when the shortest of the calculated distances is smaller than a preset distance, obtaining the target category of the class cluster corresponding to the shortest distance.
Optionally, when the shortest distance in the calculated distances is not less than the preset distance, it is determined that the network traffic to be identified is unknown network traffic.
It can be understood that when the shortest distance is less than the preset distance, the network traffic to be identified is similar to the network traffic of that category, and it may be determined that the network traffic to be identified belongs to that category. When the shortest distance is not less than the preset distance, the network traffic to be identified is not similar to the network traffic of any category.
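Steps 102 and 103 can be sketched as follows, assuming that the message information and communication behavior information have already been flattened into a numeric feature vector. The function name `nearest_cluster` and the dictionary layout of `cluster_centers` are hypothetical.

```python
import math


def nearest_cluster(feature_vector, cluster_centers, preset_distance):
    """Return the closest category, or None when the traffic is unknown."""
    best_category, best_distance = None, math.inf
    for category, center in cluster_centers.items():
        distance = math.dist(feature_vector, center)  # Euclidean distance
        if distance < best_distance:
            best_category, best_distance = category, distance
    # Accept the category only if the shortest distance is strictly
    # smaller than the preset distance, as described in step 103.
    if best_distance < preset_distance:
        return best_category
    return None
```

Returning `None` corresponds to the optional branch in which the network traffic to be identified is treated as unknown network traffic.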
Step 104, inputting the message two-dimensional data matrix corresponding to the message information and the behavior two-dimensional data matrix corresponding to the communication behavior information into the network traffic identification model of the target category, and determining whether the network traffic to be identified is malicious traffic.
The network traffic identification model of the target category is a model constructed by a deep migration learning method on the basis of a pre-training model corresponding to a target protocol type matched with the target category. The pre-training model corresponding to the target protocol type is a model obtained by training a deep learning model with a training sample set corresponding to the target protocol type. The training sample set corresponding to the target protocol type comprises: a sample two-dimensional data matrix of sample network traffic of the target protocol type, and a normal or malicious label corresponding to that sample network traffic. The sample two-dimensional data matrix of each sample network traffic comprises: a sample message two-dimensional data matrix and a sample behavior two-dimensional data matrix, constructed respectively from the message information and the communication behavior information of the preset number of data packets in the sample network traffic.
Optionally, the deep learning model may include: a convolutional neural network (CNN), a recurrent neural network (RNN), a long short-term memory network (LSTM), and the like.
In one embodiment, after the message two-dimensional data matrix and the behavior two-dimensional data matrix are input into the network traffic identification model of the target category, the output result of the network traffic identification model is obtained. Optionally, the output result may be 0 or 1, where 0 represents that the network traffic to be identified is malicious traffic, and 1 represents that the network traffic to be identified is normal traffic.
Optionally, before the network traffic identification model is input, the two-dimensional message data matrix and the two-dimensional behavior data matrix may be preprocessed respectively, and then the preprocessed two-dimensional message data matrix and the preprocessed two-dimensional behavior data matrix are input into the network traffic identification model. For example, the pre-processing may be a normalization of the two-dimensional data matrix.
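The text does not specify which normalization is used. As one possible illustration, a min-max scaling of a two-dimensional data matrix into the range [0, 1] is sketched below; the function name `min_max_normalize` and the handling of a constant matrix are assumptions.

```python
def min_max_normalize(matrix):
    """Scale all entries of a 2-D list into [0, 1] using min-max scaling."""
    values = [value for row in matrix for value in row]
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A constant matrix carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in matrix]
    return [[(value - lowest) / span for value in row] for row in matrix]
```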
The technical solution of the embodiment of the invention can bring at least the following beneficial effects. Because the message two-dimensional data matrix and the behavior two-dimensional data matrix input into the network traffic identification model can be extracted automatically, without manual design and extraction of traffic features, the efficiency of network traffic identification is improved. The network traffic to be identified, whose protocol type is unknown, is classified to obtain the target category at the shortest distance from it, and is then identified by the network traffic identification model of that target category. The network traffic categories and sample distribution in a new application scenario are obtained by an unsupervised clustering method, which can effectively overcome the bottleneck that a new network lacks labeled samples of unknown network protocols and therefore cannot be learned in a supervised manner. By comparing the distribution differences after the network traffic is clustered, the pre-training model of the known network protocol that matches each class cluster is found and used for transfer learning, so that the identification accuracy for network traffic of unknown protocols can be improved.
In the embodiment of the present invention, before step 104, a message two-dimensional data matrix and a behavior two-dimensional data matrix of the network traffic to be identified may also be constructed.
In one embodiment, for each of the preset number of data packets, message information of a first preset length may be extracted from the data packet; the extracted message information of the data packets is then assembled, in the order in which the preset number of data packets are arranged, into the message two-dimensional data matrix.
Optionally, the manner of constructing an m1*m1 message two-dimensional data matrix from r1 data packets includes: for each data packet, extracting, in order, the first m1*k bytes of the message information of the data packet, where k < m1 and m1 = k*r1. If the message information of the data packet is shorter than m1*k bytes, the message information is zero-padded. The m1*k bytes corresponding to the first data packet fill columns 1 to k of the two-dimensional matrix, the m1*k bytes corresponding to the second data packet fill columns k+1 to 2k, and so on, until the m1*k bytes corresponding to the r1-th data packet fill columns m1-k+1 to m1.
In the embodiment of the present invention, m1 can be set according to actual needs, e.g. m1 = 42. With r1 = 6, k = 7 and the first preset length is 42*7 = 294 bytes, and the 42 × 42 message two-dimensional data matrix may be constructed as follows: the first 294 bytes of the message information of each data packet are extracted, and if the message information of a data packet is shorter than 294 bytes, it is zero-padded. The 294 bytes corresponding to the first data packet fill columns 1 to 7 of the two-dimensional matrix, the 294 bytes corresponding to the second data packet fill columns 8 to 14, and so on, until the 294 bytes corresponding to the sixth data packet fill columns 36 to 42.
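The column-filling procedure for the 42 × 42 example can be sketched as below. The text does not say in which order the 294 bytes of a data packet are laid out inside its 7 columns; the column-by-column order used here (each run of 42 consecutive bytes fills one column) and the function name `build_message_matrix` are assumptions.

```python
def build_message_matrix(payloads, m1=42, k=7):
    """Build an m1 x m1 matrix; packet p fills columns p*k .. p*k + k - 1."""
    r1 = m1 // k
    if m1 != k * r1 or len(payloads) != r1:
        raise ValueError("expected m1 = k * r1 and exactly r1 payloads")
    chunk = m1 * k  # 294 bytes per packet in the 42 x 42 example
    matrix = [[0] * m1 for _ in range(m1)]
    for p, payload in enumerate(payloads):
        # Truncate to the first chunk bytes, then zero-pad short payloads.
        data = payload[:chunk].ljust(chunk, b"\x00")
        for offset, byte in enumerate(data):
            column = p * k + offset // m1  # assumed column-by-column order
            row = offset % m1
            matrix[row][column] = byte
    return matrix
```

With six payloads, the first data packet occupies the first 7 columns and the sixth data packet occupies the last 7 columns, matching the column ranges given above.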
In one embodiment, for each data packet of a preset number, specific information in the communication behavior information of the data packet may be extracted; and then according to the arrangement sequence of the preset number of data packets, forming a behavior two-dimensional data matrix by specified information in the communication behavior information of the data packets.
Optionally, the specified information may be: statistical information, the data packet length, the timestamp difference between adjacent data packets, and data packet sequence information, where the statistical information may include the session communication port, the total number of data packets in the session, the direction of the data packets, the session communication duration, and the like; the sequence information may be a sequence number. The m2*m2 behavior two-dimensional data matrix is constructed as follows: the first i columns of the matrix correspond to the lengths of the r2 data packets; columns i+1 to i+j correspond to the timestamp differences between adjacent data packets among the r2 data packets; columns i+j+1 to i+j+l correspond to the sequence information of the r2 data packets; and columns i+j+l+1 to n correspond to the statistical information of the r2 data packets, where i, j, l and i+j+l are all less than n.
In the embodiment of the present invention, m2 can be set according to actual needs, e.g. m2 = 32. With r2 = 12, the 32 × 32 behavior two-dimensional data matrix is: columns 1 to 6 of the matrix correspond to the lengths of the first 6 data packets; columns 8 to 14 correspond to the timestamp differences between adjacent data packets among the first 6 data packets; columns 15 to 21 correspond to the sequence information of the first 6 data packets; and columns 22 to 32 correspond to the statistical information of the first 6 data packets.
Because the information of the input model in the embodiment of the invention is the message two-dimensional data matrix and the behavior two-dimensional data matrix, the message information and the communication behavior information of the network flow can be simultaneously embodied, and the method is more suitable for the structural form of the network flow data.
In the embodiment of the present invention, the message information includes: the original messages of a preset number of data packets in the network traffic to be identified. The communication behavior information includes at least one of the following: statistical information of the preset number of data packets in the network traffic to be identified, data packet sequence information, data packet lengths, data packet timestamps, and timestamp differences between adjacent data packets.
The timestamp of a data packet may be the sending time of the data packet, and the timestamp difference between adjacent data packets may be: starting from the second data packet in the session, the difference between the sending time of each data packet and the sending time of the preceding data packet.
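A short sketch of the adjacent timestamp difference is given below; the function name `timestamp_differences` is hypothetical, and the timestamps are assumed to be numeric sending times listed in packet order.

```python
def timestamp_differences(timestamps):
    """Return, from the second packet on, each packet's gap to its predecessor."""
    return [
        later - earlier
        for earlier, later in zip(timestamps, timestamps[1:])
    ]
```

A session with n data packets therefore yields n - 1 difference values, and a single-packet session yields none.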
The message information of a data packet may be understood as information reflecting the content of the data packet, that is, information reflecting the static characteristics of the data packet in the communication process. For example, the message information may include field values of the data packet, header information of the data packet, and the like.
The communication behavior information of the data packet may be understood as attribute information of the data packet related to the communication process, that is, information reflecting the dynamic characteristics of the data packet during the communication process.
The technical scheme of the embodiment of the invention can also bring the following beneficial effects: because the message information and the communication behavior information of the data packet in the network flow can fully reflect the protocol type of the network flow, the message information and the communication behavior information of the data packet in the network flow to be identified can be obtained, and the network flow to be identified can be identified more accurately.
In the embodiment of the present invention, before the step 101, a network traffic identification model of each category may be further constructed, and specifically, the following steps may be included.
Step 1, obtaining a sample information set of each known protocol type. The sample information set of each known protocol type comprises: the message information and the communication behavior information of a preset number of data packets in the sample network traffic of the known protocol type.
Optionally, the sample information set of each protocol type may further include: the message two-dimensional data matrix and the behavior two-dimensional data matrix of the sample network traffic of the protocol type.
For example, the source domain contains sample network traffic of N known protocol types, denoted P_s1, ..., P_sN. The sample network traffic is divided by protocol type into N sample information sets: D_s1, D_s2, ..., D_sN. Each sample information set contains the message information and communication behavior information of both normal network traffic and malicious network traffic.
Optionally, for each known protocol type, the deep learning model may be trained with the training sample set of the known protocol type to obtain a pre-training model of the known protocol type, so that N independent models are formed. Each pre-training model can identify whether network traffic of its known protocol type is normal traffic or malicious traffic.
In an alternative implementation, N may be set to 15.
Step 2, dividing pre-collected network traffic of undetermined protocol types by session to obtain a plurality of unidentified network traffic.
Step 3, extracting the message information and the communication behavior information of the preset number of data packets from each unidentified network traffic.
Step 4, clustering the message information and the communication behavior information of the plurality of unidentified network traffic to obtain a class cluster of each category.
Optionally, the K-means clustering algorithm may be used for clustering, or other clustering algorithms may be used, which is not specifically limited in this embodiment of the present invention. For example, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm or the like may be used.
The clustering result is M class clusters: D_t1, D_t2, ..., D_tM. In an alternative implementation, M may be set to 7.
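For illustration, a plain K-means loop over flattened feature vectors is sketched below. Seeding the cluster centers with the first k points is an assumption made to keep the example deterministic; the function name `k_means` is likewise hypothetical.

```python
import math


def k_means(points, k, max_iterations=100):
    """Cluster points with Lloyd's algorithm; returns (labels, centers)."""
    # Assumed initialization: the first k points serve as starting centers.
    centers = [list(point) for point in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(point, centers[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are final
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centers
```

The returned centers are the cluster centers against which new network traffic is later compared.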
Step 5, for each category, calculating the Maximum Mean Discrepancy (MMD) between the class cluster of the category and the sample information set of each known protocol type, determining the known protocol type corresponding to the sample information set having the minimum MMD with the class cluster of the category, and taking the determined known protocol type as the protocol type matched with the category. The protocol type of the network traffic of the category is then determined to be the protocol type matched with the category.
In one embodiment, the MMD between a class cluster of a category and a sample information set of a known protocol type can be calculated by equation (1):

$$\mathrm{MMD}(D_{ti}, D_{sk}) = \left\| \frac{1}{n_{ti}} \sum_{x \in D_{ti}} \phi(x) - \frac{1}{n_{sk}} \sum_{x \in D_{sk}} \phi(x) \right\|_{H} \quad (1)$$

where D_ti is the class cluster of category i, D_sk is the sample information set of known protocol type k, n_ti is the number of unidentified network traffic corresponding to D_ti, n_sk is the number of sample network traffic corresponding to D_sk, and H indicates that the distance is measured after the data is mapped by φ(·) into a Reproducing Kernel Hilbert Space (RKHS).
Illustratively, n_ti may be 1000 and n_sk may be 700.
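The matching in step 5 can be sketched as follows. The text does not name the kernel behind the mapping φ(·); the Gaussian kernel, its `bandwidth` parameter, and the biased estimate of the squared MMD used here are assumptions, as are the function names.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased kernel estimate of the squared MMD between two sample sets."""
    def mean_kernel(first, second):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in first for v in second)
        return total / (len(first) * len(second))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)


def match_protocol(cluster, sample_sets, bandwidth=1.0):
    """Return the known protocol type whose sample set has the smallest MMD."""
    return min(
        sample_sets,
        key=lambda name: mmd_squared(cluster, sample_sets[name], bandwidth),
    )
```

Because squaring preserves the ordering of non-negative values, the protocol type selected by the squared estimate is the same one that minimizes the MMD itself.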
Step 6, constructing the network traffic identification model of the category by a deep migration learning method on the basis of the pre-training model corresponding to the protocol type matched with the category.
Step 7, for each unidentified network traffic of the category, constructing a message two-dimensional data matrix from the message information of the unidentified network traffic, and constructing a behavior two-dimensional data matrix from the communication behavior information of the unidentified network traffic.
Step 8, inputting the constructed message two-dimensional data matrix and behavior two-dimensional data matrix into the network traffic identification model of the category, and determining whether the unidentified network traffic is malicious traffic.
In the embodiment of the invention, the categories and sample distribution of network traffic of unknown network protocols are obtained by an unsupervised clustering method, which can effectively overcome the bottleneck that a new network lacks labeled samples of unknown network protocols and therefore cannot be learned in a supervised manner. By comparing sample distribution differences, the pre-training model of the known network protocol most similar to each unknown protocol is found and used for transfer learning, which can improve the accuracy of unknown protocol identification.
In the embodiment of the invention, the mode of obtaining the network traffic identification models of all categories by the transfer learning method is the same. The following describes a procedure of constructing a network traffic recognition model, taking the construction of a network traffic recognition model of a target class as an example.
Step one, inputting a sample two-dimensional data matrix of sample network flow of a target protocol type into a pre-training model corresponding to the target protocol type.
Step two, obtaining the output result of the pre-training model corresponding to the target protocol type.
Optionally, the output result of the pre-training model may be 1 or 0, where 1 indicates that the input network traffic is normal traffic and 0 indicates that the input network traffic is malicious traffic.
Step three, calculating a loss value according to the output result, the normal or malicious label corresponding to the sample network traffic of the target protocol type, and the MMD between the sample information set of the target protocol type and the class cluster of the target category.
In an alternative embodiment, in order to bring the data distributions of D_ti and D_sk closer, the loss function can be calculated based on the Deep Domain Confusion (DDC) method and minimized by a gradient descent method.
The loss function may be formula (2):

$$L = L_C(D_{sk}, \hat{y}_{sk}) + \lambda \, \mathrm{MMD}^2(D_{sk}, D_{ti}) \quad (2)$$

where L is the loss function, L_C represents the classification loss, D_sk represents the sample information set of known protocol type k, ŷ_sk represents the output result of the network traffic identification model for the sample information set of known protocol type k, λ is a preset hyper-parameter, and D_ti is the class cluster of category i. The MMD can be calculated by referring to equation (1).
In the embodiment of the invention, by adding an adaptation layer between the source domain and the target domain and adding a domain confusion loss, the model learns how to classify while reducing the distribution difference between the source domain and the target domain, thereby realizing domain adaptation. The value of the hyper-parameter λ determines the strength of the domain confusion.
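As a rough sketch of how the two loss terms combine, the example below adds a binary cross-entropy classification loss to a λ-weighted domain term. The choice of binary cross-entropy, the squaring of the MMD value (as in the DDC formulation), the default `lam=0.25`, and the function names are all assumptions for illustration; the MMD value is passed in as a number computed elsewhere.

```python
import math


def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean binary cross-entropy; labels use 1 = normal and 0 = malicious."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def ddc_style_loss(probabilities, labels, mmd_value, lam=0.25):
    """Classification loss plus lam times the squared MMD (assumed form)."""
    return binary_cross_entropy(probabilities, labels) + lam * mmd_value ** 2
```

A larger `lam` pushes training to shrink the distribution gap between the source domain and the target domain at the expense of the classification term, which is the trade-off the hyper-parameter λ controls.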
For example, as shown in fig. 2, the network traffic identification model includes a first convolutional layer 201, a second convolutional layer 202, a first fully-connected layer 203, a second fully-connected layer 204, a third fully-connected layer 205, a domain adaptation layer 206, a fourth fully-connected layer 207, and an output layer 208. Wherein, the first convolution layer 201 and the second convolution layer 202 both include 5 layers, and the first full connection layer 203 and the second full connection layer 204 both include 3 layers. The domain adaptation layer 206 is used to compute the MMD between the sample information set of the target protocol type and the class cluster of the target class.
Step four, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has converged, determining that the network traffic identification model of the target category is the pre-training model corresponding to the target protocol type.
Step five, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has not converged, adjusting the model parameters of the fully-connected layers of the pre-training model corresponding to the target protocol type based on the loss value, and returning to step one.
In one embodiment, if the difference between the loss value calculated this time and the loss value calculated last time is less than a preset difference, it is determined that the model is converged. And if the difference value between the loss value calculated this time and the loss value calculated last time is not less than the preset difference value, determining that the model is not converged.
In another embodiment, if the loss value calculated this time is smaller than a preset value, it is determined that the model converges. And if the loss value calculated this time is not less than the preset value, determining that the model is not converged.
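The two convergence criteria above can be expressed as small predicates. The function names are hypothetical, and taking the absolute value of the difference in the first criterion is an assumption, since the text only speaks of "the difference".

```python
def converged_by_delta(current_loss, previous_loss, preset_difference):
    """Converged when the loss changed by less than the preset difference."""
    return abs(current_loss - previous_loss) < preset_difference


def converged_by_value(current_loss, preset_value):
    """Converged when the loss itself is below the preset value."""
    return current_loss < preset_value
```

Either predicate could serve as the test in step four, with a false result leading to the parameter adjustment in step five.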
The embodiment of the invention is based on a deep migration learning method, migrates the message information and the communication behavior information of the known network protocol to construct the network flow identification model of the unknown network protocol, and can quickly and effectively realize the identification capability of the malicious flow of the unknown network protocol by utilizing the existing network protocol knowledge.
The migration network can be regarded as a dual-channel structure in which the two channels share weights. The target domain may be a category, and the source domain is the known protocol type matched with that category. Channel A takes source domain samples as input; fig. 2 may be regarded as channel A, in which a domain adaptation layer is added between the fully-connected layers to determine the data distribution difference between the source domain and the target domain, which may also be referred to as the domain adaptation loss.
It will be appreciated that, in order to adapt to two different domains, the distribution difference between the two domains needs to be evaluated; using the MMD algorithm, the difference between the probability distributions of the two domains can be estimated by embedding the means of the samples of the different domains into the RKHS.
Channel B takes network traffic of the target domain as input and, compared with channel A, has no domain adaptation layer. The weight of each of its network layers is the same as the corresponding weight in channel A.
Source domain data produces an output through channel A, and a classification loss is computed from this output and the labels; minimizing this loss ensures that the model is updated toward more accurate outputs. Therefore, in the embodiment of the present invention, after the source domain data is output through channel A, the classification loss value is calculated with the labels, and its error is back-propagated together with the domain adaptation loss. Since the target domain data has no labels, channel B performs no task-specific loss calculation or back propagation. Adaptation to the target domain is thereby achieved.
That is, the migration process of the model may be regarded as training channel A; the network layers of the trained channel A other than the adaptation layer constitute channel B, namely the network traffic identification model.
In an implementation manner of the embodiment of the present invention, as shown in fig. 3, the network traffic identification model of the target category includes: a first convolutional layer 301, a second convolutional layer 302, a fully-connected layer 303, and an output layer 304. The network traffic identification model of the target category identifies whether the network traffic to be identified is malicious traffic through the following steps:
Step (1), the first convolutional layer 301 convolves the message two-dimensional data matrix with a two-dimensional convolution kernel to obtain a first feature map.
Optionally, the first convolutional layer 301 may include one or more layers. A convolutional layer can extract features of its input data.
Step (2), the second convolutional layer 302 convolves the behavior two-dimensional data matrix with a two-dimensional convolution kernel to obtain a second feature map.
Optionally, the second convolutional layer 302 may include one or more layers. A convolutional layer can extract features of its input data.
In an alternative embodiment, the first convolutional layer 301 and the second convolutional layer 302 may share weights.
Step (3), the fully-connected layer 303 integrates the first feature map and the second feature map to obtain a third feature map.
Optionally, the fully-connected layer 303 may include one or more layers. The fully-connected layer 303 maps its input feature maps to the dimension of the number of categories.
In an implementation manner, the network traffic identification model in the embodiment of the present invention may be a model based on the convolutional neural network AlexNet, on the basis of which 5 fully-connected layers may be added after the convolutional layers, as shown in fig. 2.
Step (4), the output layer 304 processes the third feature map with a preset classification algorithm, and obtains and outputs whether the network traffic to be identified is malicious traffic.
In one embodiment, the output layer may employ a classification algorithm such as the Softmax algorithm, which normalizes the output of the model to determine whether the network traffic to be identified is normal traffic or malicious traffic.
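The normalization performed by a Softmax output layer can be sketched as follows. The two-element score layout, in which index 0 stands for malicious traffic and index 1 for normal traffic, follows the 0/1 output convention described earlier; the function names are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify(scores):
    """Return 0 (malicious) or 1 (normal), whichever is more probable."""
    probabilities = softmax(scores)
    return probabilities.index(max(probabilities))
```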
The technical scheme of the embodiment of the invention can also bring the following beneficial effects: the network flow identification model is combined with the static characteristics (message information) and the dynamic characteristics (communication behavior information) of the message to identify whether the network flow is malicious flow or not, so that the accuracy of model identification is improved.
In the embodiment of the present invention, the structure of the network traffic identification model is not limited to the structure shown in fig. 2 or fig. 3, and the structure of the network traffic identification model may be determined according to actual requirements. Examples of network traffic recognition models for other architectures are given below.
Optionally, in the network traffic identification model, a pooling layer may further be added after each convolutional layer. For example, as shown in fig. 4, the network traffic identification model includes: a first convolutional layer 401, a first pooling layer 402 after the first convolutional layer 401, a second convolutional layer 403, a second pooling layer 404 after the second convolutional layer 403, a fully-connected layer 405, and an output layer 406.
In an alternative embodiment, the first convolutional layer 401 and the second convolutional layer 403 may share weights; the first pooling layer 402 and the second pooling layer 404 may share weights.
Optionally, the network traffic identification model may further include a plurality of pooling layers. For example, as shown in fig. 5, the network traffic identification model includes: a first convolutional layer 501, a first pooling layer 502, a first convolutional layer 501, a second pooling layer 503, a second convolutional layer 504, a third pooling layer 505, a second convolutional layer 504, a fourth pooling layer 506, a fully-connected layer 507, and an output layer 508.
The fully-connected layer in the network traffic identification model can be connected with each convolutional layer and each pooling layer, so as to retain the results of each network layer to a greater extent.
Besides being based on a convolutional neural network, the network traffic identification model in the embodiment of the present invention may also be based on other neural networks, which is not specifically limited in the embodiment of the present invention.
The technical scheme of the embodiment of the invention can also bring the following beneficial effects: the preprocessed two-dimensional data matrix of the sample message and the preprocessed two-dimensional data matrix of the sample behavior corresponding to the same session are respectively used as input data of two sub-models, static characteristics and dynamic behavior characteristics of network flow are deeply learned through local perception and weight sharing, weight parameters of the models are independently learned during training, and accurate recognition results can be achieved.
In the embodiment of the present invention, after the step 103, network traffic may be further classified based on a black and white list mechanism. The preset white list includes a protocol type of trusted network traffic, the preset black list includes a protocol type of untrusted network traffic, such as a protocol type of attack or abnormal network traffic, and the gray list includes a protocol type that does not belong to either the white list or the black list.
The white list and the black list can be established based on traditional traffic identification methods such as port identification and Deep Packet Inspection (DPI) identification.
Optionally, the specific classification manner includes:
Step (I), determining that the protocol type of the network traffic to be identified is the target protocol type matched with the target category.
Step (II), if the target protocol type is a protocol type in the preset white list, determining that the network traffic to be identified is trusted network traffic.
Step (III), if the target protocol type is a protocol type in the preset black list, determining that the network traffic to be identified is untrusted network traffic.
Step (IV), if the target protocol type is neither a protocol type in the preset white list nor a protocol type in the preset black list, determining that the network traffic to be identified is unknown network traffic.
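The list-based classification above amounts to two membership tests, as in the sketch below. The function name `classify_by_lists` and the returned strings are hypothetical, and the protocol names in the usage example are placeholders only.

```python
def classify_by_lists(target_protocol, whitelist, blacklist):
    """Classify traffic by looking up its matched protocol type."""
    if target_protocol in whitelist:
        return "trusted"
    if target_protocol in blacklist:
        return "untrusted"
    # Neither list contains the protocol type, so the traffic is unknown.
    return "unknown"
```

For example, with a white list `{"proto_a"}` and a black list `{"proto_x"}`, traffic matched to `"proto_a"` would be trusted, traffic matched to `"proto_x"` untrusted, and anything else unknown.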
The scheme provided by the embodiment of the invention identifies whether the network traffic to be identified is malicious traffic. When a hacker initiates a network attack by means of network traffic, the network traffic may be modified to contain malicious attack code. The scheme provided by the embodiment of the invention can therefore identify network traffic containing malicious attack code, that is, abnormal or malicious network traffic, which gives the embodiment of the invention a wider application range.
Based on the same inventive concept, corresponding to the above method embodiment, an embodiment of the present invention provides a network traffic identification apparatus based on deep migration learning, and referring to fig. 6, the apparatus includes: a data acquisition module 601, a distance calculation module 602, a classification module 603, and a flow identification module 604.
The data acquisition module 601 is configured to extract message information and communication behavior information of a preset number of data packets from the network traffic to be identified, where the network traffic to be identified includes network traffic generated in a session establishment stage and network traffic transmitted based on the established session;
a distance calculation module 602, configured to calculate the distance between the message information and communication behavior information of the network traffic to be identified and the cluster center of each class cluster, where each class cluster includes the message information and communication behavior information of network traffic of one category;
a classification module 603, configured to obtain the target category of the class cluster corresponding to the shortest distance when the shortest of the calculated distances is smaller than a preset distance;
the traffic identification module 604 is configured to input the message two-dimensional data matrix corresponding to the message information and the behavior two-dimensional data matrix corresponding to the communication behavior information into the network traffic identification model of the target category, and determine whether the network traffic to be identified is malicious traffic;
the network traffic identification model of the target category is a model constructed by a deep migration learning method on the basis of a pre-training model corresponding to a target protocol type matched with the target category; the pre-training model corresponding to the target protocol type is a model obtained by training a deep learning model on a training sample set corresponding to the target protocol type; the training sample set corresponding to the target protocol type comprises: a sample two-dimensional data matrix of each sample network traffic of the target protocol type and the normal or malicious label corresponding to that sample network traffic, where the sample two-dimensional data matrix of each sample network traffic comprises: a sample message two-dimensional data matrix and a sample behavior two-dimensional data matrix constructed, respectively, from the message information and the communication behavior information of the preset number of data packets in the sample network traffic.
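As an illustrative sketch of the work done by the distance calculation module 602 and the classification module 603, the function below picks the closest cluster center and rejects the match when it is not within the preset distance. Euclidean distance, the flat feature-vector representation, and the name `nearest_category` are assumptions made for the example; this passage does not fix the distance metric.

```python
import math


def nearest_category(features, cluster_centers, max_distance):
    """Return the category of the closest cluster center, or None if it is too far.

    features:        feature vector built from message and behavior information.
    cluster_centers: dict mapping category -> center vector of its class cluster.
    max_distance:    the preset distance (assumed Euclidean here).
    """
    best_category, best_distance = None, math.inf
    for category, center in cluster_centers.items():
        distance = math.dist(features, center)
        if distance < best_distance:
            best_category, best_distance = category, distance
    # Only accept the match when the shortest distance is below the threshold.
    return best_category if best_distance < max_distance else None
```

A return value of `None` corresponds to traffic that matches no existing class cluster closely enough.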
Optionally, as shown in fig. 7, the apparatus further includes: a dividing module 605, an unknown protocol clustering module 606, a protocol type matching module 607 and an identification model constructing module 608;
the data acquisition module 601 is further configured to obtain sample information sets of known protocol types before extracting message information and communication behavior information of a preset number of data packets from the network traffic to be identified, where each sample information set of a known protocol type includes: the message information and the communication behavior information of a preset number of data packets in the sample network traffic of the known protocol type;
a dividing module 605, configured to divide pre-collected network traffic of an undetermined protocol type by taking a session as a unit, so as to obtain a plurality of unidentified network traffic;
the data acquisition module 601 is further configured to extract message information and communication behavior information of a preset number of data packets from each unidentified network traffic;
an unknown protocol clustering module 606, configured to cluster the message information and the communication behavior information of multiple unidentified network flows to obtain a cluster of each category;
the protocol type matching module 607 is configured to calculate, for each category, the maximum mean discrepancy (MMD) between the class cluster of the category and the sample information set of each known protocol type, determine the known protocol type whose sample information set has the minimum MMD with the class cluster of the category, and use the determined known protocol type as the protocol type matched with the category;
and the identification model construction module 608 is configured to construct a network traffic identification model of the category by a deep migration learning method based on a pre-training model corresponding to the protocol type matched with the category.
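The matching performed by the protocol type matching module 607 amounts to an arg-min over the known protocol types. The sketch below is illustrative only: it uses the simplest MMD variant, in which the feature map is the identity so that the MMD reduces to the distance between sample means, as a stand-in for whatever kernel the embodiment uses. The names `mean_vector`, `linear_mmd` and `match_protocol` are hypothetical.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def linear_mmd(a, b):
    """MMD with the identity feature map: distance between the two sample means."""
    return math.dist(mean_vector(a), mean_vector(b))


def match_protocol(cluster, sample_sets, mmd=linear_mmd):
    """Return the known protocol type whose sample set has the smallest MMD to the cluster.

    cluster:     feature vectors of the unidentified traffic in one class cluster.
    sample_sets: dict mapping known protocol type -> list of sample feature vectors.
    """
    return min(sample_sets, key=lambda protocol: mmd(cluster, sample_sets[protocol]))
```

Passing a different `mmd` callable, for instance a kernel-based estimate, changes the discrepancy measure without changing the matching logic.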
Optionally, the identification model construction module 608 is specifically configured to perform the following steps:
step one, inputting a sample two-dimensional data matrix of sample network traffic of a target protocol type into a pre-training model corresponding to the target protocol type;
step two, obtaining an output result of the pre-training model corresponding to the target protocol type;
step three, calculating a loss value according to the output result, the normal or malicious label corresponding to the sample network traffic of the target protocol type, and the MMD between the sample information set of the target protocol type and the class cluster of the target category;
step four, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has converged, determining that the network traffic identification model of the target category is the pre-training model corresponding to the target protocol type;
and step five, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has not converged, adjusting model parameters of the fully connected layer of the pre-training model corresponding to the target protocol type based on the loss value, and returning to step one.
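The five steps above form a fine-tuning loop in which only the fully connected layer is updated. The sketch below is a simplified illustration under several assumptions that are not stated in the embodiment: the frozen convolutional layers are represented by precomputed feature vectors, the trainable layer is a single logistic unit, the loss is cross-entropy plus a weighted MMD term (`lam`), the MMD is a precomputed scalar, and convergence means the loss changes by less than `tol`. The function name and all hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_fc(features, labels, weights, bias, mmd_value,
                 lam=0.1, lr=0.5, tol=1e-6, max_iters=1000):
    """Adjust only the fully connected layer (weights, bias) until the loss converges.

    features:  outputs of the frozen convolutional layers, one vector per sample.
    labels:    1 for malicious, 0 for normal.
    mmd_value: precomputed MMD between the sample information set and the cluster.
    """
    previous = math.inf
    eps = 1e-12  # keeps log() away from zero
    for _ in range(max_iters):
        # Steps one and two: forward pass through the trainable layer.
        outputs = [sigmoid(sum(w * x for w, x in zip(weights, f)) + bias)
                   for f in features]
        # Step three: cross-entropy plus the weighted MMD term.
        cross_entropy = -sum(
            y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
            for p, y in zip(outputs, labels)) / len(labels)
        loss = cross_entropy + lam * mmd_value
        # Step four: treat the model as converged once the loss stops changing.
        if abs(previous - loss) < tol:
            break
        previous = loss
        # Step five: gradient step on the fully connected layer only.
        errors = [p - y for p, y in zip(outputs, labels)]
        weights = [w - lr * sum(e * f[i] for e, f in zip(errors, features)) / len(labels)
                   for i, w in enumerate(weights)]
        bias -= lr * sum(errors) / len(labels)
    return weights, bias, loss
```

Because the MMD is held constant in this sketch, it shifts the loss value without affecting the gradient; a variant that computes the MMD on learned features would make the term trainable.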
Optionally, the MMD between the class cluster of a category and the sample information set of a known protocol type is calculated by the following formula:

$$\mathrm{MMD}(D_{ti}, D_{sk}) = \left\| \frac{1}{n_{ti}} \sum_{x \in D_{ti}} \phi(x) - \frac{1}{n_{sk}} \sum_{x' \in D_{sk}} \phi(x') \right\|_{\mathcal{H}}$$

wherein $D_{ti}$ is the class cluster of category $i$, $D_{sk}$ is the sample information set of known protocol type $k$, $n_{ti}$ is the number of unidentified network traffic corresponding to $D_{ti}$, $n_{sk}$ is the number of sample network traffic corresponding to $D_{sk}$, and $\|\cdot\|_{\mathcal{H}}$ denotes the distance measured after $\phi(\cdot)$ maps the data into the reproducing kernel Hilbert space (RKHS).
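A distance between mean embeddings in an RKHS can be evaluated without computing the mapping explicitly, by expanding the squared norm into kernel evaluations. The sketch below is illustrative only: the Gaussian (RBF) kernel, the `gamma` parameter, and the function names are assumptions, since this passage does not name a kernel.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd(xs, ys, kernel=rbf_kernel):
    """Biased empirical MMD between two samples, using the kernel trick.

    ||mean(phi(x)) - mean(phi(y))||^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    """
    def mean_kernel(a, b):
        return sum(kernel(p, q) for p in a for q in b) / (len(a) * len(b))

    squared = mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))
```

Two samples drawn from similar distributions give a value near zero, and the value grows as the distributions move apart.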
Optionally, referring to fig. 7, the apparatus further includes: a data matrix construction module 609;
a data matrix construction module 609, configured to, after the network traffic identification model of the category is constructed by the deep migration learning method on the basis of the pre-training model corresponding to the protocol type matched with the category, construct, for each unidentified network traffic of the category, a message two-dimensional data matrix according to the message information of the unidentified network traffic and a behavior two-dimensional data matrix according to the communication behavior information of the unidentified network traffic;
the traffic identification module 604 is further configured to input the constructed two-dimensional data matrix of the packet and the two-dimensional data matrix of the behavior into the network traffic identification model of the category, and determine whether the unrecognized network traffic is malicious traffic.
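As an illustrative sketch of how a two-dimensional data matrix might be built, the helper below places one data packet per row and truncates or zero-pads each row to a fixed width. The row-per-packet layout, the zero padding, and the name `build_matrix` are assumptions for the example; the exact matrix layout is not specified in this passage.

```python
def build_matrix(rows, n_rows, n_cols, pad=0):
    """Truncate or pad `rows` into an n_rows x n_cols matrix.

    rows:   one sequence per data packet, e.g. raw bytes for the message matrix
            or numeric behavior features for the behavior matrix.
    n_rows: the preset number of data packets.
    n_cols: the number of values kept per packet.
    """
    matrix = []
    for i in range(n_rows):
        # Missing packets become all-padding rows.
        row = list(rows[i][:n_cols]) if i < len(rows) else []
        matrix.append(row + [pad] * (n_cols - len(row)))
    return matrix
```

For instance, `build_matrix([b"\x16\x03\x01", b"\x17"], 3, 4)` turns two short packets into a 3 x 4 matrix of byte values with zero padding.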
Optionally, the network traffic identification model of the target category includes: a first convolution layer, a second convolution layer, a fully connected layer and an output layer; the traffic identification module 604 is specifically configured such that:
the first convolution layer convolves the message two-dimensional data matrix with a two-dimensional convolution kernel to obtain a first feature map;
the second convolution layer convolves the behavior two-dimensional data matrix with a two-dimensional convolution kernel to obtain a second feature map;
the fully connected layer integrates the first feature map and the second feature map to obtain a third feature map;
and the output layer processes the third feature map with a preset classification algorithm to obtain and output whether the network traffic to be identified is malicious traffic.
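The forward pass through the two convolution branches, the fully connected layer, and the output layer can be sketched as follows. This is an illustrative sketch under stated assumptions: one kernel per branch, ReLU activation, a fully connected layer that produces two scores directly, and softmax as the preset classification algorithm. None of these choices, nor the `params` dictionary keys, come from the embodiment.

```python
import math


def conv2d(matrix, kernel):
    """Valid 2-D cross-correlation, the operation CNN libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [[sum(matrix[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu_flatten(feature_map):
    """Apply ReLU and flatten a feature map into a vector."""
    return [max(v, 0.0) for row in feature_map for v in row]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def identify(message_matrix, behavior_matrix, params):
    """Return (is_malicious, probabilities) for one network traffic sample."""
    first = relu_flatten(conv2d(message_matrix, params["message_kernel"]))
    second = relu_flatten(conv2d(behavior_matrix, params["behavior_kernel"]))
    merged = first + second  # the fully connected layer sees both feature maps
    third = [sum(w * x for w, x in zip(row, merged)) + b
             for row, b in zip(params["fc_weights"], params["fc_bias"])]
    probs = softmax(third)  # [P(normal), P(malicious)]
    return probs[1] > probs[0], probs
```

Keeping the two branches separate until the fully connected layer lets the message content and the communication behavior be filtered with different kernels before they are combined.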
Optionally, as shown in fig. 7, the apparatus further includes: a flow determination module 610, the flow determination module 610 configured to:
after the target class of the class cluster corresponding to the shortest distance is obtained, determining the protocol type of the network traffic to be identified as a target protocol type matched with the target class;
if the target protocol type is the protocol type in a preset white list, determining that the network traffic to be identified is the trusted network traffic, wherein the preset white list comprises the protocol type of the trusted network traffic;
and if the target protocol type is the protocol type in a preset blacklist, determining that the network traffic to be identified is the untrustworthy network traffic, wherein the preset blacklist comprises the protocol type of the untrustworthy network traffic.
An embodiment of the present invention further provides an electronic device, as shown in fig. 8, which includes a processor 801, a communication interface 802, a memory 803, and a communication bus 804, where the processor 801, the communication interface 802, and the memory 803 communicate with one another through the communication bus 804;
a memory 803 for storing a computer program;
the processor 801 is configured to implement the method steps in the above-described method embodiments when executing the program stored in the memory 803.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above-mentioned deep migration learning-based network traffic identification methods.
In yet another embodiment, a computer program product containing instructions is provided, which when run on a computer, causes the computer to execute any of the above-mentioned network traffic identification methods based on deep migration learning.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)).
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A network traffic identification method based on deep migration learning is characterized by comprising the following steps:
extracting message information and communication behavior information of a preset number of data packets from network traffic to be identified, wherein the network traffic to be identified comprises network traffic generated in a session establishment stage and network traffic transmitted based on the established session;
calculating the distance between the message information and the communication behavior information of the network flow to be identified and the clustering center of each cluster, wherein each cluster comprises the message information and the communication behavior information of the network flow of one category;
when the shortest distance in the calculated distances is smaller than a preset distance, obtaining a target category of a category cluster corresponding to the shortest distance;
inputting a message two-dimensional data matrix corresponding to the message information and a behavior two-dimensional data matrix corresponding to the behavior information into the network traffic identification model of the target category, and determining whether the network traffic to be identified is malicious traffic;
the network traffic identification model of the target category is a model constructed by a deep migration learning method on the basis of a pre-training model corresponding to a target protocol type matched with the target category; the pre-training model corresponding to the target protocol type is a model obtained by training a deep learning model on a training sample set corresponding to the target protocol type; the training sample set corresponding to the target protocol type comprises: a sample two-dimensional data matrix of each sample network traffic of the target protocol type and the normal or malicious label corresponding to that sample network traffic, wherein the sample two-dimensional data matrix of each sample network traffic comprises: a sample message two-dimensional data matrix and a sample behavior two-dimensional data matrix constructed, respectively, from the message information and the communication behavior information of the preset number of data packets in the sample network traffic.
2. The method according to claim 1, wherein before the extracting message information and communication behavior information of a preset number of data packets from the network traffic to be identified, the method further comprises:
obtaining sample information sets of known protocol types, each sample information set of a known protocol type comprising: the message information and the communication behavior information of a preset number of data packets in the sample network traffic of the known protocol type;
dividing the pre-collected network traffic of undetermined protocol types by taking a session as a unit to obtain a plurality of unidentified network traffic;
extracting message information and communication behavior information of a preset number of data packets from each unidentified network flow;
clustering the message information and the communication behavior information of the plurality of unidentified network flows to obtain a cluster of each category;
for each category, calculating the maximum mean discrepancy (MMD) between the class cluster of the category and the sample information set of each known protocol type, determining the known protocol type whose sample information set has the minimum MMD with the class cluster of the category, and taking the determined known protocol type as the protocol type matched with the category;
and constructing a network traffic identification model of the category by a deep migration learning method on the basis of a pre-training model corresponding to the protocol type matched with the category.
3. The method of claim 2, wherein the network traffic identification model of the target category is constructed by the following steps:
step one, inputting a sample two-dimensional data matrix of the sample network traffic of the target protocol type into the pre-training model corresponding to the target protocol type;
step two, obtaining an output result of the pre-training model corresponding to the target protocol type;
step three, calculating a loss value according to the output result, the normal or malicious label corresponding to the sample network traffic of the target protocol type, and the MMD between the sample information set of the target protocol type and the class cluster of the target category;
step four, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has converged, determining that the network traffic identification model of the target category is the pre-training model corresponding to the target protocol type;
and step five, if it is determined based on the loss value that the pre-training model corresponding to the target protocol type has not converged, adjusting model parameters of the fully connected layer of the pre-training model corresponding to the target protocol type based on the loss value, and returning to step one.
4. A method according to claim 2 or 3, characterized in that the MMD between the class cluster of a category and the sample information set of a known protocol type is calculated by the following formula:

$$\mathrm{MMD}(D_{ti}, D_{sk}) = \left\| \frac{1}{n_{ti}} \sum_{x \in D_{ti}} \phi(x) - \frac{1}{n_{sk}} \sum_{x' \in D_{sk}} \phi(x') \right\|_{\mathcal{H}}$$

wherein $D_{ti}$ is the class cluster of category $i$, $D_{sk}$ is the sample information set of known protocol type $k$, $n_{ti}$ is the number of unidentified network traffic corresponding to $D_{ti}$, $n_{sk}$ is the number of sample network traffic corresponding to $D_{sk}$, and $\|\cdot\|_{\mathcal{H}}$ denotes the distance measured after $\phi(\cdot)$ maps the data into the reproducing kernel Hilbert space (RKHS).
5. The method according to claim 2, wherein after the network traffic identification model of the category is constructed by the deep migration learning method on the basis of the pre-training model corresponding to the protocol type matched with the category, the method further comprises:
aiming at each unidentified network flow of the category, constructing a message two-dimensional data matrix according to the message information of the unidentified network flow, and constructing a behavior two-dimensional data matrix according to the communication behavior information of the unidentified network flow;
and inputting the constructed message two-dimensional data matrix and the behavior two-dimensional data matrix into the network traffic identification model of the category, and determining whether the unidentified network traffic is malicious traffic.
6. The method of claim 1, wherein the network traffic identification model of the target category comprises: a first convolution layer, a second convolution layer, a fully connected layer and an output layer; the network traffic identification model of the target category identifies whether the network traffic to be identified is malicious traffic through the following steps:
the first convolution layer convolves the message two-dimensional data matrix with a two-dimensional convolution kernel to obtain a first feature map;
the second convolution layer convolves the behavior two-dimensional data matrix with a two-dimensional convolution kernel to obtain a second feature map;
the fully connected layer integrates the first feature map and the second feature map to obtain a third feature map;
and the output layer processes the third feature map with a preset classification algorithm to obtain and output whether the network traffic to be identified is malicious traffic.
7. The method according to claim 1, wherein after the obtaining the target class of the class cluster corresponding to the shortest distance, the method further comprises:
determining that the protocol type of the network traffic to be identified is the target protocol type matched with the target category;
if the target protocol type is a protocol type in a preset white list, determining that the network traffic to be identified is trusted network traffic, wherein the preset white list comprises the protocol type of the trusted network traffic;
and if the target protocol type is a protocol type in a preset blacklist, determining that the network traffic to be identified is untrusted network traffic, wherein the preset blacklist comprises the protocol type of the untrusted network traffic.
8. An apparatus for identifying network traffic based on deep migration learning, the apparatus comprising:
the data acquisition module is used for extracting message information and communication behavior information of a preset number of data packets from network traffic to be identified, wherein the network traffic to be identified comprises network traffic generated in a session establishment stage and network traffic transmitted based on the established session;
the distance calculation module is used for calculating the distance between the message information and the communication behavior information of the network flow to be identified and the clustering center of each cluster, and each cluster comprises the message information and the communication behavior information of one type of network flow;
the classification module is used for obtaining the target class of the class cluster corresponding to the shortest distance when the shortest distance in the calculated distances is smaller than a preset distance;
the traffic identification module is used for inputting the message two-dimensional data matrix corresponding to the message information and the behavior two-dimensional data matrix corresponding to the behavior information into the network traffic identification model of the target category, and determining whether the network traffic to be identified is malicious traffic;
the network traffic identification model of the target category is a model constructed by a deep migration learning method on the basis of a pre-training model corresponding to a target protocol type matched with the target category; the pre-training model corresponding to the target protocol type is a model obtained by training a deep learning model on a training sample set corresponding to the target protocol type; the training sample set corresponding to the target protocol type comprises: a sample two-dimensional data matrix of each sample network traffic of the target protocol type and the normal or malicious label corresponding to that sample network traffic, wherein the sample two-dimensional data matrix of each sample network traffic comprises: a sample message two-dimensional data matrix and a sample behavior two-dimensional data matrix constructed, respectively, from the message information and the communication behavior information of the preset number of data packets in the sample network traffic.
9. The apparatus of claim 8, further comprising: a dividing module, an unknown protocol clustering module, a protocol type matching module and an identification model construction module;
the data acquisition module is further configured to obtain sample information sets of known protocol types before extracting message information and communication behavior information of a preset number of data packets from the network traffic to be identified, where each sample information set of a known protocol type includes: the message information and the communication behavior information of a preset number of data packets in the sample network traffic of the known protocol type;
the dividing module is used for dividing the pre-collected network traffic of undetermined protocol type by taking a session as a unit to obtain a plurality of unidentified network traffic;
the data acquisition module is also used for extracting message information and communication behavior information of a preset number of data packets from each unidentified network flow;
the unknown protocol clustering module is used for clustering the message information and the communication behavior information of the plurality of unidentified network flows to obtain clusters of various categories;
the protocol type matching module is used for calculating, for each category, the maximum mean discrepancy (MMD) between the class cluster of the category and the sample information set of each known protocol type, determining the known protocol type whose sample information set has the minimum MMD with the class cluster of the category, and taking the determined known protocol type as the protocol type matched with the category;
the identification model construction module is used for constructing the network traffic identification model of the category by a deep migration learning method on the basis of a pre-training model corresponding to the protocol type matched with the category.
10. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 7 when executing a program stored in the memory.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011042795.4A CN112235264B (en) | 2020-09-28 | 2020-09-28 | Network traffic identification method and device based on deep migration learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112235264A (en) | 2021-01-15 |
| CN112235264B (en) | 2022-10-14 |
Family
ID=74120864
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202011042795.4A Expired - Fee Related CN112235264B (en) | 2020-09-28 | 2020-09-28 | Network traffic identification method and device based on deep migration learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112235264B (en) |
Cited By (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113486354A (en) * | 2021-08-20 | 2021-10-08 | 国网山东省电力公司电力科学研究院 | Firmware safety evaluation method, system, medium and electronic equipment |
| CN113516231A (en) * | 2021-08-10 | 2021-10-19 | 大连海事大学 | A DSN-based Deep Adversarial Transfer Network for Recognition of Daily Behavior Transfer |
| CN113762377A (en) * | 2021-09-02 | 2021-12-07 | 北京恒安嘉新安全技术有限公司 | Network traffic identification method, device, equipment and storage medium |
| CN114254319A (en) * | 2021-12-13 | 2022-03-29 | 安天科技集团股份有限公司 | Network virus identification method and device, computer equipment and storage medium |
| CN114297542A (en) * | 2021-12-28 | 2022-04-08 | 杭州迪普科技股份有限公司 | Method, device, terminal and medium for recognizing traffic intention based on transfer learning |
| CN114358170A (en) * | 2021-12-30 | 2022-04-15 | 国网宁夏电力有限公司电力科学研究院 | Application type identification method and device based on flow characteristics |
| CN114640611A (en) * | 2022-03-09 | 2022-06-17 | 西安电子科技大学 | Unknown heterogeneous industrial protocol detection and identification method, system, equipment and medium |
| CN114724069A (en) * | 2022-04-09 | 2022-07-08 | 北京天防安全科技有限公司 | Video equipment model confirming method, device, equipment and medium |
| CN114884894A (en) * | 2022-04-18 | 2022-08-09 | 南京邮电大学 | Semi-supervised network traffic classification method based on transfer learning |
| CN114970680A (en) * | 2022-04-26 | 2022-08-30 | 北京科技大学 | CNN + LSTM-based flow terminal real-time identification method and device |
| CN115022049A (en) * | 2022-06-06 | 2022-09-06 | 哈尔滨工业大学 | A method, electronic device and storage medium for detecting out-of-distribution network traffic data based on calculating Mahalanobis distance |
| CN115037641A (en) * | 2022-06-01 | 2022-09-09 | 网络通信与安全紫金山实验室 | Network traffic detection method and device based on small samples, electronic equipment and medium |
| CN115134176A (en) * | 2022-09-02 | 2022-09-30 | 南京航空航天大学 | Hidden network encrypted traffic classification method based on incomplete supervision |
| CN115277587A (en) * | 2022-07-29 | 2022-11-01 | 中国电信股份有限公司 | Network traffic identification method, device, equipment and medium |
| CN115643182A (en) * | 2022-10-10 | 2023-01-24 | 北京百度网讯科技有限公司 | Flow detection method and device and electronic equipment |
| CN115842788A (en) * | 2021-09-16 | 2023-03-24 | 中国移动通信集团辽宁有限公司 | Flow identification method, device and equipment and computer storage medium |
| CN116896514A (en) * | 2023-07-19 | 2023-10-17 | 上海螣龙科技有限公司 | Network asset identification method, device, equipment and medium based on deep learning |
| CN116915720A (en) * | 2023-09-12 | 2023-10-20 | 武汉烽火凯卓科技有限公司 | Internet of things equipment flow identification method and system, electronic equipment and storage medium |
| CN119025972A (en) * | 2024-08-22 | 2024-11-26 | 北京赋乐科技有限公司 | Method, product and equipment for classifying and identifying bad applications based on traffic time series characteristics |
| CN119107111A (en) * | 2024-09-03 | 2024-12-10 | 广东联想懂的通信有限公司 | E-SIM user portrait generation method and system |
| CN119202804A (en) * | 2024-11-26 | 2024-12-27 | 南京信息工程大学 | A Tor network traffic perception method based on impulse sequence response |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150085678A1 (en) * | 2013-09-23 | 2015-03-26 | Calix, Inc. | Distributed system and method for flow identification in an access network |
| US20150128263A1 (en) * | 2013-11-07 | 2015-05-07 | Cyberpoint International, LLC | Methods and systems for malware detection |
| CN105553998A (en) * | 2015-12-23 | 2016-05-04 | 中国电子科技集团公司第三十研究所 | Network attack abnormality detection method |
| CN108040073A (en) * | 2018-01-23 | 2018-05-15 | 杭州电子科技大学 | Malicious attack detection method based on deep learning in information physical traffic system |
| CN109299742A (en) * | 2018-10-17 | 2019-02-01 | 深圳信息职业技术学院 | Method, device, device and storage medium for automatically discovering unknown network flow |
| CN109815339A (en) * | 2019-01-02 | 2019-05-28 | 平安科技(深圳)有限公司 | Knowledge extraction method, device, computer equipment and storage medium based on TextCNN |
| CN109995611A (en) * | 2019-03-18 | 2019-07-09 | 新华三信息安全技术有限公司 | Traffic classification model foundation and traffic classification method, apparatus, equipment and server |
| CN111031071A (en) * | 2019-12-30 | 2020-04-17 | 杭州迪普科技股份有限公司 | Malicious traffic identification method and device, computer equipment and storage medium |
| WO2020119481A1 (en) * | 2018-12-11 | 2020-06-18 | 深圳先进技术研究院 | Network traffic classification method and system based on deep learning, and electronic device |
- 2020-09-28: CN application CN202011042795.4A, patent CN112235264B, not active (Expired - Fee Related)
Cited By (33)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113516231A (en) * | 2021-08-10 | 2021-10-19 | 大连海事大学 | A DSN-based Deep Adversarial Transfer Network for Recognition of Daily Behavior Transfer |
| CN113516231B (en) * | 2021-08-10 | 2024-03-29 | 大连海事大学 | A daily behavior migration recognition method based on DSN deep adversarial migration network |
| CN113486354A (en) * | 2021-08-20 | 2021-10-08 | 国网山东省电力公司电力科学研究院 | Firmware safety evaluation method, system, medium and electronic equipment |
| CN113762377A (en) * | 2021-09-02 | 2021-12-07 | 北京恒安嘉新安全技术有限公司 | Network traffic identification method, device, equipment and storage medium |
| CN113762377B (en) * | 2021-09-02 | 2024-03-08 | 北京恒安嘉新安全技术有限公司 | Network traffic identification method, device, equipment and storage medium |
| CN115842788A (en) * | 2021-09-16 | 2023-03-24 | 中国移动通信集团辽宁有限公司 | Flow identification method, device and equipment and computer storage medium |
| CN115842788B (en) * | 2021-09-16 | 2025-04-25 | 中国移动通信集团辽宁有限公司 | A flow identification method, device, equipment and computer storage medium |
| CN114254319A (en) * | 2021-12-13 | 2022-03-29 | 安天科技集团股份有限公司 | Network virus identification method and device, computer equipment and storage medium |
| CN114297542A (en) * | 2021-12-28 | 2022-04-08 | 杭州迪普科技股份有限公司 | Method, device, terminal and medium for recognizing traffic intention based on transfer learning |
| CN114358170A (en) * | 2021-12-30 | 2022-04-15 | 国网宁夏电力有限公司电力科学研究院 | Application type identification method and device based on flow characteristics |
| CN114640611A (en) * | 2022-03-09 | 2022-06-17 | 西安电子科技大学 | Unknown heterogeneous industrial protocol detection and identification method, system, equipment and medium |
| CN114724069A (en) * | 2022-04-09 | 2022-07-08 | 北京天防安全科技有限公司 | Video equipment model confirming method, device, equipment and medium |
| CN114884894A (en) * | 2022-04-18 | 2022-08-09 | 南京邮电大学 | Semi-supervised network traffic classification method based on transfer learning |
| CN114884894B (en) * | 2022-04-18 | 2023-10-20 | 南京邮电大学 | Semi-supervised network traffic classification method based on transfer learning |
| CN114970680A (en) * | 2022-04-26 | 2022-08-30 | 北京科技大学 | CNN + LSTM-based flow terminal real-time identification method and device |
| CN115037641B (en) * | 2022-06-01 | 2024-05-03 | 网络通信与安全紫金山实验室 | Network traffic detection method, device, electronic device and medium based on small sample |
| CN115037641A (en) * | 2022-06-01 | 2022-09-09 | 网络通信与安全紫金山实验室 | Network traffic detection method and device based on small samples, electronic equipment and medium |
| CN115022049A (en) * | 2022-06-06 | 2022-09-06 | 哈尔滨工业大学 | A method, electronic device and storage medium for detecting out-of-distribution network traffic data based on calculating Mahalanobis distance |
| CN115022049B (en) * | 2022-06-06 | 2024-05-14 | 哈尔滨工业大学 | A method for detecting out-of-distribution network traffic data based on calculating Mahalanobis distance, electronic device and storage medium |
| CN115277587B (en) * | 2022-07-29 | 2023-10-31 | 中国电信股份有限公司 | Network traffic identification method, device, equipment and medium |
| CN115277587A (en) * | 2022-07-29 | 2022-11-01 | 中国电信股份有限公司 | Network traffic identification method, device, equipment and medium |
| CN115134176B (en) * | 2022-09-02 | 2022-11-29 | 南京航空航天大学 | A Classification Method of Darknet Encrypted Traffic Based on Incomplete Supervision |
| CN115134176A (en) * | 2022-09-02 | 2022-09-30 | 南京航空航天大学 | Hidden network encrypted traffic classification method based on incomplete supervision |
| CN115643182A (en) * | 2022-10-10 | 2023-01-24 | 北京百度网讯科技有限公司 | Flow detection method and device and electronic equipment |
| CN116896514A (en) * | 2023-07-19 | 2023-10-17 | 上海螣龙科技有限公司 | Network asset identification method, device, equipment and medium based on deep learning |
| CN116896514B (en) * | 2023-07-19 | 2024-04-09 | 上海螣龙科技有限公司 | Network asset identification method, device, equipment and medium based on deep learning |
| CN116915720B (en) * | 2023-09-12 | 2023-12-01 | 武汉烽火凯卓科技有限公司 | Internet of things equipment flow identification method and system, electronic equipment and storage medium |
| CN116915720A (en) * | 2023-09-12 | 2023-10-20 | 武汉烽火凯卓科技有限公司 | Internet of things equipment flow identification method and system, electronic equipment and storage medium |
| CN119025972A (en) * | 2024-08-22 | 2024-11-26 | 北京赋乐科技有限公司 | Method, product and equipment for classifying and identifying bad applications based on traffic time series characteristics |
| CN119107111A (en) * | 2024-09-03 | 2024-12-10 | 广东联想懂的通信有限公司 | E-SIM user portrait generation method and system |
| CN119107111B (en) * | 2024-09-03 | 2025-02-25 | 广东联想懂的通信有限公司 | E-SIM user portrait generation method and system |
| CN119202804A (en) * | 2024-11-26 | 2024-12-27 | 南京信息工程大学 | A Tor network traffic perception method based on impulse sequence response |
| CN119202804B (en) * | 2024-11-26 | 2025-05-09 | 南京信息工程大学 | A Tor network traffic perception method based on impulse sequence response |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112235264B (en) | 2022-10-14 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112235264B (en) | | Network traffic identification method and device based on deep migration learning |
| CN112003870A (en) | | Network encryption traffic identification method and device based on deep learning |
| CN112839034B (en) | | A Network Intrusion Detection Method Based on CNN-GRU Hierarchical Neural Network |
| Min et al. | | TR‐IDS: Anomaly‐based intrusion detection through text‐convolutional neural network and random forest |
| CN111866024B (en) | | Network encryption traffic identification method and device |
| Wang et al. | | Real network traffic collection and deep learning for mobile app identification |
| WO2022227388A1 (en) | | Log anomaly detection model training method, apparatus and device |
| CN113469366B (en) | | Encrypted traffic identification method, device and equipment |
| CN112165484B (en) | | Network encryption traffic identification method and device based on deep learning and side channel analysis |
| CN109284606A (en) | | Data flow anomaly detection system based on empirical characteristics and convolutional neural network |
| WO2023185539A1 (en) | | Machine learning model training method, service data processing method, apparatuses, and systems |
| CN113015167B (en) | | Encrypted flow data detection method, system, electronic device and storage medium |
| CN108768883A (en) | | A kind of network flow identification method and device |
| CN113821793B (en) | | Multi-stage attack scene construction method and system based on graph convolution neural network |
| CN111224941A (en) | | Threat type identification method and device |
| CN119420712B (en) | | Data access method and system of intelligent Internet of things gateway |
| CN112364304B (en) | | Method and device for detecting solar erosion attack of block chain |
| CN112671739B (en) | | Node property identification method of distributed system |
| CN112468324A (en) | | Graph convolution neural network-based encrypted traffic classification method and device |
| CN113810372A (en) | | A low-throughput DNS covert channel detection method and device |
| Zheng et al. | | Preprocessing method for encrypted traffic based on semisupervised clustering |
| Koniki et al. | | An anomaly based network intrusion detection system using LSTM and GRU |
| CN115333802B (en) | | Malicious program detection method and system based on neural network |
| CN112822208A (en) | | Internet of things equipment identification method and system based on block chain |
| CN114826681A (en) | | DGA domain name detection method, system, medium, equipment and terminal |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20221014 |