Background
In recent years, deep learning has achieved remarkable success in a number of challenging areas such as computer vision and natural language processing. In practical applications, deep learning is usually applied to large data sets, and the data sets formed from data collected in daily life inevitably contain a large amount of noise, including adversarial noise and natural noise. While such noise has no impact on human cognition and object recognition, it can mislead deep neural networks into making wrong decisions, which poses a serious security threat to the practical use of machine learning in the digital and physical world.
Meanwhile, the fact that tiny noise can cause a deep neural network to make completely wrong decisions, together with the need to understand the basis on which a deep model classifies and judges, highlights the importance of interpretable deep learning. Therefore, training robust and interpretable deep neural networks has received much attention in recent research.
As is well known, the instability of a deep learning model to noise generally manifests as a sudden change in the feature map of some hidden layer or in the activation value of some neuron as the noise propagates forward through the network; the stability of each neural unit, and of the paths composed of such units, is therefore important.
Disclosure of Invention
The primary technical problem to be solved by the present invention is to provide a method for determining a key attack path in a neural network.
Another technical problem to be solved by the present invention is to provide an apparatus for determining a key attack path in a neural network.
In order to achieve the above purpose, the present invention adopts the following technical solutions:
According to a first aspect of the embodiments of the present invention, there is provided a method for determining a key attack path in a neural network, which includes the following steps:
inputting each normal sample into the neural network and, after each layer of the neural network has produced its output, executing the following operations to obtain the sample-level key attack units:
finding the key attack units of the last layer from among the neurons of the last layer of the neural network according to the loss function of the neural network;
obtaining the key attack units of the previous layer based on the influence of each neuron in the previous layer of the neural network on the key attack units of the next layer;
and aggregating, over all of the input normal samples, the sample-level key attack units of each layer, layer by layer, to obtain the model-level key attack path of the neural network.
Preferably, the finding of the key attack units of the last layer from among the neurons of the last layer of the neural network according to the loss function of the neural network specifically includes:
computing the gradient of the loss function with respect to each neuron of the last layer;
and selecting the first k neurons whose gradients meet a preset criterion as the key attack units of the last layer.
Preferably, the gradient is expressed as:
g_L^m = ∂J/∂Z_L^m (1)
where J represents the loss function of the neural network, Z_L^m represents the output of the m-th neuron of the last layer L of the neural network, and g_L^m represents the gradient of the loss function with respect to neuron m.
Preferably, the key attack units of the last layer are given by:
top-k(F_L, g_L) (2)
where g_L represents the gradients of all neurons in the last layer L of the neural network, F_L represents the last layer of the neural network, and top-k(·) represents picking k elements from the set according to a preset criterion.
Preferably, the obtaining of the key attack units of the previous layer based on the influence of each neuron in the previous layer of the neural network on the key attack units of the next layer includes:
for the i-th neuron of layer l-1 of the neural network:
calculating the influence of the i-th neuron of layer l-1 on a certain element in the output of a certain key attack unit of layer l;
adding up the influence of the i-th neuron of layer l-1 on each element in the output of that key attack unit of layer l to obtain the influence of the i-th neuron of layer l-1 on that key attack unit of layer l;
and selecting, from the influences of the neurons of layer l-1 on the key attack units of layer l, the first k neurons whose influence meets a preset criterion as the key attack units of layer l-1.
Preferably, the influence of the i-th neuron of layer l-1 on a certain element in the output of a certain key attack unit of layer l is denoted s_{l-1}^{i,j,z} in formula (3), which represents the influence of the output Z_{l-1}^i of the i-th neuron of layer l-1 on the z-th element of the output Z_l^j of the key attack unit j of layer l.
Preferably, the influence of the i-th neuron of layer l-1 on a certain key attack unit of layer l is expressed as:
s_{l-1}^{i,j} = Σ_z s_{l-1}^{i,j,z} (4)
where s_{l-1}^{i,j} represents the influence of the output Z_{l-1}^i of the i-th neuron of layer l-1 on all elements of the output Z_l^j of the key attack unit j of layer l.
Preferably, the key attack units of layer l-1 are given by:
top-k(F_{l-1}, s_{l-1}) (5)
where top-k(·) indicates that k elements are selected from the set according to a preset criterion and F_l denotes the l-th layer of the neural network; the influence s_{l-1}^i of the i-th neuron of layer l-1 on the key attack units of layer l is expressed as:
s_{l-1}^i = Σ_j s_{l-1}^{i,j} (6)
where F_l^j represents the j-th key attack unit of layer l and the summation runs over the key attack units of layer l.
Preferably, the aggregating, over all of the input normal samples, of the sample-level key attack units of each layer, layer by layer, to obtain the model-level key attack path of the neural network specifically comprises:
for each normal sample, aggregating the sample-level key attack units of each layer, layer by layer, to obtain the path of that normal sample;
counting, for each layer of the neural network, the frequency with which each sample-level key attack unit appears in the paths of the normal samples;
taking, in each layer of the neural network, the k sample-level key attack units whose frequencies meet a preset condition as the key attack path of that layer;
and aggregating the key attack paths of the layers of the neural network, layer by layer, to form the key attack path of the neural network.
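For clarity of exposition only, the overall flow of the steps described above can be sketched in Python as follows. The helper names (last_layer_key_units, previous_layer_key_units, sample_level_path, model_level_path) are hypothetical and their bodies are placeholders; concrete sketches of the individual computations are given in the detailed description below.

```python
# Structural sketch only: it fixes the order of the claimed steps, not a
# concrete implementation (placeholder returns stand in for the real computations).
from collections import Counter
from typing import List

def last_layer_key_units(sample, k: int) -> List[int]:
    # placeholder: would select the top-k last-layer neurons by loss-function gradient
    return list(range(k))

def previous_layer_key_units(sample, next_layer_units: List[int], k: int) -> List[int]:
    # placeholder: would select the top-k neurons of the previous layer by their
    # influence on the key attack units of the next layer
    return list(range(k))

def sample_level_path(sample, num_layers: int, k: int) -> List[List[int]]:
    units = last_layer_key_units(sample, k)
    path = [units]
    for _ in range(num_layers - 1):          # walk backwards through the layers
        units = previous_layer_key_units(sample, units, k)
        path.append(units)
    return path[::-1]                        # ordered from the first layer to the last

def model_level_path(samples, num_layers: int, k: int) -> List[List[int]]:
    paths = [sample_level_path(s, num_layers, k) for s in samples]
    model_path = []
    for l in range(num_layers):              # per-layer frequency aggregation over samples
        freq = Counter(unit for p in paths for unit in p[l])
        model_path.append([unit for unit, _ in freq.most_common(k)])
    return model_path

print(model_level_path(samples=["x1", "x2", "x3"], num_layers=4, k=2))
```

The sketch reflects the structure of the method: a backward, layer-by-layer selection of key attack units for each normal sample, followed by a per-layer frequency aggregation over all samples.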
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for determining a key attack path in a neural network, including a processor and a memory, where the processor reads a computer program in the memory to perform the following operations:
inputting each normal sample into the neural network and, after each layer of the neural network has produced its output, executing the following operations to obtain the sample-level key attack units:
finding the key attack units of the last layer from among the neurons of the last layer of the neural network according to the loss function of the neural network;
obtaining the key attack units of the previous layer based on the influence of each neuron in the previous layer of the neural network on the key attack units of the next layer;
and aggregating, over all of the input normal samples, the sample-level key attack units of each layer, layer by layer, to obtain the model-level key attack path of the neural network.
The present invention provides the concept of a key attack path, aims to reveal, through this path, how noise is transmitted and amplified in a model, and provides a technical basis for improving the robustness of the model through this path. According to the invention, the key attack units of each layer are found from the influence of the neurons of each layer on the neurons of the next layer, so that the key attack path of the neural network is obtained.
Detailed Description
The technical contents of the invention are described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the method for determining a key attack path in a neural network according to an embodiment of the present invention includes:
inputting each normal sample into the neural network and, after each layer of the neural network has produced its output, executing steps 101 to 102 to obtain the sample-level key attack units:
101. finding the key attack units of the last layer from among the neurons of the last layer of the neural network according to the loss function of the neural network;
102. obtaining the key attack units of the previous layer based on the influence of each neuron in the previous layer of the neural network on the key attack units of the next layer;
103. aggregating, over all of the input normal samples, the sample-level key attack units of each layer, layer by layer, to obtain the model-level key attack path of the neural network.
In the embodiment of the present invention, a data set D containing N data samples is first established. The data samples are normal, non-adversarial samples. As shown in Fig. 2, three normal samples are selected: an image x1 of category A, an image x2 of category B, and an image x3 of category C. These three normal samples are input to the neural network respectively, and steps 101 to 103 are executed on the forward pass of each normal sample through the network.
Step 101 specifically includes:
1011. computing the gradient of the loss function with respect to each neuron of the last layer;
1012. selecting the first k neurons whose gradients meet a preset criterion as the key attack units of the last layer.
In the embodiment of the present invention, taking the image x1 of category A as an example, the key attack units of x1 are found as follows: after x1 is input into the neural network, it first passes through the first layer of the network; each neuron of the first layer produces an output from x1, which is sent to the second layer, and so on. When the output of the penultimate layer reaches the last layer, the gradient of the loss function of the neural network with respect to each neuron of the last layer is first computed.
The expression of the gradient is:
g_L^m = ∂J/∂Z_L^m (1)
where J represents the loss function of the neural network, Z_L^m represents the output of the m-th neuron of the last layer L of the neural network, and g_L^m represents the gradient of the loss function with respect to neuron m.
Then, the key attack units of the last layer are selected from the neurons of the last layer according to their gradients.
The key attack units of the last layer are given by:
top-k(F_L, g_L) (2)
where g_L represents the gradients of all neurons in the last layer L of the neural network, F_L represents the last layer of the neural network, and top-k(·) represents picking k elements from the set according to a preset criterion.
In 1012, the absolute gradient values of the neurons in the last layer are sorted from large to small, and the neurons whose absolute gradient values rank in the top k are selected as the key attack units of the last layer, thereby obtaining the k key attack units of the last layer.
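As an illustration of 1011 and 1012, the following minimal Python (PyTorch) sketch computes the gradient of the loss with respect to the last-layer outputs and selects the top-k neurons by absolute gradient value. The network architecture, the sample, the label and the value of k are hypothetical, and the last-layer output Z_L is taken here to be the logits of the final linear layer.

```python
# Minimal sketch (assumed setup): pick the k last-layer neurons whose loss
# gradient has the largest absolute value, per formulas (1) and (2).
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 10))              # hypothetical network
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 32)                                # one normal sample
y = torch.tensor([3])                                 # its label
k = 5                                                 # hypothetical k

# Forward pass, keeping every intermediate output so its gradient is retained.
z = x
layer_outputs = []
for layer in model:
    z = layer(z)
    z.retain_grad()                                   # keep dLoss/dZ for this layer
    layer_outputs.append(z)

loss = loss_fn(layer_outputs[-1], y)
loss.backward()

g_last = layer_outputs[-1].grad.squeeze(0)            # formula (1): dLoss/dZ_L^m for each m
key_units_last = torch.topk(g_last.abs(), k).indices  # formula (2): top-k by |gradient|, per 1012
print("last-layer key attack units:", key_units_last.tolist())
```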
Step 102 specifically includes:
for the i-th neuron of layer l-1 of the neural network:
1021. calculating the influence of the i-th neuron of layer l-1 on a certain element in the output of a certain key attack unit of layer l;
in the embodiment of the invention, after the key attack unit of the last layer is obtained, the key attack unit of the penultimate layer is obtained according to the key attack unit of the last layer. Firstly, the influence of each neuron of the penultimate layer on each output of each key attack unit of the last layer is required to be obtained:
the expression of the influence of the ith neuron of the l-1 layer on a certain element in the output of a certain key attack unit of the l layer is as follows:
in the formula (3), the first and second groups,
representing the output of the ith neuron at layer l-1
Output to the l-th layer key attack unit j
Influence of the middle z element.
1022. adding up the influence of the i-th neuron of layer l-1 on each element in the output of that key attack unit of layer l to obtain the influence of the i-th neuron of layer l-1 on that key attack unit of layer l;
The influences obtained in 1021 are then summed to obtain the influence of each neuron of the penultimate layer on each key attack unit of the last layer:
the expression of the influence of the ith neuron of the l-1 layer on a certain key attack unit of the l layer is as follows:
in the formula (4), the first and second groups,
represents the output of the ith neuron of the l-1 layer to the key attack unit j of the l layer
Influence of all elements in (1).
1023. selecting, from the influences of the neurons of layer l-1 on the key attack units of layer l, the first k neurons whose influence meets a preset criterion as the key attack units of layer l-1.
The influences obtained in 1022 are sorted from large to small, and the neurons whose influence ranks in the top k are selected as the key attack units of the penultimate layer:
the expression of the key attack unit of the l-1 layer is as follows:
in equation (5), top-k (-) indicates that k elements are selected from the set according to a preset standard, F
lThe l-th layer of the neural network is represented,
the influence of the ith neuron on each key attack unit of the l-1 layer is represented by the following expression:
in the formula (6), Fl jRepresents the j attack unit of the l layer of the neural network.
In this way, the key attack units of the penultimate layer of the neural network corresponding to the normal sample x1 are obtained, and so on, until the key attack units of the first layer corresponding to the normal sample x1 are found.
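As an illustration of 1021 to 1023, the following Python (PyTorch) sketch selects the key attack units of layer l-1 from their influence on the key attack units of layer l. Since the concrete influence expression of formula (3) is not reproduced above, the sketch assumes gradient-times-activation as one plausible influence measure; the layer, its size, the previously found key attack units of layer l and the value of k are likewise hypothetical. For the fully connected layer used here each unit's output has a single element, so the summation over elements in formula (4) is trivial; for a convolutional layer the unit would be a feature-map channel and the summation would run over its spatial elements.

```python
# Minimal sketch (assumed influence measure): select the layer-(l-1) key attack
# units from their influence on the layer-l key attack units, per 1021-1023.
import torch
import torch.nn as nn

torch.manual_seed(0)
layer_l = nn.Linear(64, 10)                    # hypothetical layer l
z_prev = torch.randn(64, requires_grad=True)   # Z_{l-1}: outputs of the neurons of layer l-1
key_units_l = [1, 4, 7]                        # key attack units already found for layer l
k = 5                                          # hypothetical k

z_l = layer_l(z_prev)                          # outputs of layer l
influence = torch.zeros(64)                    # s_{l-1}^i: one influence score per neuron i
for j in key_units_l:
    # Assumed form of formula (3): |d z_l[j] / d z_prev[i]| * |z_prev[i]|
    grad_j, = torch.autograd.grad(z_l[j], z_prev, retain_graph=True)
    # Formulas (4) and (6): sum the per-element influences over the key units of layer l
    influence += (grad_j * z_prev).abs().detach()

key_units_prev = torch.topk(influence, k).indices   # formula (5): top-k neurons by influence
print("layer-(l-1) key attack units:", key_units_prev.tolist())
```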
Steps 101 to 102 are then repeated for the normal samples x2 and x3 respectively, so as to obtain the key attack units of each layer of the neural network corresponding to x2 and x3.
Step 103 specifically includes:
1031. for each normal sample, aggregating the sample-level key attack units of each layer, layer by layer, to obtain the path of that normal sample;
in the embodiment of the present invention, as shown in fig. 2, the path of the normal sample x1 is:
the path of normal sample x2 is:
the path of normal sample x3 is:
1032. counting, for each layer of the neural network, the frequency with which each sample-level key attack unit appears in the paths of the normal samples;
After the sample paths {R(x1), R(x2), ..., R(xN)} are obtained, the frequency with which each key attack unit of each layer appears in the paths of the normal samples is counted layer by layer, and the units are sorted from large to small frequency.
1033. taking, in each layer of the neural network, the k sample-level key attack units whose frequencies meet a preset condition as the key attack path of that layer;
After the sorting, the key attack units whose frequencies rank in the top k in each layer are selected as the key attack path of that layer:
Ω_l = top-k(F_l, f_l) (7)
In formula (7), f_l denotes the frequency with which each key attack unit of layer l appears among the key attack units corresponding to all of the normal samples.
1034. aggregating the key attack paths of the layers of the neural network, layer by layer, to form the key attack path of the neural network.
As shown in Fig. 2, after the model-level key attack units of each layer (that is, the key attack path of each layer) are obtained, the key attack path of the entire model is:
R(D) = {Ω_1, Ω_2, ..., Ω_L} (8)
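The aggregation of 1031 to 1034 can be sketched in plain Python as follows; the sample-level key attack units are hard-coded placeholder values standing in for the results of steps 101 and 102, and the values of k and of the layer count are hypothetical.

```python
# Minimal sketch of 1031-1034: aggregate sample-level key attack units into the
# model-level key attack path by per-layer frequency counting, per formulas (7)-(8).
from collections import Counter

k = 2                                   # hypothetical k
num_layers = 3
# Hypothetical sample-level results: one list of key attack units per layer and sample.
sample_paths = {
    "x1": [[0, 3], [5, 7], [1, 4]],
    "x2": [[0, 2], [5, 6], [1, 9]],
    "x3": [[3, 2], [7, 6], [4, 9]],
}

model_path = []                         # R(D) = {Omega_1, ..., Omega_L}
for l in range(num_layers):
    freq = Counter()                    # f_l: appearance frequency of each unit in layer l
    for units_per_layer in sample_paths.values():
        freq.update(units_per_layer[l])
    omega_l = [unit for unit, _ in freq.most_common(k)]   # formula (7): top-k by frequency
    model_path.append(omega_l)

print("model-level key attack path R(D):", model_path)
```

Here Counter.most_common(k) plays the role of top-k(F_l, f_l) in formula (7).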
The determination method provided by the embodiment of the invention introduces the concept of a key attack path. This path plays the main role in noise propagation in a neural network; in other words, noise is mainly propagated and amplified through this path and thereby interferes with the final classification. In addition, this path is also the main propagation path of valid semantic information. Therefore, the embodiment of the invention provides a technical basis for improving the robustness of the model by studying this path.
To verify the important role that the key attack path plays in the noise propagation process, a data set of paired samples {(x_i, x'_i)} is given, where x_i is an original sample and x'_i is the adversarial sample corresponding to that original sample. The intermediate features of part of the units, namely the unit features on the key attack path obtained when the adversarial sample is input into the model, are replaced with the corresponding intermediate features obtained when the original sample is input into the model. The output results after the replacement illustrate the dominant role played by the key attack path in noise propagation.
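A sketch of this verification in Python (PyTorch) is given below. The network, the key attack units of the hooked layer and the "adversarial" sample (here simply the clean sample plus random noise, standing in for a real adversarial example) are all hypothetical; the replacement is implemented with a forward hook that overwrites the selected units of one hidden layer with the features recorded for the original sample.

```python
# Minimal sketch: replace the intermediate features of the key attack units of
# one hidden layer, computed for the adversarial input, with the features
# computed for the original (clean) input, then compare the two outputs.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))  # hypothetical network
hidden = model[0]                     # layer whose key attack units are replaced
key_units = [3, 10, 27]               # hypothetical key attack units of this layer

x_clean = torch.randn(1, 32)                      # original sample x_i
x_adv = x_clean + 0.1 * torch.randn(1, 32)        # stand-in for the adversarial sample x'_i

# 1. Record the clean-sample features of the hooked layer.
clean_feat = {}
handle = hidden.register_forward_hook(lambda m, inp, out: clean_feat.update(z=out.detach()))
_ = model(x_clean)
handle.remove()

# 2. Run the adversarial sample, overwriting the key units with the clean features.
def replace_key_units(module, inp, out):
    out = out.clone()
    out[:, key_units] = clean_feat["z"][:, key_units]
    return out                                    # returned tensor replaces the layer output

handle = hidden.register_forward_hook(replace_key_units)
out_replaced = model(x_adv)
handle.remove()

out_adv = model(x_adv)                            # adversarial output without replacement
print("prediction (adversarial):        ", out_adv.argmax(dim=1).item())
print("prediction (key units replaced): ", out_replaced.argmax(dim=1).item())
```

If the key attack path indeed dominates noise propagation, the prediction after the replacement should move toward the prediction obtained for the original sample.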
In order to implement the above method for determining a key attack path in a neural network, the present invention further provides an apparatus for determining a key attack path in a neural network. As shown in Fig. 3, the determining apparatus includes a processor 32 and a memory 31, and may further include a communication component, a sensor component, a power component, a multimedia component, and an input/output interface according to actual needs. The memory, the communication component, the sensor component, the power component, the multimedia component, and the input/output interface are all connected to the processor 32. As mentioned above, the memory 31 in the apparatus may be a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic memory, a flash memory, or the like, and the processor 32 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), an Application-Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP) chip, or the like. The other communication components, sensor components, power components, multimedia components, and so on may be implemented using common components found in existing smart devices and are not specifically described herein.
In the apparatus for determining a key attack path in a neural network, the processor 32 reads the computer program in the memory 31 to perform the following operations:
inputting each normal sample into the neural network and, after each layer of the neural network has produced its output:
finding the key attack units of the last layer from among the neurons of the last layer of the neural network according to the loss function of the neural network;
obtaining the key attack units of the previous layer based on the influence of each neuron in the previous layer of the neural network on the key attack units of the next layer;
and aggregating, over all of the normal samples, the key attack units of each layer, layer by layer, to obtain the model-level key attack path of the neural network.
The method and apparatus for determining a key attack path in a neural network provided by the embodiments of the present invention have been described in detail above. Any obvious modification made thereto by those skilled in the art without departing from the spirit of the present invention constitutes an infringement of the patent right of the present invention and shall bear the corresponding legal responsibility.