CN111598805A

CN111598805A - Adversarial example defense method and system based on VAE-GAN

Info

Publication number: CN111598805A
Application number: CN202010402772.3A
Authority: CN
Inventors: 何永庆; 王海卫; 王荣耀; 王珂
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2020-05-13
Filing date: 2020-05-13
Publication date: 2020-08-28

Abstract

The invention belongs to the technical field of adversarial example defense and discloses a VAE-GAN-based adversarial example defense method and system. A variational autoencoder (VAE) and a generative adversarial network (GAN) are used to denoise adversarial examples: the VAE serves as a preprocessing model for the classifier and denoises the adversarial examples, while the GAN assists the training of the VAE so that the images output by the VAE are closer to the original noise-free images. The method is a form of input preprocessing and can be transferred between different classification models; it does not require retraining the original classification network, so the training cost is low; it hardly affects classification accuracy on the original noise-free samples; it does not require adversarial examples, so no additional training on adversarial examples is needed; it also defends well against adversarial examples with small perturbations; and preprocessing is fast, with output image quality close to that of the original noise-free image.

Description

An adversarial example defense method and system based on VAE-GAN

Technical field

The invention belongs to the technical field of adversarial example defense, and in particular relates to a VAE-GAN-based adversarial example defense method and system.

Background art

At present, deep neural networks perform well on many problems that traditional machine learning struggles to solve. As deep neural network models continue to improve, more and more deep learning solutions are entering daily life, for example image recognition, face recognition, autonomous driving, and voice command recognition. Despite this performance, Szegedy et al. showed that modern deep neural networks are highly vulnerable to adversarial examples: adding a subtle perturbation, imperceptible to human vision, to an original image is enough to make a deep neural network model misclassify it (as shown in Figure 5). Adversarial attack methods against deep neural networks keep multiplying, and the perturbation an attack requires keeps shrinking. Traditional approaches, namely image denoising and reducing the degree of overfitting of the network, can no longer defend against these attacks. In addition, current defense schemes suffer from high training cost and poor transferability.

Existing defense schemes fall into three directions: input preprocessing, improving the neural network model, and only detecting whether an input is an adversarial example without processing it. The current defense schemes are as follows:

(1) Input preprocessing: compressing and reconstructing the image, scaling the image, reducing the image resolution, and denoising the image.

(2) Improving the neural network model: limiting the output of neurons, adding non-differentiable parts to the model, reducing overfitting of the network, and adding adversarial examples to the training set to improve the robustness of the model.

(3) Detection only, without processing: using an SVM to distinguish whether the input data is an adversarial example, or using a capsule network to do so.

However, among the current defense schemes, input preprocessing degrades the quality of the input image and lowers the classification accuracy on original noise-free images. Moreover, most such schemes defend well only against adversarial examples with large perturbations; the smaller the perturbation, the worse the defense.

Improving the neural network model lowers the model's classification accuracy to some extent, but its greater drawback is that the network model must be retrained. The defense also works only for the current neural network model and cannot be transferred to other network models, and as adversarial attacks are upgraded the defense strategy must be upgraded too (otherwise it cannot defend against the updated attacks), which gives this kind of defense an extremely high network training cost.

Detection only, without processing: adversarial examples themselves still cannot be recognized (classified), and input data affected only by slight random noise is sometimes misidentified.

At present, most defense strategies must be trained with adversarial examples, which incurs additional training cost.

In summary, the problems of the prior art are:

(1) Traditional image denoising and reduction of overfitting in deep neural networks can no longer defend against adversarial example attacks, and current defense schemes suffer from high training cost and poor transferability.

(2) Among the current defense schemes, input preprocessing degrades the quality of the input image and lowers the classification accuracy on original noise-free images. Most such schemes defend well only against adversarial examples with large perturbations; the smaller the perturbation, the worse the defense.

(3) Methods that improve the neural network model require retraining the network model. They work only for the current neural network model and cannot be transferred to other network models, and as adversarial attacks are upgraded the defense strategy must be upgraded too (otherwise it cannot defend against the updated attacks), which gives this kind of defense an extremely high network training cost.

(4) Detection only, without processing: adversarial examples themselves still cannot be recognized (classified), and input data affected only by slight random noise is sometimes misidentified. At present, most defense strategies must be trained with adversarial examples, which makes training too costly.

The difficulty of solving the above technical problems: because algorithms for generating adversarial examples are continuously improving,

(1) The cost of generating adversarial examples keeps falling, making it easier to generate adversarial examples for attacks.

(2) The attack capability of adversarial examples keeps growing.

(3) Attacks are also shifting from purely white-box attacks to black-box attacks.

The significance of solving the above technical problems: many deep learning solutions, such as face recognition, autonomous driving, and video detection, have already entered daily life, and the existence of adversarial examples poses a serious risk to their use. In a face recognition system, for example, criminals can use adversarial examples to impersonate other people, break into the internal systems of governments or companies, and steal confidential information. In autonomous driving, if an adversarial example is used to cover a real road sign, the vehicle's autonomous driving system will be unable to make the correct decision about that sign, which can cause serious traffic accidents. Defense against adversarial example attacks must therefore be considered when designing deep learning solutions with strict security requirements. Because adversarial examples greatly restrict the use of deep learning solutions, research into effective defenses against them has great practical significance.

Summary of the invention

In view of the problems of the prior art, the present invention provides a VAE-GAN-based adversarial example defense method. It aims to solve the problems of existing adversarial example defense schemes: high defense training cost, poor generality, and the possibility that the defense lowers the classification accuracy of the original classifier.

The present invention is implemented as a VAE-GAN-based adversarial example defense method that uses a variational autoencoder (VAE) and a generative adversarial network (GAN) to denoise adversarial examples. The VAE serves as a preprocessing model for the classifier and denoises the adversarial examples, while the GAN assists the training of the VAE so that the images output by the VAE are closer to the original noise-free images.

Further, the VAE-GAN-based adversarial example defense method includes the following steps:

Step 1: select a suitable network structure and construct the VAE-GAN training network.

Step 2: train the VAE-GAN network.

Step 3: use the VAE-GAN module as a preprocessing module of the classifier to defend against adversarial example attacks.

A fused VAE-GAN model combining a VAE and a GAN is used to defend against adversarial example attacks; the model is shown in Figure 6. Step 2 trains a VAE-GAN denoising model dedicated to adversarial examples, which serves as the preprocessing block of the classifier. Every sample input to the classifier must first be denoised by the VAE-GAN model and only then fed into the classifier model for classification, so that the classifier can ultimately classify adversarial examples correctly.

Further, in step 1, the network structure for the VAE-GAN training network is selected as follows:

A suitable network structure is selected to construct the VAE-GAN training network. The neural network structures chosen for the VAE and the GAN differ according to the classifier being defended: for small datasets the VAE and GAN use a DNN structure, while for large datasets they require a deeper neural network or a convolutional neural network.

Further, in step 1, the VAE-GAN comprises two parts: the VAE part and the GAN part. The main function of the VAE is to denoise the input image; the GAN then drives the distribution of images reconstructed by the VAE as close as possible to the distribution of original images, so that image quality after noise removal is close to that of the original noise-free image.

The VAE can in turn be divided into an Encoder part and a Decoder part. The Encoder maps each input image sample to two n-dimensional vectors (a mean vector and a standard deviation vector). The Decoder adds noise to these two n-dimensional vectors and restores them to the original image sample. The present invention uses the GAN to assist in training the Decoder of the variational autoencoder, thereby improving the quality of the images generated by the VAE. The GAN comprises two parts: the generator converts a one-dimensional vector z into an image, and the discriminator identifies whether an input image is a real image or an image produced by the generator. In the decoding/generating network, the decoding network of the VAE and the generating network of the GAN share the same network parameters; for convenience, this shared network is referred to below as the decoding network.
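The text does not spell out how noise is added to the mean and standard deviation vectors. A minimal sketch, assuming the usual VAE reparameterization z = μ + σ·ε with ε drawn from a standard normal distribution, is shown below; the function name `sample_latent` is hypothetical and plain Python lists stand in for tensors.

```python
import random


def sample_latent(mu, sigma, rng=random):
    """Draw z = mu + sigma * eps element-wise, with eps ~ N(0, 1).

    Assumption: this is the standard reparameterization trick; the source
    only says that noise is added to the two n-dimensional vectors.
    """
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same length")
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


# With sigma = 0 no noise is injected, so the sample equals the mean vector.
z_deterministic = sample_latent([0.5, -1.0, 2.0], [0.0, 0.0, 0.0])

# With sigma > 0 each call yields a different latent vector around mu.
z_random = sample_latent([0.5, -1.0, 2.0], [0.1, 0.1, 0.1], random.Random(0))
```

Sampling in this form keeps the path from the encoder outputs to z differentiable, which is why it is the common choice when the encoder is trained by gradient descent.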

Further, in step 2, the VAE-GAN network is trained as follows:

After the VAE-GAN network has been constructed, the optimization objective of the whole network must first be determined. The VAE-GAN defense model contains three networks, namely the encoding network, the decoding network, and the discriminator network, and the three networks have different optimization objectives.

The overall optimization objective function of the model is defined as follows:

where μ and σ denote the mean and standard deviation of the posterior distribution of the latent variable z, x denotes the original image (not necessarily the input image), the reconstructed-image term denotes the output of the decoding network, D denotes the discriminator network, G denotes the generation network (decoding network), and γ is an introduced hyperparameter whose value is generally 0.2.

The corresponding overall loss function of the model is:

After the objective function has been defined, the model can be trained directly with stochastic gradient descent (SGD) or another optimization algorithm.
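The formulas themselves are not reproduced in this text. As a reading aid only, the block below sketches one form the three per-network losses could take under the standard VAE-GAN construction. The notation $\hat{x} = G(z)$ for the decoder output, the squared-error reconstruction term, the standard normal prior, and the placement of γ on the reconstruction term are all assumptions, not the patent's own formulas.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{enc}} &= D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu,\sigma^{2})\,\middle\|\,\mathcal{N}(0,I)\right) + \lVert x-\hat{x}\rVert_{2}^{2},\\
\mathcal{L}_{\mathrm{dec}} &= \gamma\,\lVert x-\hat{x}\rVert_{2}^{2} - \log D\!\left(G(z)\right),\\
\mathcal{L}_{\mathrm{dis}} &= -\log D(x) - \log\!\left(1 - D\!\left(G(z)\right)\right),
\qquad \hat{x}=G(z),\quad z=\mu+\sigma\odot\varepsilon,\ \varepsilon\sim\mathcal{N}(0,I).
\end{aligned}
```

Under these assumptions x is the clean image even when the encoder input is a perturbed image, which matches the remark that x is not necessarily the input image.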

Another object of the present invention is to provide a VAE-GAN-based adversarial example defense control system that implements the VAE-GAN-based adversarial example defense method.

Another object of the present invention is to provide a computer program product stored on a computer-readable medium, comprising a computer-readable program that, when executed on an electronic device, provides a user input interface to implement the VAE-GAN-based adversarial example defense method.

Another object of the present invention is to provide a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the VAE-GAN-based adversarial example defense method.

In summary, the advantages and positive effects of the present invention are as follows. The VAE-GAN-based adversarial example defense method is a form of input preprocessing and can be transferred between different classification models; it does not require retraining the original classification network, so the training cost is low; it hardly affects classification accuracy on the original noise-free samples; it does not require adversarial examples, so no additional training on adversarial examples is needed; it also defends well against adversarial examples with small perturbations; and preprocessing is fast, with output image quality close to that of the original noise-free image. Experiments show that the classification accuracy of ordinary samples denoised by the VAE-GAN model drops by only 1% to 4%, indicating that the defense model hardly affects the classifier's recognition of ordinary samples. Under both black-box and white-box conditions, the classification accuracy of adversarial examples denoised by the VAE-GAN model is 2% to 24% lower than that of their original samples, which shows that the present invention can indeed defend against adversarial example attacks while having little effect on the classification accuracy of the classifier.

Description of the drawings

FIG. 1 is a flowchart of the VAE-GAN-based adversarial example defense method provided by an embodiment of the present invention.

FIG. 2 is a structural diagram of the VAE-GAN provided by an embodiment of the present invention.

FIG. 3 is a schematic diagram of the VAE used as a preprocessing module of the classifier to defend against adversarial example attacks, provided by an embodiment of the present invention.

FIG. 4 shows the denoising effect of the VAE after 209 training rounds, provided by an embodiment of the present invention.

FIG. 5 is a schematic diagram of an adversarial attack on a deep neural network, provided by an embodiment of the present invention.

FIG. 6 is a diagram of the fused VAE-GAN model, combining a VAE and a GAN, used to defend against adversarial example attacks, provided by an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.

Traditional image denoising and reduction of overfitting in deep neural networks can no longer defend against adversarial example attacks, and current defense schemes suffer from high training cost and poor transferability.

Among the current defense schemes, input preprocessing degrades the quality of the input image and lowers the classification accuracy on original noise-free images. Most such schemes defend well only against adversarial examples with large perturbations; the smaller the perturbation, the worse the defense.

Methods that improve the neural network model require retraining the network model. They work only for the current neural network model and cannot be transferred to other network models, and as adversarial attacks are upgraded the defense strategy must be upgraded too (otherwise it cannot defend against the updated attacks), which gives this kind of defense an extremely high network training cost.

Detection only, without processing: adversarial examples themselves still cannot be recognized (classified), and input data affected only by slight random noise is sometimes misidentified. At present, most defense strategies must be trained with adversarial examples, which incurs additional training cost.

In view of the problems of the prior art, the present invention provides a VAE-GAN-based adversarial example defense method and system, which are described in detail below with reference to the accompanying drawings.

The VAE-GAN-based adversarial example defense method provided by the embodiment of the present invention uses a variational autoencoder (VAE) and a generative adversarial network (GAN) to denoise adversarial examples. The VAE-GAN module serves as a preprocessing model for the classifier and denoises the adversarial examples, while the GAN assists the training of the VAE so that the images output by the VAE are closer to the original noise-free images.

As shown in FIG. 1, the VAE-GAN-based adversarial example defense method provided by the embodiment of the present invention includes the following steps:

S101: select a suitable network structure and construct the VAE-GAN training network.

S102: train the VAE-GAN network.

S103: use the trained VAE-GAN model as a preprocessing module of the classifier to defend against adversarial example attacks.

In step S103, a fused VAE-GAN model combining a VAE and a GAN is used to defend against adversarial example attacks; the model is shown in Figure 6. Step S102 trains a VAE-GAN denoising model dedicated to adversarial examples, which serves as the preprocessing block of the classifier. Every sample input to the classifier must first be denoised by the VAE-GAN model and only then fed into the classifier model for classification, so that the classifier can ultimately classify adversarial examples correctly.
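The deployment described in step S103 amounts to composing two functions: denoise first, then classify, with the classifier itself left unchanged. The sketch below illustrates only that composition. The names `make_defended_classifier`, `toy_denoise`, and `toy_classify` are hypothetical, and the thresholding "denoiser" is a toy stand-in for a trained VAE-GAN, not the patent's model.

```python
def make_defended_classifier(denoise, classify):
    """Wrap an unchanged classifier so every input is denoised first."""
    def defended(x):
        return classify(denoise(x))
    return defended


def toy_denoise(x):
    # Toy stand-in for the trained VAE-GAN: snap each pixel to 0.0 or 1.0.
    return [1.0 if v >= 0.5 else 0.0 for v in x]


def toy_classify(x):
    # Toy fragile classifier: class 1 only if total intensity reaches 2.0.
    return 1 if sum(x) >= 2.0 else 0


defended = make_defended_classifier(toy_denoise, toy_classify)

x_clean = [1.0, 1.0, 0.0]
x_adv = [0.95, 0.95, 0.05]        # small perturbation of x_clean

raw_prediction = toy_classify(x_adv)        # fooled by the perturbation
defended_prediction = defended(x_adv)       # perturbation removed first
```

Because the wrapper only needs a callable classifier, the same denoiser can be placed in front of a different classifier without retraining either one, which is the transferability property the text claims for input preprocessing.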

The present invention is further described below with reference to specific analysis.

Adversarial examples: adversarial examples, proposed by Christian Szegedy et al., are input samples formed by deliberately adding subtle perturbations to samples from a dataset, causing the model to give an incorrect output with high confidence.
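To make the definition concrete, the sketch below perturbs an input to a toy linear classifier by a small step against the sign of each weight, which is the idea behind the well-known fast gradient sign method. The source does not name this attack; the classifier, the weights, and the helper names `predict` and `sign_perturb` are illustrative assumptions.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict(w, b, x):
    """Toy linear classifier: class 1 if w.x + b > 0, otherwise class 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0


def sign_perturb(w, x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score."""
    return [xi - epsilon * sign(wi) for wi, xi in zip(w, x)]


w = [1.0, -1.0, 1.0, -1.0]
b = 0.0
x = [0.6, 0.5, 0.6, 0.5]             # score is about +0.2, so class 1
x_adv = sign_perturb(w, x, 0.1)      # each feature moves by only 0.1

clean_label = predict(w, b, x)
adv_label = predict(w, b, x_adv)     # score is about -0.2, so class 0
```

Each feature changes by only 0.1, yet the changes all push the score the same way, so their effect accumulates across features and flips the decision.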

Generative adversarial network (GAN): a generative adversarial network is a deep learning model and one of the most promising recent approaches to unsupervised learning on complex distributions. The model produces good outputs through adversarial game learning between (at least) two modules in the framework: the generative model and the discriminative model.

Variational autoencoder (VAE): a variational autoencoder consists of a pair of interconnected neural networks; the network on the input side is the encoder and the network on the output side is the decoder. The encoder converts the input data into two n-dimensional vectors, the mean vector μ and the standard deviation vector σ. The decoder restores these two n-dimensional vectors to the original input data.

The present invention is further described below with reference to specific embodiments.

Existing adversarial example defense schemes have three main problems: high defense training cost, poor generality, and the possibility that the defense lowers the classification accuracy of the original classifier. To address these problems, the present invention proposes a defense scheme that uses a variational autoencoder (VAE) and a generative adversarial network (GAN) to denoise adversarial examples. The VAE serves as a preprocessing model for the classifier and denoises the adversarial examples, while the GAN assists the training of the VAE so that the images output by the VAE are closer to the original noise-free images. Because the classifiers to be defended differ in performance and in dataset, the neural network structures used by the VAE and GAN also differ (a suitable structure is selected according to the needs of the classifier in order to reduce training cost). The defense scheme has three steps: select a suitable network structure and construct the VAE-GAN training network; train the VAE-GAN network; and use the VAE module as a preprocessing module of the classifier to defend against adversarial example attacks.

(1) Select a suitable network structure and construct the VAE-GAN training network. The neural network structures chosen for the VAE and GAN differ according to the classifier: for small datasets the VAE and GAN can use a DNN structure, while for large datasets they require a deeper neural network or a convolutional neural network. The structure of the constructed VAE-GAN is shown in Figure 2.

The VAE-GAN shown in Figure 2 comprises two parts: the VAE part and the GAN part. The main function of the VAE is to denoise the input image; the GAN then drives the distribution of images reconstructed by the VAE as close as possible to the distribution of original images, so that image quality after noise removal is close to that of the original noise-free image. The VAE can in turn be divided into an Encoder part and a Decoder part. The Encoder maps each input image sample to two n-dimensional vectors (a mean vector and a standard deviation vector). The Decoder adds noise to these two n-dimensional vectors and restores them to the original image sample. Images generated by a traditional VAE are relatively blurry, so the quality of an image denoised by a VAE alone is far below that of the original image. The present invention uses the GAN to assist in training the Decoder of the variational autoencoder, thereby improving the quality of the images generated by the VAE. The GAN comprises two parts: the generator converts a one-dimensional vector z into an image, and the discriminator identifies whether an input image is a real image or an image produced by the generator.

(2) Train the VAE-GAN model. After the VAE-GAN network has been constructed, the optimization objectives of the network must first be determined. The VAE-GAN in the present invention has three optimization objectives, one for each of the three networks in the VAE-GAN defense model. The optimization objective functions are defined as follows.

The three networks contained in the VAE-GAN defense model proposed by the present invention (the encoder network, the decoder network, and the discriminator network) have different optimization objectives.

For the encoder, because the application scenario has changed, an input image may be either an original image or a tampered image containing attack information (noise). Its optimization objective takes the same form as that of an ordinary VAE encoder, but the meaning of each variable changes considerably. The encoder optimization objective defined by the present invention is as follows:

where μ and σ denote the mean and standard deviation of the posterior distribution of the latent variable z, x denotes the original image (not necessarily the input image), and the reconstructed-image term denotes the output of the decoder. The corresponding encoder loss function can be expressed as:
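The encoder loss formula is not reproduced in this text. A minimal sketch, assuming the ordinary VAE form (the KL divergence between a diagonal Gaussian posterior and a standard normal prior, plus a squared-error reconstruction term measured against the clean image x), is given below. The function names and the choice of squared error are assumptions.

```python
import math


def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian; sigma must be > 0."""
    return 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))


def squared_error(x, x_rec):
    """Sum of squared pixel differences between two images given as flat lists."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec))


def encoder_loss(mu, sigma, x_clean, x_rec):
    """Assumed encoder loss: KL term plus reconstruction error.

    x_clean is the original image, which need not be the encoder's input
    (the input may be the perturbed version of x_clean).
    """
    return kl_to_standard_normal(mu, sigma) + squared_error(x_clean, x_rec)


# A posterior equal to the prior has zero KL divergence, so only the
# reconstruction error contributes: (1.0 - 0.5)^2 = 0.25.
example_loss = encoder_loss([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.5, 0.0])
```

Measuring reconstruction against the clean image rather than the encoder input is what turns the autoencoder into a denoiser under these assumptions.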

For the decoder, the optimization objective consists of two parts: the VAE reconstruction loss and the loss of the GAN generator (decoder). The decoder optimization objective defined by the present invention is as follows:

where D denotes the discriminator and G denotes the generator (decoder). At the start of training, the reconstruction error of the decoder network is far larger than the GAN generator loss, which would prevent the decoder network from learning the distribution features provided by the GAN. A hyperparameter γ is therefore introduced to balance the two parts of the loss. With the hyperparameter, the decoder optimization objective is as follows:

The hyperparameter γ is used only when updating the decoder network, and generally takes the value 0.2. The corresponding decoder network loss function can be expressed as:
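The decoder loss formula is not reproduced in this text. The sketch below assumes that γ scales the reconstruction term (so that it does not swamp the adversarial term early in training) and that the adversarial term is the common non-saturating generator loss −log D(G(z)). Both choices, and the function name `decoder_loss`, are assumptions rather than the patent's formula.

```python
import math


def decoder_loss(recon_error, d_fake, gamma=0.2, eps=1e-12):
    """Assumed decoder loss: gamma * reconstruction error - log D(G(z)).

    recon_error: reconstruction error between the clean image and the decoder output.
    d_fake:      discriminator output in [0, 1] for the decoder's image.
    gamma:       weighting hyperparameter; the text gives 0.2 as the usual value.
    eps:         lower clamp that keeps the logarithm finite.
    """
    adversarial = -math.log(max(d_fake, eps))
    return gamma * recon_error + adversarial


# If the discriminator is fully fooled (d_fake = 1) the adversarial term
# vanishes and only the down-weighted reconstruction error remains.
fooled = decoder_loss(recon_error=10.0, d_fake=1.0)

# If the discriminator is undecided (d_fake = 0.5) the adversarial term is ln 2.
undecided = decoder_loss(recon_error=0.0, d_fake=0.5)
```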

For the discriminator network, the optimization objective is the same as that of an ordinary GAN discriminator network:

The corresponding loss function of the discriminator network can be expressed as:
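The discriminator loss formula is not reproduced in this text. Since the text states that the objective matches an ordinary GAN discriminator, a minimal sketch of the standard binary cross-entropy form, −[log D(x) + log(1 − D(G(z)))], is given below; the function name and the clamping constant are assumptions.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Standard GAN discriminator loss for one real and one generated image.

    d_real: discriminator output in [0, 1] for a real image x.
    d_fake: discriminator output in [0, 1] for a generated image G(z).
    eps:    lower clamp that keeps the logarithms finite.
    """
    real_term = math.log(max(d_real, eps))
    fake_term = math.log(max(1.0 - d_fake, eps))
    return -(real_term + fake_term)


perfect = discriminator_loss(d_real=1.0, d_fake=0.0)     # no loss
undecided = discriminator_loss(d_real=0.5, d_fake=0.5)   # 2 * ln 2
```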

The overall optimization objective of the model is:

After the objective function is defined, the model can be trained directly with gradient descent (SGD) or another optimization algorithm. The training procedure is as follows:

The procedure comprises the following steps: select a batch of image samples x_r from the training set; randomly add adversarial perturbations to the image samples x_r to obtain x^*; compute the mean μ and variance σ of the posterior distribution of the latent variable z with the encoder network; sample the latent variable z from this posterior distribution; restore the latent variable z to a noise-free image with the decoder network; compute the loss function of the VAE encoder network; compute the loss function of the decoder network; compute the loss function of the discriminator network; and update the parameters of the three networks in order.
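The forward part of one training step can be sketched as below. This is a simplified illustration, not the patented procedure: a bounded random shift stands in for a real adversarial attack, the encoder and decoder are passed in as plain callables, and the loss computation and parameter updates are left out. All names are hypothetical.

```python
import math
import random

def add_random_perturbation(x, eps, rng):
    """Stand-in for an attack: shift each pixel by at most eps, clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.uniform(-eps, eps))) for p in x]

def sample_latent(mu, var, rng):
    """Reparameterized sample z = mu + sqrt(var) * noise, with noise ~ N(0, 1)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]

def training_step(x_r, encoder, decoder, eps, rng):
    x_star = add_random_perturbation(x_r, eps, rng)  # perturbed input x^*
    mu, var = encoder(x_star)                        # posterior parameters of z
    z = sample_latent(mu, var, rng)                  # sample from the posterior
    x_hat = decoder(z)                               # denoised reconstruction
    # The three loss functions would be evaluated here (comparing x_hat with the
    # clean x_r), followed by the encoder, decoder, and discriminator updates.
    return x_star, z, x_hat
```

Writing the sample as mu plus scaled noise keeps the sampling step differentiable with respect to the encoder outputs, which is what allows gradient descent to train the encoder.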

(3) After the VAE-GAN model has been trained, the VAE-GAN module can be used to defend against adversarial samples. The defense deployment is shown in Figure 3.
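Because the defense is an input preprocessing step, deployment amounts to composing the trained denoiser with an unchanged classifier. The sketch below illustrates this with hypothetical callables; `denoise` stands for the trained VAE (encoder followed by decoder) and `classify` for any existing classifier.

```python
def make_defended_classifier(denoise, classify):
    """Wrap a classifier so that every input is denoised before classification."""
    def defended(image):
        return classify(denoise(image))
    return defended

# Toy stand-ins, for illustration only.
toy_denoise = lambda image: [round(p) for p in image]  # snap pixels to 0 or 1
toy_classify = lambda image: sum(image)                # "class" = number of lit pixels
defended = make_defended_classifier(toy_denoise, toy_classify)
```

The classifier itself is never retrained or modified, so the same wrapper can be placed in front of a different classifier.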

The present invention is further described below with reference to specific experiments.

The experimental hardware environment is an Intel Xeon E5-2678 v3 processor, 64 GB of memory, and an NVIDIA RTX 2080 GPU with 8 GB of video memory. The software environment is the Ubuntu 16.04 operating system, the Python 3 programming language, the PyCharm development environment, and the TensorFlow 1.15.1 machine learning framework. This embodiment uses the MNIST dataset. To verify the effectiveness of the VAE-GAN-based defense strategy against adversarial samples under white-box attacks, two classifiers, MNIST_A and MNIST_B, are trained on the MNIST dataset.

As shown in Table 1-1, MNIST_A has six layers in total. The first three layers are convolutional layers with 64, 128, and 128 convolution kernels respectively; the kernel size is [value missing in the source text]. The first two convolutional layers use a stride of 2 and the third uses a stride of 1; all three use the ReLU activation function. The fourth layer is a flattening layer that converts the multi-dimensional input into a one-dimensional output. The fifth layer is a fully connected layer with 100 neurons and the ReLU activation function. The last layer is a Softmax layer that outputs the classification result.

Table 1-1 Network structure of the MNIST dataset classifier model MNIST_A

Table 1-2 Network structure of the MNIST dataset classifier model MNIST_B

As shown in Table 1-2, MNIST_B has 10 layers in total. The first and second layers are convolutional layers with 64 convolution kernels each; the kernel size is 3x3, the stride is 1, and the padding mode is SAME. The third layer is a max pooling layer with a 2x2 pooling kernel. The fourth and fifth layers are convolutional layers with 128 convolution kernels each; the kernel size is 3x3, the stride is 1, and the padding mode is VALID. The seventh layer is a max pooling layer; its pooling kernel size is [value missing in the source text]. The eighth and ninth layers are fully connected layers with 100 neurons each, both using the ReLU activation function. The last layer is a Softmax layer that outputs the classification result.
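To make the effect of the SAME and VALID padding modes concrete, the sketch below traces the spatial size of a 28x28 MNIST image through the first five layers of MNIST_B. It uses the standard output-size rules for these two padding modes. The pooling stride of 2 and VALID padding for the pooling layer are assumptions, since the text states only the 2x2 pooling kernel; the function name is illustrative.

```python
import math

def output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer along one axis."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return (size - kernel) // stride + 1
    raise ValueError("unknown padding mode: " + padding)

# (kernel, stride, padding) for layers 1 to 5 of MNIST_B as described in the text.
layers = [
    (3, 1, "SAME"),   # layer 1: conv, 64 kernels
    (3, 1, "SAME"),   # layer 2: conv, 64 kernels
    (2, 2, "VALID"),  # layer 3: max pooling (stride 2 is an assumption)
    (3, 1, "VALID"),  # layer 4: conv, 128 kernels
    (3, 1, "VALID"),  # layer 5: conv, 128 kernels
]

sizes = [28]  # MNIST images are 28x28
for kernel, stride, padding in layers:
    sizes.append(output_size(sizes[-1], kernel, stride, padding))
```

Under these assumptions the feature map stays at 28x28 through the two SAME convolutions, halves to 14x14 after pooling, and shrinks to 12x12 and then 10x10 through the two VALID convolutions.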

The classifier training parameters are set as follows: for both MNIST_A and MNIST_B, the learning rate is 0.001 and each batch contains 100 images; MNIST_A is trained for 20 epochs and MNIST_B for 25 epochs.

Next, adversarial samples for attacking the deep learning system are generated. The FGSM and BIM algorithms are used to carry out random targeted attacks and untargeted attacks on the four attack-defense target models, generating 10,000 adversarial samples at each of the L_∞ norm distances 0.03, 0.05, and 0.1 (10,000 for each model at each distance); these serve as the FGSM and BIM adversarial samples. The Deepfool algorithm is used to carry out untargeted attacks on the four target models, generating 10,000 adversarial samples at each of the L₂ norm distances 0.03, 0.05, and 0.1; these serve as the Deepfool adversarial samples. The C&W algorithm is used to carry out random targeted attacks and untargeted attacks on the four target models, generating 10,000 adversarial samples at each of the L_∞ norm distances 0.03, 0.05, and 0.1; these serve as the C&W adversarial samples. The adversarial samples generated above are used to train the defense model of this embodiment. Training on the MNIST dataset yields two defense models, VGMA and VGMB: VGMA is the defense model obtained from adversarial samples generated against the MNIST_A model, and VGMB is the defense model obtained from adversarial samples generated against the MNIST_B model.
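As an illustration of how an FGSM sample at a given L_∞ distance is produced, the sketch below applies the untargeted FGSM update, x* = x + ε·sign(∇ₓJ), to a toy logistic-regression model whose input gradient can be written by hand. The toy model, its weights, and the function names are hypothetical stand-ins; the experiments described above attack the MNIST classifiers instead.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def predict(x, w, b):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, y, w, b, eps):
    """Untargeted FGSM step of size eps against the toy model.

    For cross-entropy loss J, the input gradient is dJ/dx_i = (p - y) * w_i.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    # Move every pixel by eps in the direction that increases the loss,
    # then clip back to the valid pixel range [0, 1].
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, sign)]

x, y, w, b = [0.5, 0.5], 1, [2.0, -1.0], 0.0
x_adv = fgsm(x, y, w, b, eps=0.1)
```

Every pixel moves by exactly ε unless clipping intervenes, so the L_∞ distance between x and the adversarial sample is at most ε. BIM repeats the same step several times with a smaller step size.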

Next, the white-box attack results are obtained. To test and objectively evaluate how well the proposed defense model withstands adversarial sample attacks, this embodiment selects four currently well-performing defense methods as baselines and compares them with the proposed strategy. The four methods are: the adversarial sample defense based on FGSM adversarial training (Adversarial FGSM), the adversarial sample defense based on BIM adversarial training (Adversarial BIM), defensive distillation (Distillation Defend), and the adversarial sample defense based on image compression and reconstruction (ComDefend). The results of each defense model against white-box attacks on the MNIST dataset are shown in Table 1-5 and Table 1-6.

In these tables, each row represents a defense model and each column represents an attack algorithm. Because an attack algorithm can generate adversarial samples at different scales, each data item has three sub-items, giving the classification accuracy at perturbation sizes of 0.03, 0.05, and 0.1 (measured in L_∞ for FGSM and BIM, and in L₂ for Deepfool and C&W).
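The two perturbation measures mentioned above can be computed as follows for images stored as flat lists of pixel values. The function names are illustrative.

```python
import math

def linf_distance(x, x_adv):
    """L-infinity distance: the largest change made to any single pixel."""
    return max(abs(a - b) for a, b in zip(x, x_adv))

def l2_distance(x, x_adv):
    """L2 distance: the Euclidean length of the whole perturbation."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_adv)))
```

The L_∞ measure bounds the change to each pixel individually, while the L₂ measure bounds the total perturbation energy, so the same numerical budget (for example 0.05) describes different perturbations under the two measures.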

Table 1-5 shows the results of defending against white-box attacks on the MNIST_A model. Clean means that the input image is not processed in any way; it is included to compare the effect of each defense model on the original classification performance. The MNIST_A classifier has a classification accuracy of 94%. Every defense strategy lowers this accuracy, but the VGMA and VGMB models of the present invention are less affected and still reach 93%. This shows that the model of the present invention hardly affects classifier performance while providing security. Looking further, for FGSM attacks the three best-performing defenses are, in order, Adversarial FGSM, VGMA, and ComDefend; for BIM attacks they are VGMA, VGMB, and Adversarial BIM; for Deepfool attacks they are VGMA, ComDefend, and VGMB; and for C&W attacks they are VGMB, VGMA, and ComDefend. The VGM models of the present invention therefore defend effectively against all of these attack methods, with a worst-case classification accuracy of 71%. The Adversarial FGSM, Adversarial BIM, and Distillation Defend models all have obvious weaknesses. ComDefend has no obvious weakness, but its performance is slightly inferior to that of the VGMA and VGMB models of the present invention. VGMB was trained using the MNIST_B classifier yet still performs well on MNIST_A, which shows that the model of the present invention has a certain transfer capability.

Table 1-5 Classification accuracy (%) when defending against white-box attacks on the MNIST_A model

Table 1-6 shows the results when the MNIST_B model is used for classification. Compared with MNIST_A, MNIST_B classifies the dataset better, reaching a classification accuracy of 98%. In the absence of attacks, the VGMA and VGMB models of the present invention still reach a classification accuracy of 95%. Similarly, the Adversarial FGSM, Adversarial BIM, Distillation Defend, and ComDefend defense models have obvious weaknesses, whereas the worst-case classification accuracy of the model of the present invention is 75%.

Table 1-6 Classification accuracy (%) when defending against white-box attacks on the MNIST_B model

MNIST_A and MNIST_B are different classifiers tested on the same dataset, and they lead to similar conclusions. The above experiments demonstrate that, across different classifiers, the VAE-GAN-based adversarial sample defense method proposed by the present invention has little effect on the classification of normal samples, defends well against different white-box attack methods, and has no obvious defensive weakness.

After 209 rounds of training, the VAE denoising effect is shown in Figure 4. The figure shows that embedding the adversarial perturbation introduces a certain degree of "noise" into the original image, whereas the image restored by VAE-GAN denoising is quite similar to the original image. This indicates that the VAE-GAN model can restore adversarial samples to their original samples well, which is an important reason why the VAE-GAN model can defend against adversarial samples.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A VAE-GAN-based adversarial sample defense method, characterized in that the method uses a variational autoencoder (VAE) and a generative adversarial network (GAN) to denoise adversarial samples, and uses the GAN to assist the training of the VAE so that the images output by the VAE are close to the original noise-free images.

2. The VAE-GAN-based adversarial sample defense method according to claim 1, characterized in that the method comprises the following steps:

Step 1, select an appropriate network structure to build the VAE-GAN training network;

Step 2, train the VAE-GAN network;

Step 3, use the VAE-GAN module as a classifier preprocessing module to defend against adversarial sample attacks.

3. The VAE-GAN-based adversarial sample defense method according to claim 1, characterized in that, in step 1, the method for selecting an appropriate network structure to build the VAE-GAN training network is as follows:

An appropriate network structure is selected to build the VAE-GAN training network. The neural network structures chosen for the VAE and the GAN differ depending on the classifier. For small datasets, the VAE and the GAN use a DNN structure; for large datasets, the VAE and the GAN need a deeper neural network or a convolutional neural network.

4. The VAE-GAN-based adversarial sample defense method according to claim 1, characterized in that, in step 1, the VAE-GAN comprises two parts, a VAE part and a GAN part; the main function of the VAE is to denoise the input image, and the GAN is then used to make the distribution of images reconstructed by the VAE as close as possible to the distribution of original images;

the VAE is divided into an Encoder part and a Decoder part; the Encoder maps an input image sample to two n-dimensional vectors, a mean vector and a standard deviation vector; the Decoder adds noise to these two n-dimensional vectors and restores them to the original image sample, and the GAN assists in training the Decoder of the variational autoencoder; the GAN consists of two parts: the Generator converts a one-dimensional vector z into an image, and the Discriminator identifies whether an input image is a real image or an image produced by the Generator; the decoding network of the VAE and the generation network of the GAN share network parameters.

5. The VAE-GAN-based adversarial sample defense method according to claim 1, characterized in that, in step 2, the method for training the VAE-GAN network is as follows:

After the VAE-GAN network has been built, the optimization objective of the entire network is first determined. The overall optimization objective function of the model is:

where μ and σ represent the mean and variance of the posterior distribution of the latent variable z, x represents the original image,

represents the output of the decoding network, D represents the discriminative network, G represents the generation network, and γ is the introduced hyperparameter, with a value of 0.2;

The corresponding overall loss function of the model is:

After the objective function is defined, the model is trained with gradient descent or another optimization algorithm.

6. A VAE-GAN-based adversarial sample defense control system implementing the VAE-GAN-based adversarial sample defense method according to any one of claims 1 to 5.

7. A computer program product stored on a computer-readable medium, comprising a computer-readable program which, when executed on an electronic device, provides a user input interface to implement the VAE-GAN-based adversarial sample defense method according to any one of claims 1 to 5.

8. A computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to execute the VAE-GAN-based adversarial sample defense method according to any one of claims 1 to 5.