Abstract
Adversarial examples demonstrate the vulnerability of white-box models but exhibit weak transferability to black-box models. In image processing, an adversarial example usually consists of an original image and a disturbance. The disturbance is essential to the adversarial example and largely determines the attack success rate on black-box models. To improve transferability, we propose a new white-box attack method called separable positive and negative disturbance (SPND). SPND optimizes the positive and negative perturbations instead of the adversarial examples themselves. SPND also smooths the search space by replacing constrained disturbances with unconstrained variables, which improves the success rate of attacks on black-box models. Our method outperforms other attack methods on the MNIST and CIFAR10 datasets. On the ImageNet dataset, the black-box attack success rate of SPND exceeds that of the best-performing baseline, the CW method, by nearly ten percentage points under a perturbation budget of \(L_\infty = 0.3\).
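To illustrate the general idea described above, the sketch below optimizes two unconstrained tensors and maps each through a sigmoid so that the positive and negative disturbances separately stay within the \(L_\infty\) budget. This is only a minimal PyTorch illustration of the change-of-variables strategy, assuming a sigmoid mapping and a cross-entropy objective; the function name spnd_attack, the optimizer settings, and the loss choice are assumptions for exposition, not the authors' exact formulation.

# Minimal sketch (assumed details, not the paper's exact algorithm):
# optimize unconstrained variables w_pos, w_neg; a sigmoid maps each into
# [0, eps], so the combined disturbance d_pos - d_neg lies in [-eps, eps].
import torch
import torch.nn.functional as F

def spnd_attack(model, x, y, eps=0.3, steps=100, lr=0.1):
    # Unconstrained variables; at initialization sigmoid(0) = 0.5 for both,
    # so the positive and negative parts cancel and the start point is x.
    w_pos = torch.zeros_like(x, requires_grad=True)
    w_neg = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([w_pos, w_neg], lr=lr)

    for _ in range(steps):
        d_pos = eps * torch.sigmoid(w_pos)   # in [0, eps]
        d_neg = eps * torch.sigmoid(w_neg)   # in [0, eps]
        x_adv = torch.clamp(x + d_pos - d_neg, 0.0, 1.0)

        # Untargeted attack: increase the loss of the true label.
        loss = -F.cross_entropy(model(x_adv), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        x_adv = torch.clamp(
            x + eps * torch.sigmoid(w_pos) - eps * torch.sigmoid(w_neg), 0.0, 1.0
        )
    return x_adv

Because the optimization runs over unconstrained variables rather than clipped disturbances, no projection step is needed and the search space is smooth everywhere, which is the property the abstract attributes to SPND.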
Data availability
The datasets generated and analysed during the current study are available at https://pytorch.org/vision/stable/datasets.html.
References
Paszke A, Gross S, Chintala S et al (2017) Automatic differentiation in PyTorch. In: Proceedings of neural information processing systems
Akhtar N, Mian A (2018) Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6:14410–14430
Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In: 2017 IEEE symposium on security and privacy (sp), pp 39–57
Deng J, Dong W, Socher R et al (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, pp 248–255
Dong Y, Liao F, Pang T et al (2018) Boosting adversarial attacks with momentum. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9185–9193
Dong Y, Pang T, Su H et al (2019) Evading defenses to transferable adversarial examples by translation-invariant attacks. In: Proceedings of the IEEE conference on computer vision and pattern recognition
Drenkow N, Fendley N, Burlina P (2022) Attack agnostic detection of adversarial examples via random subspace analysis. In: Proceedings of the IEEE winter conference on applications of computer vision, pp 472–482
Goodfellow IJ, Shlens J, Szegedy C (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572
Hazan T, Papandreou G, Tarlow D (2016) Perturbations, optimization, and statistics. MIT Press, Cambridge
Huang G, Liu Z, Van Der Maaten L et al (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
Krizhevsky A, Hinton G et al (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Li Y, Bai S, Xie C et al (2020) Regional homogeneity: towards learning transferable universal adversarial perturbations against defenses. Lecture notes in computer science, pp 795–813
Madry A, Makelov A, Schmidt L et al (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083
Moosavi-Dezfooli SM, Fawzi A, Fawzi O et al (2017) Universal adversarial perturbations. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1765–1773
Song Y, Shu R, Kushman N et al (2018) Constructing unrestricted adversarial examples with generative models. In: Advances in Neural Information Processing Systems, pp 8312–8323
Su J, Vargas DV, Sakurai K (2019) One pixel attack for fooling deep neural networks. IEEE Trans Evolut Comput 23(5):828–841
Szegedy C, Zaremba W, Sutskever I et al (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199
Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: Proceedings of the international conference on machine learning, PMLR, pp 6105–6114
Tramèr F, Kurakin A, Papernot N et al (2017) Ensemble adversarial training: attacks and defenses. arXiv preprint arXiv:1705.07204
Wang X, He K, Hopcroft JE (2019) AT-GAN: a generative attack model for adversarial transferring on generative adversarial nets. arXiv preprint arXiv:1904.07793
Wang Z, Guo H, Zhang Z et al (2021) Feature importance-aware transferable adversarial attacks. In: Proceedings of the IEEE International Conference on Computer Vision, pp 7639–7648
Wei Z, Chen J, Wu Z et al (2022) Boosting the transferability of video adversarial examples via temporal translation. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 2659–2667
Wu L, Zhu Z, Tai C et al (2018) Understanding and enhancing the transferability of adversarial examples. arXiv preprint arXiv:1802.09707
Wu W, Su Y, Lyu MR et al (2021) Improving the transferability of adversarial samples with adversarial transformations. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9024–9033
Xiao C, Li B, Zhu JY et al (2018) Generating adversarial examples with adversarial networks. arXiv preprint arXiv:1801.02610
Xie C, Zhang Z, Zhou Y et al (2019) Improving transferability of adversarial examples with input diversity. In: Proceedings of the IEEE conference on computer vision and pattern recognition
Zhang X, Zhou X, Lin M et al (2018) Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6848–6856
Zhang Y, Tan Y, Sun H et al (2023) Improving the invisibility of adversarial examples with perceptually adaptive perturbation. Inf Sci 635:126–137
Zheng T, Chen C, Ren K (2019) Distributionally adversarial attack. In: Proceedings of the AAAI conference on artificial intelligence, pp 2253–2260
Funding
This work is supported in part by the National Natural Science Foundation of China under Grant No. 62276127.
Ethics declarations
Conflict of interest
No potential conflict of interest is reported by the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, Y., Bu, Y., Shen, F. et al. Improving the transferability of adversarial examples with separable positive and negative disturbances. Neural Comput & Applic 36, 3725–3736 (2024). https://doi.org/10.1007/s00521-023-09259-5