
Defending Against Adversarial Examples Via Modeling Adversarial Noise

  • Published:
International Journal of Computer Vision

Abstract

Adversarial examples have become a major threat to the reliable application of deep learning models, and this threat has in turn driven the development of adversarial defenses. Adversarial noise contains well-generalizing yet misleading features that can maliciously flip predicted labels. Motivated by this, we study modeling adversarial noise to defend against adversarial examples by learning the transition relationship between adversarial labels (i.e., the flipped labels caused by adversarial noise) and natural labels (i.e., the ground-truth labels of natural samples). In this work, we propose an adversarial defense method from the perspective of modeling adversarial noise. Specifically, we construct an instance-dependent label transition matrix that represents this label transition relationship and thus explicitly models adversarial noise. The label transition matrix is produced from the input sample by a label transition network. By exploiting the label transition matrix, we can infer the natural label from the adversarial label and thereby correct wrong predictions misled by adversarial noise. In addition, to enhance the robustness of the label transition network, we design an adversarial robustness constraint at the transition-matrix level. Experimental results demonstrate that our method effectively improves robust accuracy against multiple attacks and performs well at detecting adversarial input samples.
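As a rough illustration of the mechanism described in the abstract, the following PyTorch sketch shows how an instance-dependent transition matrix produced by a small transition network could be used to map an adversarial prediction back to a natural-label posterior. This is a minimal sketch under assumed conventions: the network architecture, the names (`TransitionNet`, `corrected_posterior`), and the use of a linear solve to invert the transition relationship are illustrative assumptions, not the authors' implementation, and the paper's training objective and robustness constraint are not reproduced here.

```python
# Minimal sketch (assumptions, not the paper's code): a label transition
# network outputs an instance-dependent K x K row-stochastic matrix T(x)
# with T_ij ~ P(adversarial label = j | natural label = i); the natural-label
# posterior is then recovered from the classifier's adversarial posterior.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TransitionNet(nn.Module):
    """Maps an input image to a row-stochastic K x K label transition matrix."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.num_classes = num_classes
        self.backbone = nn.Sequential(          # small illustrative feature extractor
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Linear(32 * 4 * 4, num_classes * num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.head(self.backbone(x)).view(-1, self.num_classes, self.num_classes)
        return F.softmax(t, dim=-1)             # each row sums to 1


def corrected_posterior(classifier: nn.Module,
                        transition_net: TransitionNet,
                        x_adv: torch.Tensor) -> torch.Tensor:
    """Infer the natural-label posterior from a (possibly attacked) input.

    Under the convention p_adv = p_nat @ T(x), the natural-label posterior is
    recovered, up to renormalization, by solving the linear system rather than
    forming an explicit matrix inverse.
    """
    p_adv = F.softmax(classifier(x_adv), dim=-1)       # (B, K) adversarial posterior
    T = transition_net(x_adv)                          # (B, K, K) transition matrices
    # Solve T^T p_nat = p_adv batch-wise (assumes T(x) is well-conditioned).
    p_nat = torch.linalg.solve(T.transpose(1, 2), p_adv.unsqueeze(-1)).squeeze(-1)
    return F.normalize(p_nat.clamp(min=0), p=1, dim=-1)  # renormalize to a distribution
```

In this sketch the corrected prediction would simply be `corrected_posterior(...).argmax(dim=-1)`; how the transition network is trained and constrained is described in the paper itself.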


Data Availability

The datasets used in our paper are available at: CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz, Tiny-ImageNet: https://www.kaggle.com/c/tiny-imagenet/data, Mini-ImageNet: https://www.kaggle.com/datasets/arjunashok33/miniimagenet.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants U22A2096 and 62036007, in part by the Scientific and Technological Innovation Teams in Shaanxi Province, in part by the Shaanxi Province Core Technology Research and Development Project under Grant 2024QY2-GJHX-11, and in part by the Fundamental Research Funds for the Central Universities under Grant QTZX23042.

Author information


Corresponding author

Correspondence to Nannan Wang.

Additional information

Communicated by Gunhee Kim.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhou, D., Wang, N., Han, B. et al. Defending Against Adversarial Examples Via Modeling Adversarial Noise. Int J Comput Vis 133, 5920–5937 (2025). https://doi.org/10.1007/s11263-025-02467-7


  • Received:

  • Accepted:

  • Published:

  • Version of record:

  • Issue date:

  • DOI: https://doi.org/10.1007/s11263-025-02467-7

Keywords