Regional Adversarial Training for Better Robust Generalization

International Journal of Computer Vision

Abstract

Adversarial training (AT) has been demonstrated to be one of the most promising defense methods against various adversarial attacks. To our knowledge, existing AT-based methods usually train with the locally most adversarial perturbed points and treat all perturbed points equally, which may lead to considerably weaker adversarial robust generalization on test data. In this work, we introduce a new adversarial training framework that considers the diversity as well as the characteristics of the perturbed points in the vicinity of benign samples. To realize the framework, we propose a Regional Adversarial Training (RAT) defense method that first utilizes the attack path generated by the typical iterative attack method of projected gradient descent (PGD) and constructs an adversarial region based on this path. RAT then efficiently samples diverse perturbed training points inside this region and applies a distance-aware label smoothing mechanism to capture our intuition that perturbed points at different locations should have different impacts on model performance. Extensive experiments on several benchmark datasets show that RAT consistently and significantly improves upon standard adversarial training (SAT) and exhibits better robust generalization.
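
To make the framework concrete, the following PyTorch sketch illustrates the three ingredients named above: the PGD attack path, sampling inside the induced region, and distance-aware label smoothing. It is our illustration under stated assumptions (interpolation-based sampling, a hypothetical max_smooth cap, standard PGD hyper-parameters), not the paper's published implementation.

```python
import torch
import torch.nn.functional as F

def rat_loss_sketch(model, x, y, eps=8/255, alpha=2/255, steps=10,
                    num_classes=10, max_smooth=0.3):
    # (1) PGD attack path: keep the perturbation after every ascent step.
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    path = []
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
        path.append(delta.detach().clone())

    # (2) Sample a training point inside the adversarial region: here simply a
    #     random interpolation between the benign input and the final PGD iterate.
    t = torch.rand(x.size(0), 1, 1, 1, device=x.device)
    delta_s = t * path[-1]
    x_s = (x + delta_s).clamp(0.0, 1.0)

    # (3) Distance-aware label smoothing: the farther the sampled point is from
    #     the benign input, the softer its target distribution becomes.
    dist = delta_s.flatten(1).norm(p=float("inf"), dim=1, keepdim=True) / eps  # in [0, 1]
    smooth = max_smooth * dist
    target = F.one_hot(y, num_classes).float()
    target = (1.0 - smooth) * target + smooth / num_classes

    logits = model(x_s)
    return -(target * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

In a training loop, such a loss would take the place of the usual AT objective, e.g. `rat_loss_sketch(model, x, y).backward()` followed by an optimizer step.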

References

  • Athalye, A., Carlini, N., & Wagner, D.A. (2018). Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In: ICML, pp 274–283

  • Buckman, J., Roy, A., Raffel, C., et al. (2018). Thermometer encoding: One hot way to resist adversarial examples. In: ICLR

  • Carlini, N., & Wagner, D. (2017). Towards evaluating the robustness of neural networks. In: IEEE S&P, pp 39–57

  • Dabouei, A., Soleymani, S., Taherkhani, F., et al. (2020). Exploiting joint robustness to adversarial perturbations. In: CVPR, pp 1122–1131

  • Deng, Z., Dong, Y., Pang, T., et al. (2020). Adversarial distributional training for robust deep learning. In: NeurIPS, pp 8270–8283

  • Dong, Y., Fu, Q.A., Yang, X., et al. (2020). Benchmarking adversarial robustness on image classification. In: CVPR, pp 321–331

  • Gilmer, J., Adams, R.P., Goodfellow, I.J., et al. (2018). Motivating the rules of the game for adversarial example research. arXiv:1807.06732

  • Gilmer, J., Ford, N., Carlini, N., et al. (2019). Adversarial examples are a natural consequence of test error in noise. In: ICML, pp 2280–2289

  • Girshick, R. (2015). Fast R-CNN. In: ICCV, pp 1440–1448

  • Goodfellow, I.J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. In: ICLR

  • He, K., Zhang, X., Ren, S., et al. (2016a). Deep residual learning for image recognition. In: CVPR, pp 770–778

  • He, K., Zhang, X., Ren, S., et al. (2016b). Identity mappings in deep residual networks. In: ECCV, pp 630–645

  • Hendrycks, D., & Dietterich, T.G. (2019). Benchmarking neural network robustness to common corruptions and perturbations. In: ICLR

  • Howard, J. (2019). The Imagenette dataset. https://github.com/fastai/imagenette

  • Krizhevsky, A. (2009). Learning multiple layers of features from tiny images. Technical Report

  • Lee, S., Lee, H., & Yoon, S. (2020). Adversarial vertex mixup: Toward better adversarially robust generalization. In: CVPR, pp 272–281

  • Li, Y., Li, L., Wang, L., et al. (2019). NATTACK: learning the distributions of adversarial examples for an improved black-box attack on deep neural networks. In: ICML, pp 3866–3876

  • Lin, T.Y., Maire, M., Belongie, S., et al. (2014). Microsoft COCO: Common objects in context. In: ECCV, pp 740–755

  • Madry, A., Makelov, A., Schmidt, L., et al. (2018). Towards deep learning models resistant to adversarial attacks. In: ICLR

  • Montasser, O., Hanneke, S., & Srebro, N. (2019). VC classes are adversarially robustly learnable, but only improperly. In: COLT, pp 2512–2530

  • Pang, T., Xu, K., Du, C., et al. (2019). Improving adversarial robustness via promoting ensemble diversity. In: ICML, pp 4970–4979

  • Papernot, N., McDaniel, P.D., & Goodfellow, I.J. (2016). Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv:1605.07277

  • Papernot, N., McDaniel, P.D., Goodfellow, I.J., et al. (2017). Practical black-box attacks against machine learning. In: AsiaCCS, pp 506–519

  • Rice, L., Wong, E., & Kolter, J.Z. (2020). Overfitting in adversarially robust deep learning. In: ICML, pp 8093–8104

  • Schmidt, L., Santurkar, S., Tsipras, D., et al. (2018). Adversarially robust generalization requires more data. In: NeurIPS, pp 5019–5031

  • Song, C., He, K., Wang, L., et al. (2019). Improving the generalization of adversarial training with domain adaptation. In: ICLR

  • Szegedy, C., Zaremba, W., Sutskever, I., et al. (2014). Intriguing properties of neural networks. In: ICLR

  • Szegedy, C., Vanhoucke, V., Ioffe, S., et al. (2016). Rethinking the inception architecture for computer vision. In: CVPR, pp 2818–2826

  • Tramèr, F., Kurakin, A., Papernot, N., et al. (2018). Ensemble adversarial training: Attacks and defenses. In: ICLR

  • Wang, Y., Zou, D., Yi, J., et al. (2019). Improving adversarial robustness requires revisiting misclassified examples. In: ICLR

  • Yang, Y., Zhang, G., Xu, Z., et al. (2019). ME-Net: Towards effective adversarial robustness with matrix estimation. In: ICML, pp 7025–7034

  • Yin, D., Lopes, R.G., Shlens, J., et al. (2019a). A fourier perspective on model robustness in computer vision. In: NeurIPS, pp 13,255–13,265

  • Yin, D., Ramchandran, K., & Bartlett, P.L. (2019b). Rademacher complexity for adversarially robust generalization. In: ICML, pp 7085–7094

  • Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. In: BMVC

  • Zhang, H., Yu, Y., Jiao, J., et al. (2019). Theoretically principled trade-off between robustness and accuracy. In: ICML, pp 7472–7482

  • Zhang, J., Xu, X., Han, B., et al. (2020). Attacks which do not kill training make adversarial learning stronger. In: ICML, pp 11,278–11,287


Acknowledgements

This work is supported by the National Natural Science Foundation of China (U22B2017, 62076105) and the International Cooperation Foundation of Hubei Province, China (2021EHB011). Baoyuan Wu is supported by the National Science Foundation of China (62076213) and the Shenzhen Science and Technology Program (GXWD20201231105722002-20200901175001001).

Special thanks to Yichen Yang and Xin Liu for their valuable suggestions that have greatly improved this work.

Author information

Corresponding authors

Correspondence to Baoyuan Wu or Kun He.

Additional information

Communicated by Oliver Zendel.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

We report supplementary empirical results on CIFAR-10 below, including evaluations under other threat models and performance against AutoAttack.

Table 7 Comparison between TRADES and RAT on CIFAR-10 with ResNet-18 under different \(l_p\) norms
Table 8 Test accuracy (%) of RAT and baseline methods against AutoAttack (AA) on CIFAR-10 with ResNet-18

On Other Threat Models. To verify the generality of our method, we compare the robustness of RAT and TRADES against PGD20 under different perturbation norms, using the smaller ResNet-18 network. As shown in Table 7, RAT outperforms the TRADES baseline by a clear margin of 6.5% under the PGD attack with an \(l_\infty \) perturbation bound of 8/255. Under an \(l_2\) perturbation of magnitude 128/255, RAT achieves 2.49% higher accuracy. When combined with TRADES, the model's robustness also improves slightly.
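
For concreteness, the sketch below estimates robust accuracy under PGD20 with either an \(l_\infty \) or an \(l_2\) constraint, matching the two threat models of Table 7. The helper names, the step-size heuristic, and the projection details are our assumptions, not the evaluation code used in the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, norm="linf", eps=8/255, alpha=None, steps=20):
    """PGD-k under an l_inf or l_2 constraint (illustrative, not the paper's code)."""
    if alpha is None:
        alpha = 2.5 * eps / steps  # common step-size heuristic, an assumption here
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            if norm == "linf":
                delta += alpha * grad.sign()
                delta.clamp_(-eps, eps)
            else:  # "l2": normalized gradient step, then project onto the eps-ball
                g = grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
                delta += alpha * g
                d_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
                delta *= (eps / (d_norm + 1e-12)).clamp(max=1.0)
            delta.copy_(torch.clamp(x + delta, 0.0, 1.0) - x)  # keep x + delta in [0, 1]
    return (x + delta).detach()

def robust_accuracy(model, loader, norm, eps, device="cuda"):
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y, norm=norm, eps=eps)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return 100.0 * correct / total

# e.g. robust_accuracy(model, test_loader, norm="linf", eps=8/255)
#      robust_accuracy(model, test_loader, norm="l2",   eps=128/255)
```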

Performance on AutoAttack. We evaluate the performance of our proposed method against AutoAttack (AA), but the results fall short of our expectations. As shown in Table 8, the effectiveness of AA is primarily attributed to APGD\(_{DLR}\), against which RAT defends poorly. A possible explanation is that RAT excessively prioritizes the diversity and characteristics of adversarial examples, thereby overlooking the difference between benign points and the sampled points. Adding a regularization term to the loss function, as demonstrated in Sect. 4.2, can mitigate this issue.
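
For reference, a minimal evaluation sketch against the standard AA suite, using Croce and Hein's publicly available autoattack package, is given below; the perturbation budget and batch size are assumptions matching the common CIFAR-10 \(l_\infty \) setting rather than values taken from the paper.

```python
import torch
from autoattack import AutoAttack  # pip install git+https://github.com/fra31/auto-attack

def eval_autoattack(model, x_test, y_test, eps=8/255, batch_size=128):
    """Robust accuracy (%) under the standard AutoAttack suite, which contains
    the APGD-DLR attack discussed above; a sketch, not the paper's script."""
    model.eval()
    device = next(model.parameters()).device
    adversary = AutoAttack(model, norm="Linf", eps=eps, version="standard")
    x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=batch_size)
    correct = 0
    with torch.no_grad():
        for i in range(0, x_adv.size(0), batch_size):
            preds = model(x_adv[i:i + batch_size].to(device)).argmax(dim=1)
            correct += (preds.cpu() == y_test[i:i + batch_size]).sum().item()
    return 100.0 * correct / x_adv.size(0)
```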

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Song, C., Fan, Y., Zhou, A. et al. Regional Adversarial Training for Better Robust Generalization. Int J Comput Vis 132, 4510–4520 (2024). https://doi.org/10.1007/s11263-024-02103-w
