Abstract
Data-free knowledge distillation transfers knowledge by recovering training data from a pre-trained model. Despite the recent success of methods that seek global data diversity, the diversity within each class and the similarity among different classes are largely overlooked, resulting in data homogeneity and limited performance. In this paper, we introduce Relation-Guided Adversarial Learning (RGAL), a novel method with triplet losses that tackles the homogeneity problem from two aspects: promoting both intra-class diversity and inter-class confusion of the generated samples. To this end, our method alternates between two phases, an image synthesis phase and a student training phase. In the image synthesis phase, an optimization process pushes apart samples with the same labels and pulls together samples with different labels, yielding intra-class diversity and inter-class confusion, respectively. In the student training phase, we perform the opposite optimization, which adversarially attempts to reduce the distance between samples of the same class and enlarge the distance between samples of different classes. To mitigate the conflict between seeking high global diversity and maintaining inter-class confusion, we propose a focal weighted sampling strategy that selects the negatives in the triplets unevenly within a finite range of distances. RGAL shows significant improvement over previous state-of-the-art methods in accuracy and data efficiency. Moreover, RGAL can be plugged into state-of-the-art methods for various data-free knowledge transfer applications. Experiments on various benchmarks demonstrate the effectiveness and generalizability of our method on multiple tasks, including data-free knowledge distillation, data-free quantization, and non-exemplar incremental learning. Our code will be made publicly available to the community.
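The two adversarial phases described above can be sketched with a pair of opposing triplet objectives. The following is a minimal, self-contained illustration (not the paper's implementation): the synthesis-phase loss inverts a standard triplet loss, pushing same-class embeddings apart and pulling different-class embeddings together, while the student-phase loss is the ordinary triplet objective. Function names, the batch-wise hard-mining choice, and the margin value are all illustrative assumptions; the focal weighted negative sampling is omitted for brevity.

```python
import numpy as np

def pairwise_sq_dist(x):
    # Squared Euclidean distances between all rows of x.
    sq = np.sum(x ** 2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * x @ x.T

def synthesis_phase_loss(emb, labels, margin=1.0):
    # Inverted triplet loss (illustrative): treat same-class samples as the
    # ones to push away and different-class samples as the ones to pull close,
    # encouraging intra-class diversity and inter-class confusion.
    d = pairwise_sq_dist(emb)
    n, losses = len(labels), []
    for a in range(n):
        same = [j for j in range(n) if labels[j] == labels[a] and j != a]
        diff = [j for j in range(n) if labels[j] != labels[a]]
        if not same or not diff:
            continue
        d_same = min(d[a, j] for j in same)  # closest same-class sample
        d_diff = max(d[a, j] for j in diff)  # farthest different-class sample
        losses.append(max(0.0, margin + d_diff - d_same))
    return float(np.mean(losses)) if losses else 0.0

def student_phase_loss(emb, labels, margin=1.0):
    # The adversarial opposite: an ordinary triplet loss that pulls
    # same-class samples together and pushes different classes apart.
    d = pairwise_sq_dist(emb)
    n, losses = len(labels), []
    for a in range(n):
        same = [j for j in range(n) if labels[j] == labels[a] and j != a]
        diff = [j for j in range(n) if labels[j] != labels[a]]
        if not same or not diff:
            continue
        losses.append(max(0.0, margin + max(d[a, j] for j in same)
                                      - min(d[a, j] for j in diff)))
    return float(np.mean(losses)) if losses else 0.0
```

On a batch whose classes are already well separated (tight clusters, large inter-class gaps), the student-phase loss is near zero while the synthesis-phase loss is large, which is exactly the tension that drives the generator toward more diverse, class-confusing samples.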
Data Availability.
Data sharing does not apply to this article, as no datasets were generated or analyzed during the current study.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (62331006, 62171038, 61931008, and 62088101), and the Fundamental Research Funds for the Central Universities.
Additional information
Communicated by Liwei Wang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liang, Y., Fu, Y. Relation-Guided Adversarial Learning for Data-Free Knowledge Transfer. Int J Comput Vis 133, 2868–2885 (2025). https://doi.org/10.1007/s11263-024-02303-4