Abstract
Data-free knowledge distillation transfers knowledge by recovering training data from a pre-trained model. Despite the recent success of methods that seek global data diversity, the diversity within each class and the similarity among different classes are largely overlooked, resulting in data homogeneity and limited performance. In this paper, we introduce Relation-Guided Adversarial Learning (RGAL), a novel method with triplet losses that tackles the homogeneity problem from two aspects: promoting both intra-class diversity and inter-class confusion of the generated samples. To this end, our method alternates between two phases, an image synthesis phase and a student training phase. In the image synthesis phase, an optimization process pushes apart samples with the same labels and pulls together samples with different labels, yielding intra-class diversity and inter-class confusion, respectively. In the student training phase, we perform the opposite optimization, which adversarially attempts to reduce the distance between samples of the same class and enlarge the distance between samples of different classes. To mitigate the conflict between seeking high global diversity and maintaining inter-class confusion, we propose a focal weighted sampling strategy that selects the negatives in the triplets unevenly within a finite range of distances. RGAL shows significant improvement over previous state-of-the-art methods in accuracy and data efficiency. Moreover, RGAL can be plugged into state-of-the-art methods for various data-free knowledge transfer applications. Experiments on various benchmarks demonstrate the effectiveness and generalizability of our method on multiple tasks, including data-free knowledge distillation, data-free quantization, and non-exemplar incremental learning. Our code will be made publicly available to the community.
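The two adversarial phases described above can be sketched with a pair of opposing triplet objectives. The following is a minimal, self-contained illustration (not the paper's implementation): the synthesis-phase loss inverts a standard triplet loss, pushing same-class embeddings apart and pulling different-class embeddings together, while the student-phase loss is the ordinary triplet objective. Function names, the batch-wise hard-mining choice, and the margin value are all illustrative assumptions; the focal weighted negative sampling is omitted for brevity.

```python
import numpy as np

def pairwise_sq_dist(x):
    # Squared Euclidean distances between all rows of x.
    sq = np.sum(x ** 2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * x @ x.T

def synthesis_phase_loss(emb, labels, margin=1.0):
    # Inverted triplet loss (illustrative): treat same-class samples as the
    # ones to push away and different-class samples as the ones to pull close,
    # encouraging intra-class diversity and inter-class confusion.
    d = pairwise_sq_dist(emb)
    n, losses = len(labels), []
    for a in range(n):
        same = [j for j in range(n) if labels[j] == labels[a] and j != a]
        diff = [j for j in range(n) if labels[j] != labels[a]]
        if not same or not diff:
            continue
        d_same = min(d[a, j] for j in same)  # closest same-class sample
        d_diff = max(d[a, j] for j in diff)  # farthest different-class sample
        losses.append(max(0.0, margin + d_diff - d_same))
    return float(np.mean(losses)) if losses else 0.0

def student_phase_loss(emb, labels, margin=1.0):
    # The adversarial opposite: an ordinary triplet loss that pulls
    # same-class samples together and pushes different classes apart.
    d = pairwise_sq_dist(emb)
    n, losses = len(labels), []
    for a in range(n):
        same = [j for j in range(n) if labels[j] == labels[a] and j != a]
        diff = [j for j in range(n) if labels[j] != labels[a]]
        if not same or not diff:
            continue
        losses.append(max(0.0, margin + max(d[a, j] for j in same)
                                      - min(d[a, j] for j in diff)))
    return float(np.mean(losses)) if losses else 0.0
```

On a batch whose classes are already well separated (tight clusters, large inter-class gaps), the student-phase loss is near zero while the synthesis-phase loss is large, which is exactly the tension that drives the generator toward more diverse, class-confusing samples.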
Data Availability.
Data sharing does not apply to this article, as no datasets were generated or analyzed during the current study.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (62331006, 62171038, 61931008, and 62088101), and the Fundamental Research Funds for the Central Universities.
Additional information
Communicated by Liwei Wang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liang, Y., Fu, Y. Relation-Guided Adversarial Learning for Data-Free Knowledge Transfer. Int J Comput Vis 133, 2868–2885 (2025). https://doi.org/10.1007/s11263-024-02303-4