Abstract
The rapid development of AI-based facial manipulation techniques has made manipulated facial images highly deceptive. These techniques can be misused maliciously, posing a severe threat to information security. Many effective detection methods have been developed to determine whether an image has been manipulated. However, malicious manipulated facial images and videos are often widely disseminated, and cause harm, before they are detected. Protecting images from manipulation through proactive defense has therefore become a focus of current research. Existing proactive defense methods disrupt the manipulation process through adversarial attacks on the facial manipulation network, which distort or blur parts of the manipulated facial image. Nevertheless, these methods provide only slight disruption against some facial manipulation methods, and their outputs not only stigmatize the portrait but also leave viewers unable to distinguish real images from fake ones. To overcome these issues, we propose Multi-Teacher Universal Distillation based on information hiding for defense against facial manipulation. First, we propose an information-hiding-based adversarial attack network against facial manipulation, called IHA-Net. IHA-Net hides a warning image in the protected image without affecting its visual quality; after manipulation, the facial information disappears and the warning message is revealed, preventing both privacy leakage and stigmatization. Then, to address the problem that a protected image cannot defend against multiple facial manipulations simultaneously, we propose the Multi-Teacher Universal Distillation framework, in which multiple trained teacher networks jointly direct the learning of a student network, enabling the student network to defend against multiple manipulation networks at once.
Specifically, we design Multi-scale Discriminators for knowledge distillation at the feature-map level, enabling the student network to learn richer knowledge from the teacher networks. Furthermore, to balance the influence of the multiple teacher networks on the student network, we design a Dynamic Balancing Loss module that adjusts the teachers' weights dynamically during training. Finally, extensive experiments on advanced facial manipulation systems demonstrate that the proposed method outperforms state-of-the-art approaches.
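The multi-teacher balancing idea above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration of a dynamically weighted multi-teacher distillation loss: it re-derives per-teacher weights at every step from the current per-teacher losses, so teachers the student currently matches worst receive more weight. The function name, the L2 distillation term, and the softmax weighting scheme are all assumptions for illustration, not the paper's actual Dynamic Balancing Loss.

```python
import numpy as np

def dynamic_balancing_loss(student_out, teacher_outs, temperature=1.0):
    """Illustrative dynamically weighted multi-teacher distillation loss.

    Each teacher contributes an L2 distillation term; weights are recomputed
    on every call as a softmax over the current per-teacher losses, so the
    teacher the student matches worst gets the largest weight. Hypothetical
    sketch only; not the paper's implementation.
    """
    # Per-teacher L2 distillation losses against the student's output.
    per_teacher = np.array(
        [np.mean((student_out - t) ** 2) for t in teacher_outs]
    )
    # Softmax over the losses (numerically stabilized): larger current loss
    # -> larger weight, dynamically rebalancing the teachers' influence.
    logits = per_teacher / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Total loss is the weighted sum of the per-teacher terms.
    return float(np.dot(weights, per_teacher)), weights
```

In a real training loop the weights would be treated as constants (detached from the gradient) so that they only rescale, rather than reshape, each teacher's distillation term.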
Data Availability
All experiments are performed on publicly available datasets: CelebA (http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html), LFW (http://vis-www.cs.umass.edu/lfw/). No new data were created or analysed for this study.
Acknowledgements
This work was supported in part by the National Key R&D Program of China (No. 2021ZD0112100), the National Natural Science Foundation of China (Nos. 62336001, 62120106009), and the Beijing Natural Science Foundation (No. 4222014).
Communicated by Sergio Escalera.
Cite this article
Li, X., Ni, R., Zhao, Y. et al. Multi-teacher Universal Distillation Based on Information Hiding for Defense Against Facial Manipulation. Int J Comput Vis 132, 5293–5307 (2024). https://doi.org/10.1007/s11263-024-02050-6