Abstract
Sequential facial image editing suffers from three problems: discontinuous editing, inconsistent editing, and irreversible editing. Discontinuous editing means that the current edit cannot retain previously edited attributes. Inconsistent editing means that swapping the order of attribute edits does not yield the same result. Irreversible editing means that an operation on a facial image cannot be undone, which is especially problematic in sequential editing. In this work, we put forward three concepts and their corresponding definitions: editing continuity, consistency, and reversibility. Continuity refers to the continuity of attributes, i.e., an attribute can be edited continuously on any face. Consistency requires not only that attributes satisfy continuity but also that facial identity remains consistent. To achieve editing continuity, consistency, and reversibility, we propose a novel model, and we further define a sufficient criterion for determining whether a model is continuous, consistent, and reversible. Extensive qualitative and quantitative experiments validate the proposed model and show that a continuous, consistent, and reversible editing model offers more flexible editing while preserving facial identity. We believe the proposed definitions and model will have wide and promising applications in multimedia processing. Code and data are available at https://github.com/mickoluan/CCR.
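The consistency and reversibility properties can be illustrated abstractly in latent space. The sketch below is a minimal, hypothetical example (not the paper's model): if attribute edits are modeled as moves along fixed semantic directions in a latent space, as in linear latent-editing approaches, then edits commute (order-independence, i.e., consistency) and can be undone by moving back along the same direction (reversibility). All names and values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=512)        # hypothetical latent code of a face
d_smile = rng.normal(size=512)  # hypothetical "smile" direction
d_age = rng.normal(size=512)    # hypothetical "age" direction

def edit(latent, direction, alpha):
    """Apply an attribute edit by moving along a semantic direction."""
    return latent + alpha * direction

# Consistency: swapping the editing order yields the same latent code.
ab = edit(edit(w, d_smile, 1.5), d_age, -0.8)
ba = edit(edit(w, d_age, -0.8), d_smile, 1.5)
assert np.allclose(ab, ba)

# Reversibility: applying the opposite step recovers the original code.
restored = edit(edit(w, d_smile, 1.5), d_smile, -1.5)
assert np.allclose(restored, w)
```

Real editing pipelines are generally not linear, which is precisely why these properties must be designed for rather than assumed.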
Acknowledgements
This work was supported by the National Key Research and Development Program of China under Grant 2020YFB1313400; the Youth Innovation Promotion Association of the Chinese Academy of Sciences under Grant Y202051; the CAS Project for Young Scientists in Basic Research under Grant YSBR-041; the National Natural Science Foundation of China under Grants 42306214, 61821005, and 61991413; the Shandong Province Postdoctoral Innovative Talents Support Program (SDBX2022026); the China Postdoctoral Science Foundation (2023M733533); and the Special Research Assistant Project of the Chinese Academy of Sciences in 2022.
Additional information
Communicated by Arun Mallya.
About this article
Cite this article
Yang, N., Luan, X., Jia, H. et al. CCR: Facial Image Editing with Continuity, Consistency and Reversibility. Int J Comput Vis 132, 1336–1349 (2024). https://doi.org/10.1007/s11263-023-01938-z