
Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection

Published in: International Journal of Computer Vision

Abstract

The threats deepfakes pose to society and cybersecurity have provoked significant public apprehension, driving intensified efforts in deepfake video detection. Current video-level methods are mostly based on 3D CNNs, which achieve good performance but incur high computational demands. This paper introduces a simple yet effective strategy named Thumbnail Layout (TALL), which transforms a video clip into a pre-defined layout that preserves both spatial and temporal dependencies. The transformation sequentially masks frames at the same positions within each frame; the frames are then resized into sub-frames and rearranged into the predetermined layout, forming a thumbnail. TALL is model-agnostic and remarkably simple, requiring only minimal code modifications. Furthermore, we introduce a graph reasoning block (GRB) and a semantic consistency (SC) loss to strengthen TALL, culminating in TALL++. The GRB enhances interactions between different semantic regions to capture semantic-level inconsistency clues, while the SC loss imposes consistency constraints on semantic features to improve generalization. Extensive experiments on intra-dataset and cross-dataset deepfake detection, diffusion-generated image detection, and deepfake generation method recognition show that TALL++ achieves results surpassing or comparable to the state of the art, demonstrating the effectiveness of our approaches across deepfake detection problems. The code is available at https://github.com/rainy-xu/TALL4Deepfake.
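To make the layout step concrete, the following is a minimal NumPy sketch of the thumbnail construction described above: T frames are downsampled into sub-frames and tiled into a grid in temporal order, so one 2D image retains both spatial appearance and temporal order. The function name `thumbnail_layout`, the 2x2 grid, and the strided downsampling (a stand-in for proper bilinear resizing) are illustrative assumptions, and the frame-masking augmentation mentioned in the abstract is omitted; see the official repository for the authors' implementation.

```python
import numpy as np

def thumbnail_layout(clip, grid=(2, 2)):
    """Arrange a short clip of frames into a single thumbnail image.

    clip: array of shape (T, H, W, C), with T == grid[0] * grid[1].
    Each frame is downsampled to sub-frame size (here by simple
    striding) and placed into the grid in temporal raster order, so
    spatial and temporal dependencies coexist in one 2D image that
    any image backbone can consume.
    """
    rows, cols = grid
    t, h, w, c = clip.shape
    assert t == rows * cols, "clip length must exactly fill the grid"
    # Downsample every frame by striding (stand-in for bilinear resize).
    sub = clip[:, ::rows, ::cols, :]            # (T, H//rows, W//cols, C)
    sh, sw = sub.shape[1], sub.shape[2]
    thumb = np.zeros((rows * sh, cols * sw, c), dtype=clip.dtype)
    for i in range(t):                          # temporal order -> raster order
        r, q = divmod(i, cols)
        thumb[r * sh:(r + 1) * sh, q * sw:(q + 1) * sw] = sub[i]
    return thumb
```

Because the output is an ordinary image, the approach is model-agnostic: the thumbnail can be fed to any 2D backbone with no architectural changes, which is the source of TALL's low computational cost relative to 3D CNNs.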



Data Availability Statement

All the datasets used in this paper are available online. FaceForensics++ (FF++) (https://github.com/ondyari/FaceForensics), Celeb-DF (CDF) (https://github.com/yuezunli/celeb-deepfakeforensics), DFDC (https://ai.meta.com/datasets/dfdc/), DeeperForensics (DFo) (https://liming-jiang.com/projects/DrF1/DrF1.html), FaceShifter (FSh) (https://github.com/ondyari/FaceForensics), Wild-Deepfake (Wild-DF) (https://github.com/deepfakeinthewild/deepfake-in-the-wild), KoDF (https://deepbrainai-research.github.io/kodf/), and Deepfakes LSUN-Bedroom (DBL) (https://github.com/jonasricker/diffusion-model-deepfake-detection#dataset) can be downloaded from their respective official websites.

Notes

  1. The model is trained on the FaceForensics++ dataset and evaluated on Celeb-DF; the same convention applies throughout the text.


Acknowledgements

The authors would like to thank the reviewers and the associate editor for their valuable comments. The authors also thank Ziming Yang and Gengyun Jia for their help in improving the technical writing of this paper. This work was supported by the National Natural Science Foundation of China (NSFC) under Grants 62376265, 62276256, and U21B2045, the Beijing Nova Program under Grant Z211100002121108, and the Young Elite Scientists Sponsorship Program by CAST.

Author information


Corresponding authors

Correspondence to Jian Liang or Xiao-Yu Zhang.

Additional information

Communicated by Sergio Escalera.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Xu, Y., Liang, J., Sheng, L. et al. Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection. Int J Comput Vis 132, 5663–5680 (2024). https://doi.org/10.1007/s11263-024-02054-2

