Abstract
Open-world visual recognition aims to empower models to identify objects in real-world settings, particularly when they encounter domains or categories that are not included in the training dataset. This paper proposes a specific open-world visual recognition task, i.e. Pattern-Expandable Image Copy Detection (PE-ICD). In realistic scenarios, the continuous emergence of novel tampering patterns necessitates fast upgrades to the ICD system to prevent confusion in already-trained models. Therefore, our PE-ICD focuses on two aspects, i.e., rehearsal-free upgrade and backward-compatible deployment: (1) The rehearsal-free upgrade utilizes only the new patterns to save time, as re-training on the old patterns can be very time-consuming. (2) The backward-compatible deployment allows for comparing the updated query features against the outdated gallery features, thereby avoiding the need to re-extract features for the very large gallery. To lay the foundation for PE-ICD research, we construct the first regulated pattern set, CrossPattern, and propose Pattern Stripping (P-Strip). CrossPattern regulates both base and novel patterns during the initial training and subsequent upgrades. Given a query, our P-Strip separates the tamper patterns by decomposing the query into an image feature and multiple pattern features. The advantage of P-Strip is that we can easily introduce new pattern features with minimal impact on the image feature and previously seen pattern features. Experimental results show that P-Strip supports both rehearsal-free upgrading and backward compatibility. Our code is publicly available at https://github.com/WangWenhao0716/PEICD.
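The backward-compatibility idea in the abstract can be illustrated with a minimal sketch. This is not the authors' P-Strip model; it only mimics the interface described above, with hypothetical dimensions and a toy extractor: each query is decomposed into one image feature plus per-pattern features, an upgrade appends novel-pattern features without touching the image feature, and matching against an outdated gallery uses only the shared image feature.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IMG, D_PAT = 128, 32   # hypothetical feature dimensions
N_BASE, N_NOVEL = 4, 2   # base patterns at initial training, novel patterns at upgrade


def extract(x, n_patterns):
    """Toy stand-in for a P-Strip-style extractor (names hypothetical):
    splits a raw vector into one image feature and one feature per pattern."""
    img = x[:D_IMG]
    pats = [x[D_IMG + i * D_PAT: D_IMG + (i + 1) * D_PAT] for i in range(n_patterns)]
    return img, pats


# The (outdated) gallery was extracted when only the base patterns existed.
x = rng.standard_normal(D_IMG + (N_BASE + N_NOVEL) * D_PAT)
gallery_img, gallery_pats = extract(x[:D_IMG + N_BASE * D_PAT], N_BASE)

# After a rehearsal-free upgrade, the query side also emits novel-pattern
# features, but the image feature remains comparable to the old gallery.
query_img, query_pats = extract(x, N_BASE + N_NOVEL)

# Backward-compatible matching: compare the shared image features only,
# so the large gallery never needs feature re-extraction.
score = query_img @ gallery_img / (
    np.linalg.norm(query_img) * np.linalg.norm(gallery_img)
)
```

Because the upgrade only appends pattern features, the image feature is bit-identical across model versions in this sketch, so the query-to-gallery cosine score is unaffected by the upgrade.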
Additional information
Communicated by Zhun Zhong.
Demonstration of the Base and Novel Patterns
Tables 5, 6, and 7 display the names, detailed elaborations, and demonstrations of the base and novel patterns. Although we use four samples for illustration, in our PE-ICD the query images have no overlap with the training images, a basic requirement for image retrieval tasks.
About this article
Cite this article
Wang, W., Sun, Y. & Yang, Y. Pattern-Expandable Image Copy Detection. Int J Comput Vis 132, 5618–5634 (2024). https://doi.org/10.1007/s11263-024-02140-5