Abstract
Modern deep learning systems are data-hungry. Learning from web data is a feasible solution, but it inevitably introduces label noise, which can degrade the performance of deep neural networks. Sample selection is an effective way to deal with label noise; the key is to separate clean samples according to some criterion. Previous methods focus on the small-loss criterion, which regards small-loss samples as clean. However, this strategy depends on the learning dynamics of each data instance, and some noisy samples are still memorized because their corrupted learning patterns occur frequently. To tackle this problem, a training-free surrogate model is preferable, since it is free from the effect of memorization. In this work, we propose to leverage the vision-language model CLIP as such a surrogate to filter noisy samples automatically. Through its ability to align text and images, CLIP brings external knowledge that facilitates the selection of clean samples. Furthermore, a margin adaptive loss is designed to regularize the selection bias introduced by CLIP, providing robustness to label noise. We validate the effectiveness of the proposed method on both real-world and synthetic noisy datasets. Our method achieves significant improvement, with CLIP not involved during the inference stage.
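The selection idea described above can be illustrated with a minimal sketch: a sample is kept as clean when CLIP's zero-shot prediction (the class prompt with the highest cosine similarity to the image) agrees with the given, possibly noisy, label. This is an assumption-laden toy, not the paper's implementation: it presumes image and class-prompt features have already been extracted with a CLIP encoder, and the function name `select_clean` is illustrative.

```python
import numpy as np

def select_clean(image_feats, text_feats, noisy_labels):
    """Mark a sample clean when the nearest class prompt (by cosine
    similarity) matches its given, possibly noisy, label."""
    # L2-normalise so that the dot product equals cosine similarity
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = img @ txt.T                    # (N, C) image-to-prompt similarities
    zero_shot_pred = sims.argmax(axis=1)  # CLIP's zero-shot label guess
    return zero_shot_pred == np.asarray(noisy_labels)

# Toy example: 3 samples, 2 classes; the last sample carries a wrong label.
image_feats = np.array([[1.0, 0.1], [0.1, 1.0], [1.0, 0.0]])
text_feats = np.array([[1.0, 0.0], [0.0, 1.0]])  # prompts for classes 0 and 1
noisy_labels = [0, 1, 1]                          # third label is corrupted
mask = select_clean(image_feats, text_feats, noisy_labels)
print(mask)  # → [ True  True False]
```

Because the surrogate is never trained on the noisy dataset, this criterion cannot memorize corrupted patterns; the margin adaptive loss then counteracts any class-wise bias that CLIP's zero-shot predictions introduce.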
Data availability
The datasets analyzed during the current study are available at https://www.image-net.org/, https://www.cs.toronto.edu/~kriz/cifar.html, http://noisylabels.com/, https://google.github.io/controlled-noisy-web-labels/ and https://data.vision.ee.ethz.ch/cvl/webvision/dataset2017.html. No new datasets were generated.
Acknowledgements
This work was supported by the National Science and Technology Major Project (2022ZD0117802) and partially by the Fundamental Research Funds for the Central Universities (Grant No. 226-2024-00058).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Giorgos Tolias.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liang, C., Zhu, L., Shi, H. et al. Combating Label Noise with a General Surrogate Model for Sample Selection. Int J Comput Vis 133, 3166–3179 (2025). https://doi.org/10.1007/s11263-024-02324-z