
Combating Label Noise with a General Surrogate Model for Sample Selection

International Journal of Computer Vision

Abstract

Modern deep learning systems are data-hungry. Learning from web data is a feasible solution, but it inevitably introduces label noise, which can hinder the performance of deep neural networks. Sample selection is an effective way to deal with label noise; the key is to separate clean samples from noisy ones according to some criterion. Previous methods mostly adopt the small-loss criterion, which treats small-loss samples as clean. However, this strategy depends on the learning dynamics of each data instance, and some noisy samples are still memorized because their corrupted patterns occur frequently. To tackle this problem, a training-free surrogate model is preferable, as it is free from the effects of memorization. In this work, we propose to leverage the vision-language model CLIP as such a surrogate to filter noisy samples automatically. Through its text-image alignment ability, CLIP brings external knowledge that facilitates the selection of clean samples. Furthermore, a margin adaptive loss is designed to regularize the selection bias introduced by CLIP, providing robustness to label noise. We validate the effectiveness of the proposed method on both real-world and synthetic noisy datasets. Our method achieves significant improvements, and CLIP is not involved at inference time.
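
To make the selection idea concrete, the sketch below shows one way a frozen CLIP model can serve as a training-free filter: a sample is flagged as clean when CLIP's zero-shot prediction agrees with its (possibly noisy) dataset label. This is a minimal illustration only; the class names, prompt template, and top-1 agreement rule are assumptions, and the paper's actual selection criterion and margin adaptive loss are not reproduced here.

```python
# A minimal sketch (not the authors' exact pipeline): a frozen CLIP model
# as a training-free surrogate for separating clean from noisy samples.
import torch
import clip  # https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

class_names = ["cat", "dog", "bird"]  # placeholder label set
prompts = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

with torch.no_grad():
    # Pre-compute one normalized text embedding per class.
    text_feat = model.encode_text(prompts)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

@torch.no_grad()
def is_probably_clean(image_path: str, given_label: int) -> bool:
    """Treat a sample as clean when CLIP's zero-shot prediction
    agrees with its (possibly noisy) dataset label."""
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    image_feat = model.encode_image(image)
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    sims = (image_feat @ text_feat.T).squeeze(0)  # cosine similarity per class
    return int(sims.argmax()) == given_label
```

In practice one would batch the images, and a softer rule (e.g., thresholding the similarity margin between the given class and the best competing class) could replace strict top-1 agreement; since the surrogate never trains on the noisy labels, its decisions are unaffected by memorization.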


Data availability

The datasets analyzed during the current study are available at https://www.image-net.org/, https://www.cs.toronto.edu/~kriz/cifar.html, http://noisylabels.com/, https://google.github.io/controlled-noisy-web-labels/, and https://data.vision.ee.ethz.ch/cvl/webvision/dataset2017.html. No new datasets were generated.


Acknowledgements

This work is supported by the National Science and Technology Major Project (2022ZD0117802) and partially supported by the Fundamental Research Funds for the Central Universities (Grant No. 226-2024-00058).

Author information

Corresponding author

Correspondence to Linchao Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Giorgos Tolias.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liang, C., Zhu, L., Shi, H. et al. Combating Label Noise with a General Surrogate Model for Sample Selection. Int J Comput Vis 133, 3166–3179 (2025). https://doi.org/10.1007/s11263-024-02324-z
