Multi-source-free Domain Adaptive Object Detection

Zhao, Sicheng; Yao, Huizai; Lin, Chuang; Gao, Yue; Ding, Guiguang

doi:10.1007/s11263-024-02170-z

Published: 11 July 2024

Volume 132, pages 5950–5982 (2024)

International Journal of Computer Vision

Sicheng Zhao^1,na1 (ORCID: orcid.org/0000-0001-5843-6411),
Huizai Yao^1,2,3,na1,
Chuang Lin^4,na1,
Yue Gao^1,5 &
Guiguang Ding^1,5

1700 Accesses
5 Citations

A Correction to this article was published on 08 October 2024

This article has been updated

Abstract

To enhance the transferability of object detection models in real-world scenarios where data is sampled from disparate distributions, considerable attention has been devoted to domain adaptive object detection (DAOD). Researchers have also investigated multi-source DAOD to confront the challenges posed by training samples originating from different source domains. However, existing methods encounter difficulties when source data is unavailable due to privacy preservation policies or transmission cost constraints. To address these issues, we introduce the problem of Multi-source-free Domain Adaptive Object Detection (MSFDAOD), which seeks to perform domain adaptation for object detection using multi-source-pretrained models without any source data or target labels. Specifically, we propose a novel Divide-and-Aggregate Contrastive Adaptation (DACA) framework. First, multiple mean-teacher detection models perform effective knowledge distillation and class-wise contrastive learning within each source domain feature space, denoted as “Divide”. Meanwhile, DACA integrates proposals, obtains unified pseudo-labels, and assigns dynamic weights to student prediction aggregation, denoted as “Aggregate”. The two-step process of “Divide” and “Aggregate” enables our method to efficiently leverage the advantages of multiple source-free models and aggregate their contributions to adaptation in a self-supervised manner. Extensive experiments are conducted on multiple popular benchmark datasets, and the results demonstrate that the proposed DACA framework significantly outperforms state-of-the-art approaches for MSFDAOD tasks.
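
The abstract names mean-teacher distillation and dynamically weighted aggregation of student predictions but does not give their details. As a rough illustration only, and not the authors' implementation, the sketch below shows the standard mean-teacher exponential-moving-average (EMA) update and a plain normalized weighted average of per-source class scores. The function names `ema_update` and `aggregate_scores`, the `momentum` value, and the idea that the weights are supplied as plain numbers are all assumptions made for this example; how DACA actually computes its dynamic weights is not stated here.

```python
from typing import Dict, List


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               momentum: float = 0.99) -> Dict[str, float]:
    """Return teacher parameters after one EMA step toward the student.

    Generic mean-teacher rule: teacher = m * teacher + (1 - m) * student.
    Parameters are plain floats here to keep the sketch dependency-free.
    """
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


def aggregate_scores(per_source_scores: List[List[float]],
                     weights: List[float]) -> List[float]:
    """Weighted average of class scores from several source-specific students.

    `weights` are hypothetical non-negative importance values, one per source
    model; they are normalized to sum to 1 before averaging.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]
    num_classes = len(per_source_scores[0])
    return [
        sum(n * scores[c] for n, scores in zip(normalized, per_source_scores))
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    # One EMA step for a single scalar parameter.
    print(ema_update({"w": 1.0}, {"w": 0.0}, momentum=0.9))
    # Two source-specific students voting on two classes.
    print(aggregate_scores([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0]))
```

In a real detector the parameters would be tensors and the scores would be attached to matched boxes; the scalar version above is meant only to make the two update rules concrete.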

Change history

08 October 2024
A Correction to this paper has been published: https://doi.org/10.1007/s11263-024-02257-7

Acknowledgements

This work is supported by CCF-DiDi GAIA Collaborative Research Funds for Young Scholars and the National Natural Science Foundation of China (Nos. 61925107, 62021002).

Author information

Sicheng Zhao, Huizai Yao and Chuang Lin have contributed equally.

Authors and Affiliations

BNRist, Tsinghua University, Beijing, China
Sicheng Zhao, Huizai Yao, Yue Gao & Guiguang Ding
College of Computer Sciences, Nankai University, Tianjin, China
Huizai Yao
Shanghai Artificial Intelligence Laboratory, Shanghai, China
Huizai Yao
Department of Data Science, Monash University, Clayton, Australia
Chuang Lin
School of Software, Tsinghua University, Beijing, China
Yue Gao & Guiguang Ding

Corresponding authors

Correspondence to Sicheng Zhao or Guiguang Ding.

Additional information

Communicated by Hong Liu.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The Abstract section has been corrected.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, S., Yao, H., Lin, C. et al. Multi-source-free Domain Adaptive Object Detection. Int J Comput Vis 132, 5950–5982 (2024). https://doi.org/10.1007/s11263-024-02170-z

Received: 15 December 2023
Accepted: 28 June 2024
Published: 11 July 2024
Version of record: 11 July 2024
Issue date: December 2024
DOI: https://doi.org/10.1007/s11263-024-02170-z

Part of a collection: Special Issue on Open-World Visual Recognition
