Abstract
To improve the transferability of object detection models in real-world scenarios where data are sampled from disparate distributions, considerable attention has been devoted to domain adaptive object detection (DAOD). Researchers have also investigated multi-source DAOD to handle training samples that originate from different source domains. However, existing methods struggle when source data are unavailable because of privacy preservation policies or transmission cost constraints. To address these issues, we introduce the problem of Multi-source-free Domain Adaptive Object Detection (MSFDAOD), which seeks to perform domain adaptation for object detection using multi-source-pretrained models without any source data or target labels. Specifically, we propose a novel Divide-and-Aggregate Contrastive Adaptation (DACA) framework. In the “Divide” step, multiple mean-teacher detection models perform knowledge distillation and class-wise contrastive learning within the feature space of each source domain. In the “Aggregate” step, DACA integrates proposals, obtains unified pseudo-labels, and assigns dynamic weights when aggregating student predictions. Together, the two steps enable our method to efficiently exploit the strengths of multiple source-free models and aggregate their contributions to adaptation in a self-supervised manner. Extensive experiments on multiple popular benchmark datasets demonstrate that the proposed DACA framework significantly outperforms state-of-the-art approaches on MSFDAOD tasks.
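As a rough illustration of two ingredients named in the abstract, the sketch below shows a generic mean-teacher update (the teacher is an exponential moving average of the student) and a weighted aggregation of class scores from several student models. It is not the DACA implementation: the function names, the scalar "parameters" standing in for weight tensors, the momentum value, and the confidence-proportional weighting rule are all assumptions made for illustration.

```python
from typing import Dict, List


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               momentum: float = 0.99) -> Dict[str, float]:
    """Generic mean-teacher step: teacher <- m * teacher + (1 - m) * student.

    Scalars stand in for weight tensors here (illustrative assumption).
    """
    return {name: momentum * teacher[name] + (1.0 - momentum) * student[name]
            for name in teacher}


def dynamic_weights(confidences: List[float]) -> List[float]:
    """Turn per-model confidence scores into weights that sum to 1.

    Proportional weighting is a hypothetical rule, not the paper's.
    """
    total = sum(confidences)
    if total <= 0:
        # Fall back to uniform weights when no model is confident.
        return [1.0 / len(confidences)] * len(confidences)
    return [c / total for c in confidences]


def aggregate_scores(per_model_scores: List[List[float]],
                     weights: List[float]) -> List[float]:
    """Weighted average of class-score vectors predicted for one proposal."""
    num_classes = len(per_model_scores[0])
    return [sum(w * scores[c] for w, scores in zip(weights, per_model_scores))
            for c in range(num_classes)]


if __name__ == "__main__":
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, momentum=0.9)
    weights = dynamic_weights([1.0, 3.0])
    fused = aggregate_scores([[1.0, 0.0], [0.0, 1.0]], weights)
    print(teacher, weights, fused)
```

In this toy setting each source-pretrained model would keep its own teacher-student pair, and only the aggregation step mixes information across models; how the actual framework computes its weights and pseudo-labels is described in the full paper, not here.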
Change history
08 October 2024
A Correction to this paper has been published: https://doi.org/10.1007/s11263-024-02257-7
Acknowledgements
This work is supported by CCF-DiDi GAIA Collaborative Research Funds for Young Scholars and the National Natural Science Foundation of China (Nos. 61925107, 62021002).
Additional information
Communicated by Hong Liu.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: The Abstract section has been corrected.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, S., Yao, H., Lin, C. et al. Multi-source-free Domain Adaptive Object Detection. Int J Comput Vis 132, 5950–5982 (2024). https://doi.org/10.1007/s11263-024-02170-z
DOI: https://doi.org/10.1007/s11263-024-02170-z