
Towards Unsupervised Domain Adaptation via Domain-Transformer

  • Published in: International Journal of Computer Vision

Abstract

As a vital problem in pattern analysis and machine intelligence, Unsupervised Domain Adaptation (UDA) attempts to transfer an effective feature learner from a labeled source domain to an unlabeled target domain. Inspired by the success of the Transformer, several recent UDA methods adopt pure transformers as network architectures, but such a straightforward application can only capture patch-level information and lacks interpretability. To address these issues, we propose the Domain-Transformer (DoT) with a domain-level attention mechanism that captures the long-range correspondence between cross-domain samples. On the theoretical side, we provide a mathematical understanding of DoT: (1) we connect the domain-level attention with optimal transport theory, which provides interpretability from the viewpoint of Wasserstein geometry; (2) from the perspective of learning theory, we derive Wasserstein distance-based generalization bounds, which explain the effectiveness of DoT for knowledge transfer. On the methodological side, DoT integrates the domain-level attention with manifold structure regularization, which together characterize sample-level information and locality consistency of cross-domain cluster structures. Moreover, the domain-level attention mechanism can be used as a plug-and-play module, so DoT can be implemented under different neural network architectures. Instead of explicitly modeling the distribution discrepancy at the domain or class level, DoT learns transferable features under the guidance of long-range correspondence, so it is free of pseudo-labels and explicit domain-discrepancy optimization. Extensive experimental results on several benchmark datasets validate the effectiveness of DoT.
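To make the idea of domain-level attention more concrete, the following is a minimal, hypothetical PyTorch sketch of a cross-domain attention module whose weights are normalized with Sinkhorn iterations, so that the attention matrix behaves like an entropic optimal-transport coupling between a source batch and a target batch. All names (DomainLevelAttention), shapes, and hyperparameters (n_sinkhorn, temperature) are illustrative assumptions for exposition; this is not the authors' released implementation of DoT.

```python
# Hypothetical sketch: domain-level cross-attention with Sinkhorn normalization.
import torch
import torch.nn as nn


class DomainLevelAttention(nn.Module):
    """Source features attend to target features across domains."""

    def __init__(self, dim: int, n_sinkhorn: int = 5, temperature: float = 0.1):
        super().__init__()
        self.query = nn.Linear(dim, dim)   # projects source features
        self.key = nn.Linear(dim, dim)     # projects target features
        self.value = nn.Linear(dim, dim)   # values are taken from target features
        self.n_sinkhorn = n_sinkhorn
        self.temperature = temperature

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src: (n_s, dim) labeled source batch; tgt: (n_t, dim) unlabeled target batch
        q, k, v = self.query(src), self.key(tgt), self.value(tgt)
        logits = q @ k.t() / (self.temperature * q.shape[-1] ** 0.5)
        attn = torch.exp(logits - logits.max())  # positive scores, numerically stable
        # Sinkhorn-style alternating normalization pushes the score matrix toward a
        # doubly-stochastic coupling, i.e. an entropic transport plan between the
        # two empirical domain distributions.
        for _ in range(self.n_sinkhorn):
            attn = attn / attn.sum(dim=1, keepdim=True)  # rows sum to 1
            attn = attn / attn.sum(dim=0, keepdim=True)  # columns sum to 1
        attn = attn / attn.sum(dim=1, keepdim=True)      # final row normalization
        # Each source sample becomes a barycentric combination of target samples,
        # providing a long-range cross-domain correspondence.
        return attn @ v


if __name__ == "__main__":
    module = DomainLevelAttention(dim=256)
    src_feat = torch.randn(32, 256)   # source features from a backbone
    tgt_feat = torch.randn(48, 256)   # target features from the same backbone
    out = module(src_feat, tgt_feat)
    print(out.shape)                  # torch.Size([32, 256])
```

Under this reading, each source feature is re-expressed as a barycentric combination of target features, which is one way to interpret the long-range correspondence and the Wasserstein-geometry view mentioned in the abstract.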

Data availability

The datasets used in this work can be requested and downloaded through the following links: ImageCLEF https://www.imageclef.org/2014/adaptation, Office-31 https://faculty.cc.gatech.edu/~judy/domainadapt/#datasets_code, Office-Home https://www.hemanthdv.org/officeHomeDataset.html, VisDA-2017 https://ai.bu.edu/visda-2017/, DomainNet https://ai.bu.edu/M3SDA/#dataset. The code will be made publicly available upon acceptance.

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (Grant No. 62376291), in part by the Guangdong Basic and Applied Basic Research Foundation (2023B1515020004), in part by the Science and Technology Program of Guangzhou (2024A04J6413), in part by the Fundamental Research Funds for the Central Universities, Sun Yat-sen University (24xkjc013), in part by the Guangdong Province Key Laboratory of Computational Science at Sun Yat-sen University (2020B1212060032), in part by the Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, and in part by the Hong Kong Innovation and Technology Commission (InnoHK Project CIMDA).

Author information

Corresponding author

Correspondence to Chuan-Xian Ren.

Additional information

Communicated by Wanli Ouyang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 885 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, CX., Zhai, Y., Luo, YW. et al. Towards Unsupervised Domain Adaptation via Domain-Transformer. Int J Comput Vis 132, 6163–6183 (2024). https://doi.org/10.1007/s11263-024-02174-9

  • Received:

  • Accepted:

  • Published:

  • Version of record:

  • Issue date:

  • DOI: https://doi.org/10.1007/s11263-024-02174-9

Keywords