Integrated Heterogeneous Graph Attention Network for Incomplete Multi-modal Clustering

Wang, Yu; Yao, Xinjie; Zhu, Pengfei; Li, Weihao; Cao, Meng; Hu, Qinghua

doi:10.1007/s11263-024-02066-y

Published: 24 April 2024

Volume 132, pages 3847–3866, (2024)

International Journal of Computer Vision

Abstract

Incomplete multi-modal clustering (IMmC) is challenging because some modalities may be unexpectedly missing from the data. A key to this problem is exploiting the complementary information among different samples when the data are unpaired and incomplete. Despite preliminary progress, existing methods (1) rely heavily on paired data and (2) struggle to mine complementarity from data with high missing rates. To address these problems, we propose a novel method for IMmC, the Integrated Heterogeneous Graph ATtention (IHGAT) network. To fully exploit the complementarity among different samples and modalities, we first construct a set of integrated heterogeneous graphs based on the similarity graph learned from unified latent representations and the modality-specific availability graphs formed by the existing relations among samples. The attention mechanism is then applied to the constructed integrated heterogeneous graphs to aggregate the embedded content of heterogeneous neighbors for each node. In this way, the representations of missing modalities can be learned from the complementary information of other samples and their other modalities. Finally, the consistency of probability distributions is embedded into the network for clustering. Consequently, the proposed method forms a complete latent space in which incomplete information can be supplemented by related samples via the learned intrinsic structure. Extensive experiments on eight public datasets show that IHGAT outperforms existing methods under various settings and is typically more robust at high missing rates.
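
The abstract does not give the model equations, so the following is only a toy sketch of the general idea it describes: a similarity graph built from unified latent representations is combined with a modality availability mask, and a missing modality is filled in by attention-weighted aggregation over neighbors that do have it. All names (`knn_similarity_graph`, `attention_impute`, the parameter `k`) are hypothetical, and a softmax over cosine similarities stands in for the learned attention coefficients of a real graph attention layer. It is not the IHGAT implementation.

```python
import numpy as np


def knn_similarity_graph(latent, k):
    """Link each sample to its k most cosine-similar samples (no self-loops).

    Assumes every row of `latent` is non-zero.
    """
    unit = latent / np.linalg.norm(latent, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros_like(sim, dtype=bool)
    rows = np.repeat(np.arange(len(latent)), k)
    adj[rows, neighbors.ravel()] = True
    return adj, sim


def attention_impute(features, available, adj, sim):
    """Fill missing rows of one modality from neighbors that have it.

    `available[i]` is True when sample i has this modality. Attention
    weights are a softmax over similarity scores (an assumption made
    for illustration, not a learned attention function).
    """
    filled = features.copy()
    for i in np.flatnonzero(~available):
        candidates = np.flatnonzero(adj[i] & available)
        if candidates.size == 0:
            continue  # no neighbor has this modality; leave the row as-is
        scores = sim[i, candidates]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        filled[i] = weights @ features[candidates]
    return filled


# Four samples: 0 and 1 are close in latent space, as are 2 and 3.
latent = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
# One modality, observed only for samples 0 and 2.
features = np.array([[1.0, 1.0], [0.0, 0.0], [5.0, 5.0], [0.0, 0.0]])
available = np.array([True, False, True, False])

adj, sim = knn_similarity_graph(latent, k=1)
filled = attention_impute(features, available, adj, sim)
```

In this toy setup each sample with a missing modality borrows the representation of its nearest latent-space neighbor that has the modality. With a larger `k`, the result becomes a similarity-weighted average of several such neighbors.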

Data Availability

The CUB dataset (Wah et al., 2011) can be obtained from https://www.vision.caltech.edu/datasets/cub_200_2011/. The Football dataset can be obtained from http://mlg.ucd.ie/aggregation/index.html. The ORL dataset can be obtained from https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html. The PIE dataset can be obtained from http://www.cs.cmu.edu/afs/cs/project/PIE/MultiPie/Multi-Pie/Home.html. The Politics dataset can be obtained from http://mlg.ucd.ie/aggregation/index.html. The 3Sources dataset can be obtained from http://mlg.ucd.ie/datasets/3sources.html.

References

Baltrušaitis, T., Ahuja, C., & Morency, L.-P. (2018). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423–443.
Bothorel, C., Cruz, J. D., Magnani, M., & Micenkova, B. (2015). Clustering attributed graphs: Models, measures and methods. Network Science, 3(3), 408–444.
Brasó, G., Cetintas, O., & Leal-Taixé, L. (2022). Multi-object tracking and segmentation via neural message passing. International Journal of Computer Vision, 130(12), 3035–3053.
Brissman, E., Johnander, J., Danelljan, M., & Felsberg, M. (2023). Recurrent graph neural networks for video instance segmentation. International Journal of Computer Vision, 131(2), 471–495.
Cao, Y., Luo, X., Yang, J., Cao, Y., & Yang, M. Y. (2022). Locality guided cross-modal feature aggregation and pixel-level fusion for multispectral pedestrian detection. Information Fusion, 88, 1–11.
Chang, S., Han, W., Tang, J., Qi, G.-J., Aggarwal, C. C., & Huang, T. S. (2015). Heterogeneous network embedding via deep architectures. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 119–128.
Chen, L., Gao, Y., Huang, X., Jensen, C. S., & Zheng, B. (2020). Efficiently distributed clustering algorithms on star-schema heterogeneous graphs. IEEE Transactions on Knowledge and Data Engineering, pp. 1–15.
Chen, Y., Mancini, M., Zhu, X., & Akata, Z. (2022). Semi-supervised and unsupervised deep visual learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–23.
Chen, Y., Xiao, X., & Zhou, Y. (2019). Jointly learning kernel representation tensor and affinity matrix for multi-view clustering. IEEE Transactions on Multimedia, 22(8), 1985–1997.
Cheng, J., Wang, Q., Tao, Z., Xie, D.-Y., & Gao, Q. (2020). Multi-view attribute graph convolution networks for clustering. In IJCAI, pp. 2973–2979.
Deng, S., Wen, J., Liu, C., Yan, K., Xu, G., & Xu, Y. (2023). Projective incomplete multi-view clustering. IEEE Transactions on Neural Networks and Learning Systems.
Enders, C. K. (2010). Applied missing data analysis. Guilford Press.
Fang, U., Li, M., Li, J., Gao, L., Jia, T., & Zhang, Y. (2023). A comprehensive survey on multi-view clustering. IEEE Transactions on Knowledge and Data Engineering, 35(12), 12350–12368.
Hamilton, W., Ying, Z., & Leskovec, J. (2017). Inductive representation learning on large graphs. Advances in Neural Information Processing Systems.
Han, R., Gan, Y., Wang, L., Li, N., Feng, W., & Wang, S. (2023). Relating view directions of complementary-view mobile cameras via the human shadow. International Journal of Computer Vision, pp. 1–16.
Hotelling, H. (1992). Relations between two sets of variates. In Breakthroughs in Statistics, pp. 162–190. Springer.
Kumar, R., Chen, T., Hardt, M., Beymer, D., Brannon, K., & Syeda-Mahmood, T. (2013). Multiple kernel completion and its application to cardiac disease discrimination. In 2013 IEEE 10th International Symposium on Biomedical Imaging, pp. 764–767. IEEE.
Le, Q. & Mikolov, T. (2014). Distributed representations of sentences and documents. In International Conference on Machine Learning, pp. 1188–1196. PMLR.
Li, L., Wan, Z., & He, H. (2021). Incomplete multi-view clustering with joint partition and graph learning. IEEE Transactions on Knowledge and Data Engineering, pp. 1–15.
Li, X., Wu, Y., Ester, M., Kao, B., Wang, X., & Zheng, Y. (2017). Semi-supervised clustering in attributed heterogeneous information networks. In Proceedings of the 26th International Conference on World Wide Web, pp. 1621–1629.
Lin, Y., Gou, Y., Liu, X., Bai, J., Lv, J., & Peng, X. (2023). Dual contrastive prediction for incomplete multi-view representation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4), 4447–4461.
Lin, Y., Gou, Y., Liu, Z., Li, B., Lv, J., & Peng, X. (2021). Completer: Incomplete multi-view clustering via contrastive prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11174–11183.
Michieli, U., & Zanuttigh, P. (2022). Edge-aware graph matching network for part-based semantic segmentation. International Journal of Computer Vision, 130(11), 2797–2821.
Qi, G.-J., Aggarwal, C. C., & Huang, T. S. (2012). On clustering heterogeneous social media objects with outlier links. In Proceedings of the 5th ACM International Conference on Web Search and Data Mining, pp. 553–562.
Shao, W., He, L., & Yu, P. S. (2015). Multiple incomplete views clustering via weighted nonnegative matrix factorization with $l_{2,1}$ regularization. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 318–334. Springer.
Shi, C., Li, Y., Zhang, J., Sun, Y., & Yu, P. S. (2016). A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering, 29(1), 17–37.
Tao, Z., Liu, H., Li, J., Wang, Z., & Fu, Y. (2019). Adversarial graph embedding for ensemble clustering. In International Joint Conferences on Artificial Intelligence Organization, pp. 3562–3568.
Tran, L., Liu, X., Zhou, J., & Jin, R. (2017). Missing modalities imputation via cascaded residual autoencoder. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1405–1414.
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2018). Graph Attention Networks. International Conference on Learning Representations.
Wah, C., Branson, S., Welinder, P., Perona, P., & Belongie, S. (2011). The Caltech-UCSD Birds-200-2011 dataset.
Wang, Q., Ding, Z., Tao, Z., Gao, Q., & Fu, Y. (2018). Partial multi-view clustering via consistent GAN. In 2018 IEEE International Conference on Data Mining (ICDM), pp. 1290–1295. IEEE.
Wang, Q., Ding, Z., Tao, Z., Gao, Q., & Fu, Y. (2021). Generative partial multi-view clustering with adaptive fusion and cycle consistency. IEEE Transactions on Image Processing, 30, 1771–1783.
Wang, Q., Lian, H., Sun, G., Gao, Q., & Jiao, L. (2020). icmsc: Incomplete cross-modal subspace clustering. IEEE Transactions on Image Processing, 30, 305–317.
Wang, Q., Zhan, L., Thompson, P., & Zhou, J. (2020b). Multimodal learning with incomplete modalities by knowledge distillation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1828–1838.
Wang, X., Ji, H., Shi, C., Wang, B., Ye, Y., Cui, P., & Yu, P. S. (2019). Heterogeneous graph attention network. In The World Wide Web Conference, pp. 2022–2032.
Wen, J., Xu, G., Tang, Z., Wang, W., Fei, L., & Xu, Y. (2023a). Graph regularized and feature aware matrix factorization for robust incomplete multi-view clustering. IEEE Transactions on Circuits and Systems for Video Technology.
Wen, J., Yan, K., Zhang, Z., Xu, Y., Wang, J., Fei, L., & Zhang, B. (2020). Adaptive graph completion based incomplete multi-view clustering. IEEE Transactions on Multimedia, 23, 2493–2504.
Wen, J., Zhang, Z., Fei, L., Zhang, B., Xu, Y., Zhang, Z., & Li, J. (2023). A survey on incomplete multiview clustering. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 53(2), 1136–1149.
Wen, J., Zhang, Z., Zhang, Z., Zhu, L., Fei, L., Zhang, B., & Xu, Y. (2021). Unified tensor framework for incomplete multi-view clustering and missing-view inferring. In Proceedings of the AAAI Conference on Artificial Intelligence, 35, 10273–10281.
Xiang, S., Yuan, L., Fan, W., Wang, Y., Thompson, P. M., & Ye, J. (2013). Multi-source learning with block-wise missing data for Alzheimer’s disease prediction. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 185–193.
Xie, D., Zhang, X., Gao, Q., Han, J., Xiao, S., & Gao, X. (2019). Multiview clustering by joint latent representation and similarity learning. IEEE Transactions on Cybernetics, 50(11), 4848–4854.
Xu, C., Tao, D., & Xu, C. (2015). Multi-view learning with incomplete views. IEEE Transactions on Image Processing, 24(12), 5812–5825.
Xu, K., Hu, W., Leskovec, J., & Jegelka, S. (2019). How powerful are graph neural networks? International Conference on Learning Representations.
Yang, L., Shen, C., Hu, Q., Jing, L., & Li, Y. (2019). Adaptive sample-level graph combination for partial multiview clustering. IEEE Transactions on Image Processing, 29, 2780–2794.
Yang, S., Li, L., Wang, S., Zhang, W., Huang, Q., & Tian, Q. (2019). Skeletonnet: A hybrid network with a skeleton-embedding process for multi-view image representation learning. IEEE Transactions on Multimedia, 21(11), 2916–2929.
Yuan, L., Wang, Y., Thompson, P. M., Narayan, V. A., & Ye, J. (2012). Multi-source learning for joint analysis of incomplete multi-modality neuroimaging data. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1149–1157.
Zhan, K., Nie, F., Wang, J., & Yang, Y. (2018). Multiview consensus graph clustering. IEEE Transactions on Image Processing, 28(3), 1261–1270.
Zhang, C., Cui, Y., Han, Z., Zhou, J. T., Fu, H., & Hu, Q. (2022). Deep partial multi-view learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 2402–2415.
Zhang, C., Fu, H., Hu, Q., Cao, X., Xie, Y., Tao, D., & Xu, D. (2018). Generalized latent multi-view subspace clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(1), 86–99.
Zhang, C., Fu, H., Wang, J., Li, W., Cao, X., & Hu, Q. (2020). Tensorized multi-view subspace representation learning. International Journal of Computer Vision, 128(8–9), 2344–2361.
Zhang, C., Song, D., Huang, C., Swami, A., & Chawla, N. V. (2019). Heterogeneous graph neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 793–803.
Zhang, L., Zhao, Y., Zhu, Z., Shen, D., & Ji, S. (2018). Multi-view missing data completion. IEEE Transactions on Knowledge and Data Engineering, 30(7), 1296–1309.
Zhang, Y., Xiong, Y., Kong, X., Li, S., Mi, J., & Zhu, Y. (2018c). Deep collective classification in heterogeneous information networks. In Proceedings of the 2018 World Wide Web Conference, pp. 399–408.
Zhao, J., Wang, X., Shi, C., Hu, B., Song, G., & Ye, Y. (2021). Heterogeneous graph structure learning for graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4697–4705.

Acknowledgements

This work was supported in part by the National Science and Technology Major Project under Grant 2022ZD0116500, in part by the National Natural Science Foundation of China under Grants 62106174, 62222608, 62266035, and 61925602, and in part by Tianjin Natural Science Funds for Distinguished Young Scholar under Grant 23JCJQJC00270.

Funding

National Science and Technology Major Project under Grant 2022ZD0116500; National Natural Science Foundation of China under Grants 62106174, 62222608, 62266035, and 61925602; Tianjin Natural Science Funds for Distinguished Young Scholar under Grant 23JCJQJC00270.

Author information

Authors and Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, China
Yu Wang, Xinjie Yao, Pengfei Zhu, Meng Cao & Qinghua Hu
Engineering Research Center of City Intelligence and Digital Governance, Ministry of Education of the People’s Republic of China, Tianjin, China
Yu Wang, Xinjie Yao, Pengfei Zhu & Qinghua Hu
Haihe Lab of ITAI, Tianjin, China
Yu Wang, Xinjie Yao, Pengfei Zhu & Qinghua Hu
Department of Computer Science, Boston University, Boston, USA
Weihao Li

Corresponding author

Correspondence to Pengfei Zhu.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Communicated by Massimiliano Mancini.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Yao, X., Zhu, P. et al. Integrated Heterogeneous Graph Attention Network for Incomplete Multi-modal Clustering. Int J Comput Vis 132, 3847–3866 (2024). https://doi.org/10.1007/s11263-024-02066-y

Received: 11 February 2023
Accepted: 08 March 2024
Published: 24 April 2024
Version of record: 24 April 2024
Issue date: September 2024
DOI: https://doi.org/10.1007/s11263-024-02066-y

Part of a collection:

Special Issue on Multimodal Learning
