Abstract
Feature selection is a fundamental technique for reducing data dimensionality: it identifies the most relevant features and discards redundant or irrelevant ones. In unsupervised settings, where labeled data are unavailable and labeling is costly, effective feature selection is especially challenging. This paper proposes AE-MCDM, a novel unsupervised feature selection method that integrates autoencoder-based feature extraction with multi-criteria decision-making (MCDM). The autoencoder learns high-level feature representations, and the connection weights between the input features and the hidden neurons are used as indicators of feature importance. These weights are then processed with MCDM to rank the features and select the most informative ones. Unlike conventional unsupervised feature selection methods, AE-MCDM leverages deep representation learning to enhance feature evaluation. To the best of our knowledge, this is the first attempt to combine autoencoders with MCDM for feature selection. Extensive experiments on various datasets demonstrate that AE-MCDM outperforms existing methods in clustering performance, as measured by accuracy, precision, recall, and normalized mutual information (NMI), while also achieving competitive computational efficiency.
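The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the MCDM method, so TOPSIS is swapped in here as one common choice. Treating each feature as an alternative and each hidden neuron as a criterion, using absolute encoder weights as the decision matrix, and weighting all criteria equally are assumptions. So are the function names (`train_autoencoder`, `topsis_scores`, `select_features`), the hidden-layer size, and the training settings, all of which are hypothetical.

```python
import numpy as np


def train_autoencoder(X, n_hidden=4, epochs=200, lr=0.05, seed=0):
    """Train a one-hidden-layer autoencoder with plain gradient descent.

    Returns the encoder weights W1 with shape (n_features, n_hidden).
    Architecture and hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # encoder
        X_hat = H @ W2 + b2               # linear decoder
        err = (X_hat - X) / n             # gradient of the mean squared error
        dH = (err @ W2.T) * (1.0 - H**2)  # backpropagate through tanh
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1


def topsis_scores(D, weights=None):
    """TOPSIS closeness for a decision matrix D (alternatives x criteria).

    All criteria are treated as benefit criteria (larger is better).
    """
    k = D.shape[1]
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, float)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0               # guard against all-zero columns
    V = D / norms * w                     # vector-normalized, weighted matrix
    d_pos = np.linalg.norm(V - V.max(axis=0), axis=1)  # distance to ideal
    d_neg = np.linalg.norm(V - V.min(axis=0), axis=1)  # distance to anti-ideal
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.zeros_like(d_neg), where=denom > 0)


def select_features(X, k, n_hidden=4):
    """Rank features by TOPSIS over absolute encoder weights; keep the top k."""
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Xs = (X - X.mean(axis=0)) / std       # standardize so weights are comparable
    W1 = train_autoencoder(Xs, n_hidden=n_hidden)
    scores = topsis_scores(np.abs(W1))    # rows: features, columns: hidden neurons
    return np.argsort(-scores)[:k], scores
```

Calling `select_features(X, k=3)` on a samples-by-features array returns the indices of the three highest-ranked features together with the closeness score of every feature. A different MCDM method or criterion weighting could be substituted in `topsis_scores` without changing the rest of the sketch.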
Data availability
Data generated during the study are subject to a data sharing mandate and are available in a few public repositories. All data used are cited in the text.
Funding
This research did not receive any specific grant from public, commercial, or not-for-profit funding agencies.
Author information
Contributions
Amin Hashemi, Mohammad Bagher Dowlatshahi, Parham Moradi, and Siamak Farshidi proposed the research idea. Amin Hashemi implemented the experiments. Amin Hashemi and Mohammad Bagher Dowlatshahi wrote the manuscript. All authors discussed the results, contributed to the final manuscript, and reviewed it.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hashemi, A., Dowlatshahi, M.B., Farshidi, S. et al. AE-MCDM: an autoencoder-based multi-criteria decision-making approach for unsupervised feature selection. J Supercomput 81, 804 (2025). https://doi.org/10.1007/s11227-025-07316-5
DOI: https://doi.org/10.1007/s11227-025-07316-5