Abstract
The need for learning from unlabeled data is increasing in contemporary machine learning. Methods for unsupervised feature ranking, which identify the most important features in such data, are thus gaining attention, as are their applications in the study of high-throughput biological experiments and of user bases for recommender systems. We propose FRANe (Feature Ranking via Attribute Networks), an unsupervised algorithm capable of finding key features in a given unlabeled data set. FRANe is based on ideas from network reconstruction and network analysis. As we demonstrate empirically on a large collection of benchmarks, FRANe performs better than state-of-the-art competitors. Moreover, we provide a time complexity analysis of FRANe, which further demonstrates its scalability. Finally, FRANe also returns the interpretable relational structures used to derive the feature importances.
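The abstract does not spell out how the attribute network is built or analysed, so the following is only a minimal sketch of the general idea of ranking features through a network, not the FRANe algorithm itself. It assumes that features are linked when their absolute Pearson correlation reaches a threshold and that importances come from a PageRank-style power iteration; the function names, the threshold of 0.5 and the damping factor of 0.85 are all illustrative assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def rank_features_by_network(rows, threshold=0.5, damping=0.85, n_iter=100):
    """Illustrative only: score features by PageRank on a correlation network."""
    columns = list(zip(*rows))  # one tuple of values per feature
    n = len(columns)

    # Assumed "network reconstruction" step: link two features when their
    # absolute Pearson correlation reaches the threshold.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = abs(pearson(columns[i], columns[j]))
            if w >= threshold:
                weights[i][j] = weights[j][i] = w

    # Assumed "network analysis" step: weighted PageRank by power iteration.
    strength = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(n_iter):
        # Score held by features with no links is spread uniformly.
        dangling = sum(s for s, d in zip(scores, strength) if d == 0)
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / strength[j] * scores[j]
                for j in range(n) if strength[j] > 0
            )
            new_scores.append((1.0 - damping) / n + damping * (incoming + dangling / n))
        scores = new_scores
    return scores
```

Called on a list of rows (one list of feature values per example), the function returns one score per feature, summing to 1. In this sketch a feature that is unrelated to all others ends up with the lowest score, and the thresholded weight matrix is the kind of relational structure that can be inspected alongside the ranking.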
Supported by the Slovenian Research Agency (grant P2-0103 and a young researcher grant) and the European Commission (grant 952215).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Primožič, U., Škrlj, B., Džeroski, S., Petković, M. (2021). Unsupervised Feature Ranking via Attribute Networks. In: Soares, C., Torgo, L. (eds) Discovery Science. DS 2021. Lecture Notes in Computer Science, vol 12986. Springer, Cham. https://doi.org/10.1007/978-3-030-88942-5_26
DOI: https://doi.org/10.1007/978-3-030-88942-5_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88941-8
Online ISBN: 978-3-030-88942-5
eBook Packages: Computer Science (R0)