Abstract
The efficacy of feature selection in reducing dimensionality and improving the performance of learning algorithms is well documented. Traditional feature selection algorithms, however, often struggle to capture non-linear relationships between features and responses. Deep neural networks capture such non-linearities well, but their inherent "black-box" nature limits their interpretability, and the complexity of deep architectures can lead to long training times and vanishing gradients. This study aims to simplify network structures, accelerate network training, and improve model interpretability without sacrificing accuracy. We propose a sparse-weighted feature selection approach based on convolutional neural networks, termed the low-dimensional sparse-weighted feature selection network (LSWFSNet). LSWFSNet inserts a convolutional selection kernel between the input and convolutional layers, performing weighted convolutional computations on the input data while imposing sparse constraints on the selection kernel. Features with significant weights in this kernel are retained for subsequent operations in the LSWFSNet computational domain, while those with negligible weights are discarded to reduce model complexity. By pruning the network's input data, LSWFSNet simplifies the post-convolution feature maps and hence its own structure. To account for the intrinsic structure of the data, our study combines several sparse constraints into a single objective function, which enforces the sparsity of the convolutional kernel while respecting the structural dynamics of the data. Notably, the underlying convolutional network can be replaced with any deep convolutional network, provided the convolutional selection kernel is adjusted to match the input dimensions. The LSWFSNet model was tested on human emotion electroencephalography (EEG) datasets curated by Shanghai Jiao Tong University. Under the various sparse constraints, the convolutional kernel became sparse, and the regions of the selection kernel with non-zero weights were found to correlate strongly with emotional responses. The empirical results are consistent with existing neuroscience findings and surpass the baseline network in accuracy. LSWFSNet also extends to key-point recognition tasks, such as extracting salient pixels in facial detection models or isolating target attributes in object detection frameworks. The significance of this study lies in combining sparse constraint techniques with deep convolutional networks rather than traditional fully connected networks, which improves model interpretability and broadens applicability, notably in image processing.
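To make the selection-kernel idea concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: an element-wise selection layer is placed in front of an arbitrary convolutional backbone, and a sparsity penalty on its weights drives uninformative inputs toward zero. The class names, input shape, and the choice of a plain L1 penalty (rather than the paper's combined constraints) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SelectionKernel(nn.Module):
    """One learnable weight per input entry; near-zero weights mark features to drop."""
    def __init__(self, in_shape):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(*in_shape))  # e.g. (channels, H, W)

    def forward(self, x):
        # Element-wise weighted selection; broadcasts over the batch dimension.
        return x * self.weight

    def sparsity_penalty(self):
        # L1 surrogate for sparsity; group or L2,1 penalties could be swapped in.
        return self.weight.abs().sum()

class LSWFSNet(nn.Module):
    """Selection kernel followed by an arbitrary convolutional backbone."""
    def __init__(self, backbone, in_shape):
        super().__init__()
        self.select = SelectionKernel(in_shape)
        self.backbone = backbone  # any convolutional network, e.g. a VGG-16 variant

    def forward(self, x):
        return self.backbone(self.select(x))
```

After training, inputs whose selection weights have shrunk toward zero are discarded, which is how the kernel would identify the input regions most strongly tied to the response.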
Data Availability
The data used in this study were collected by the Department of Computer Science, Shanghai Jiao Tong University. The authors contacted the department by email and signed a data-usage application; with the department's consent, they obtained permission to use the data solely for academic research purposes. The authors are not authorized to share or disclose the data without explicit permission from the relevant department of the Department of Computer Science, Shanghai Jiao Tong University.
Funding
This study was funded by NSFC Key Project of International (Regional) Cooperation and Exchanges (no. 61860206004) and in part by the National Natural Science Foundation of China (no. 61976004).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Competing Interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
In this paper, seven types of convolutional networks are used as backbones, namely VGG-16, AlexNet, GoogLeNet, ResNet-34, DenseNet-101, EfficientNet-B0, and MobileNet-V2.
Backbones with and without sparse constraints are trained on the same feature vectors from the training set used to construct the feature subset. For a fair comparison across sparsity constraints, the model's parameter settings, including the learning rate, are kept identical. The algorithm and methodology are described in the “Proposed New Architecture Description” section.
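As a sketch of this comparison protocol, and assuming the combined objective is cross-entropy plus a weighted sparsity penalty (the weight `lam` is a hypothetical hyperparameter, and `model.select` refers to the selection kernel from the earlier sketch), a training step shared by all backbones might look as follows; only the penalty term would change between constraint variants:

```python
import torch.nn.functional as F

def train_step(model, x, y, optimizer, lam=1e-4):
    # Identical optimizer and hyperparameters across all sparsity constraints;
    # only model.select.sparsity_penalty() differs between constraint variants.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y) + lam * model.select.sparsity_penalty()
    loss.backward()
    optimizer.step()
    return loss.item()
```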
Tables 3 and 4 compare the accuracy of different networks under different sparsity constraints; the numbers in parentheses represent the proportion of features screened out by the model under that sparsity constraint.
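For reference, one plausible way to obtain such a proportion (an assumption, since the paper's exact thresholding criterion is not stated here) is the fraction of selection-kernel weights whose magnitude falls below a small threshold:

```python
def screened_out_proportion(weight, eps=1e-3):
    # Fraction of selection weights that are effectively zero.
    w = weight.detach().abs()
    return (w < eps).float().mean().item()
```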
The accuracies in the two tables show that shallower networks, such as VGG-16, achieve significantly higher accuracy than deeper networks, such as DenseNet-101; there is thus no strong correlation between a model's depth and its accuracy on this task. In addition, the brain's feedback is not fully reproducible, owing to factors such as the subjects' physical state. Data sampled from the same subject at different times therefore differ, which explains why model accuracy also differs, for example, between “JL20140404” and “JL20140419” in Table 3.
In terms of sparsity, the model can reduce the amount of input data by up to \(30\%\). This falls short of the desired degree of sparsity, perhaps because emotional feedback is a very complex process that is not confined to a particular set of brain regions. Nevertheless, the method does achieve a meaningful reduction of the input.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, WB., Chen, SB., Ding, C. et al. Non-linear Feature Selection Based on Convolution Neural Networks with Sparse Regularization. Cogn Comput 16, 654–670 (2024). https://doi.org/10.1007/s12559-023-10230-8