
Non-linear Feature Selection Based on Convolution Neural Networks with Sparse Regularization

Cognitive Computation

Abstract

The efficacy of feature selection methods in dimensionality reduction and enhancing the performance of learning algorithms has been well documented. Traditional feature selection algorithms often grapple with delineating non-linear relationships between features and responses. While deep neural networks excel in capturing such non-linearities, their inherent “black-box” nature detracts from their interpretability. Furthermore, the complexity of deep network architectures can give rise to prolonged training durations and the challenge of vanishing gradients. This study aims to refine network structures, hasten network training, and bolster model interpretability without forfeiting accuracy. This paper delves into a sparse-weighted feature selection approach grounded in convolutional neural networks, termed the low-dimensional sparse-weighted feature selection network (LSWFSNet). LSWFSNet integrates a convolutional selection kernel between the input and convolutional layers, facilitating weighted convolutional calculations on input data while imposing sparse constraints on the selection kernel. Features with significant weights in this kernel are earmarked for subsequent operations in the LSWFSNet computational domain, while those with negligible weights are eschewed to diminish model intricacy. By streamlining the network’s input data, LSWFSNet refines the post-convolution feature maps, thus simplifying its structure. Acknowledging the intrinsic interconnections within the data, our study amalgamates diverse sparse constraints into a cohesive objective function. This ensures the convolutional kernel’s sparsity while acknowledging the structural dynamics of the data. Notably, the foundational convolutional network in this method can be substituted with any deep convolutional network, contingent upon suitable adjustments to the convolutional selection kernel in relation to input data dimensions. The LSWFSNet model was tested on human emotion electroencephalography (EEG) datasets curated by Shanghai Jiao Tong University. When various sparse constraint methodologies were employed, the convolutional kernel manifested sparsity. Regions in the convolutional selection kernel with non-zero weights were identified as having strong correlations with emotional responses. The empirical outcomes not only resonate with extant neuroscience insights but also supersede the baseline network in accuracy metrics. LSWFSNet’s applicability extends to pivotal tasks like keypoint recognition, be it the extraction of salient pixels in facial detection models or the isolation of target attributes in object detection frameworks. This study’s significance is anchored in the amalgamation of sparse constraint techniques with deep convolutional networks, supplanting traditional fully connected networks. This fusion amplifies model interpretability and broadens its applicability, notably in image processing arenas.
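To make the selection-kernel idea concrete, the following minimal PyTorch sketch places a learnable elementwise selection kernel in front of an arbitrary convolutional backbone and exposes a sparsity penalty on its weights. The class names (SelectionKernel, LSWFSNetSketch), the input-shape handling, and the use of a single L1 term are illustrative assumptions, not the exact LSWFSNet implementation, which combines several sparse constraints in one objective.

import torch
import torch.nn as nn

class SelectionKernel(nn.Module):
    # One learnable weight per input location; near-zero weights mark discarded features.
    def __init__(self, input_shape):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(input_shape))

    def forward(self, x):
        # Weighted selection: scale every input feature by its kernel weight.
        return x * self.weight

    def sparsity_penalty(self):
        # Plain L1 norm drives unimportant weights towards zero (an assumption;
        # the paper combines several sparse constraints in one objective).
        return self.weight.abs().sum()

class LSWFSNetSketch(nn.Module):
    def __init__(self, input_shape, backbone):
        super().__init__()
        self.select = SelectionKernel(input_shape)
        self.backbone = backbone  # any convolutional network, e.g. a torchvision VGG-16

    def forward(self, x):
        return self.backbone(self.select(x))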


Data Availability

The data used in this study were collected by the Department of Computer Science, Shanghai Jiao Tong University. The author of this paper contacted the department by email and signed an application for data usage. With the department's consent, the author obtained permission to use the data solely for academic research purposes. Without explicit permission from the Department of Computer Science, Shanghai Jiao Tong University, the author is not authorized to share or disclose the data.


Funding

This study was funded by NSFC Key Project of International (Regional) Cooperation and Exchanges (no. 61860206004) and in part by the National Natural Science Foundation of China (no. 61976004).

Author information


Corresponding author

Correspondence to Si-Bao Chen.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In this paper, seven types of convolutional networks are used as backbones, namely VGG-16, AlexNet, GoogLeNet, ResNet-34, DenseNet-101, EfficientNet-B0, and MobileNet-V2.

Backbones with and without sparse constraints are trained on the same feature vectors from the training set that is used to construct the feature subset. For a fair comparison across the various sparsity constraints, the parameter settings of the model, including the learning rate, are kept exactly the same. The algorithm and methodology are described in the "Proposed New Architecture Description" section.
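A minimal sketch of this comparison protocol, building on the selection-kernel sketch given after the abstract, is shown below. The regularization weight lam, the optimizer, the learning rate, and the epoch count are placeholders; setting lam = 0 corresponds to training the backbone without the sparse constraint.

import torch

def train(model, loader, lam=1e-3, lr=1e-3, epochs=10):
    # Identical hyperparameters for both runs; only lam differs.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            loss = criterion(model(x), y)
            if lam > 0:  # with the sparse constraint on the selection kernel
                loss = loss + lam * model.select.sparsity_penalty()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model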

Tables 3 and 4 compare the accuracy of the different networks under different sparsity constraints; the numbers in parentheses represent the proportion of features screened out by the model under that sparsity constraint.

Table 3 Classification accuracy of single-channel EEG signals under different sparsity constraints

From the accuracy in the two tables, it is easy to see that shallow networks, such as VGG-16, achieve significantly higher accuracy than deep networks, such as DenseNet-101. The results show that there is no strong correlation between model depth and accuracy. In addition, the brain's feedback is not completely consistent, owing to other relevant factors such as the physical state of the human subjects. Therefore, data sampled from the same subjects at different times are not exactly the same, which leads to small differences in model accuracy, for example, "JL20140404" and "JL20140419" in Table 3.

From the sparsity in the two tables, the model can reduce the amount of input data by up to 30%. However, this does not reach the desired degree of sparsity, possibly because emotional feedback is a very complex process rather than an activity involving only certain brain regions. Nevertheless, the method does achieve a clear reduction of the input.
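Assuming the selection-kernel sketch given after the abstract, the proportion of screened-out features (the figures in parentheses in Tables 3 and 4) could be estimated by counting near-zero kernel weights; the threshold eps below is an assumed value, not one taken from the paper.

def screened_out_ratio(model, eps=1e-3):
    # Fraction of selection-kernel weights whose magnitude is effectively zero.
    w = model.select.weight.detach().abs()
    return (w < eps).float().mean().item()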

Table 4 Classification accuracy of multi-channel EEG signals under different sparsity constraints

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wu, WB., Chen, SB., Ding, C. et al. Non-linear Feature Selection Based on Convolution Neural Networks with Sparse Regularization. Cogn Comput 16, 654–670 (2024). https://doi.org/10.1007/s12559-023-10230-8


  • DOI: https://doi.org/10.1007/s12559-023-10230-8
