Underwater acoustic target recognition based on multi-scale feature and CRDNet

The Journal of Supercomputing

Abstract

To improve the accuracy of underwater acoustic target recognition (UATR) with artificial neural networks, this paper proposes a UATR approach based on multi-scale features and a convolutional residual dense network (CRDNet). It incorporates a multi-scale convolutional structure into an enhanced ConvNeXt V2 module and proposes SFbank, an acoustic feature structure based on singular value decomposition (SVD). Compared with traditional network frameworks and single acoustic filtering features, the approach significantly improves recognition accuracy, precision, and F1-score. Experimental validation on the ShipsEar dataset yields a recognition accuracy of 99.08%.

Data availability

No datasets were generated or analysed during the current study.

Funding

This work is supported in part by the Ningxia Natural Science Foundation General Project (2022AAC03757, 2023AAC03889), the R&D Program of Beijing Municipal Education Commission (KM202410017006), and the National Key Research and Development Program of China (2023YFC3011704-2).

Author information

Contributions

J.L., X.Y., and Y.C. contributed to the methodology design of the study. J.L., X.Y., L.Z., and W.W. were responsible for the implementation of the proposed approach. P.Y. and H.T. carried out the formal analysis and investigation. X.Y. wrote the original draft of the manuscript. W.W., Y.C., and X.Z. contributed to the review and editing of the manuscript. J.L., L.Z., and W.W. supervised the entire project. All authors reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Wei Wei.

Ethics declarations

Conflict of interest

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, J., Chen, Y., Yang, X. et al. Underwater acoustic target recognition based on multi-scale feature and CRDNet. J Supercomput 81, 1358 (2025). https://doi.org/10.1007/s11227-025-07806-6

  • DOI: https://doi.org/10.1007/s11227-025-07806-6

Keywords