Al-Ali et al., 2021 - Google Patents
Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments
- Document ID
- 18402879875903198832
- Author
- Al-Ali A
- Chandran V
- Naik G
- Publication year
- 2021
- Publication venue
- Evolutionary Intelligence
Snippet
Forensic speaker verification performance reduces significantly under high levels of noise and reverberation. Multiple channel speech enhancement algorithms, such as independent component analysis by entropy bound minimization (ICA-EBM), can be used to improve …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Taherian et al. | | Robust speaker recognition based on single-channel and multi-channel speech enhancement |
| Koizumi et al. | | DNN-based source enhancement to increase objective sound quality assessment score |
| CN107993670B (en) | | Microphone array speech enhancement method based on statistical model |
| Zhang et al. | | Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification |
| Krueger et al. | | Model-based feature enhancement for reverberant speech recognition |
| Nakatani et al. | | Dominance based integration of spatial and spectral features for speech enhancement |
| Kolossa et al. | | Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques |
| Al-Karawi | | Mitigate the reverberation effect on the speaker verification performance using different methods |
| Aroudi et al. | | DBnet: DOA-driven beamforming network for end-to-end reverberant sound source separation |
| Ganapathy | | Multivariate autoregressive spectrogram modeling for noisy speech recognition |
| Janský et al. | | Auxiliary function-based algorithm for blind extraction of a moving speaker |
| Xiong et al. | | Front-end technologies for robust ASR in reverberant environments—spectral enhancement-based dereverberation and auditory modulation filterbank features |
| Al-Karawi et al. | | Model selection toward robustness speaker verification in reverberant conditions |
| Barhoush et al. | | Speaker identification and localization using shuffled MFCC features and deep learning |
| Alam et al. | | Speech recognition in reverberant and noisy environments employing multiple feature extractors and i-vector speaker adaptation |
| Venkatesan et al. | | Binaural classification-based speech segregation and robust speaker recognition system |
| Al-Ali et al. | | Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments |
| Sheeja et al. | | Speech dereverberation and source separation using DNN-WPE and LWPR-PCA |
| Al-Ali et al. | | Enhanced forensic speaker verification using multi-run ICA in the presence of environmental noise and reverberation conditions |
| Girin et al. | | Audio source separation into the wild |
| Li et al. | | Speech enhancement based on binaural sound source localization and cosh measure Wiener filtering |
| Marti et al. | | Automatic speech recognition in cocktail-party situations: A specific training for separated speech |
| Aroudi et al. | | DBNET: DOA-driven beamforming network for end-to-end farfield sound source separation |
| Srinivasarao | | Speech signal analysis and enhancement using combined wavelet Fourier transform with stacked deep learning architecture |
| Vestman et al. | | Time-varying autoregressions for speaker verification in reverberant conditions |