Li et al., 2020 - Google Patents
Beamformed feature for learning-based dual-channel speech separation
- Document ID
- 3024893008027029866
- Author
- Li H
- Zhang X
- Gao G
- Publication year
- 2020
- Publication venue
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Snippet
This paper deals with the problem of separating the target speech signal from a reverberant and noisy environment with dual microphones, where the target speech comes from a predefined direction range. First, we apply two differential beamformers with opposite …
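Based only on the snippet above, the sketch below illustrates what an opposite-facing differential beamformer pair on a dual-microphone signal might look like in the STFT domain. The microphone spacing, the delay-and-subtract structure, and the log-power-ratio feature are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

SOUND_SPEED = 343.0  # speed of sound in m/s

def differential_beamformer_pair(X1, X2, freqs, mic_spacing=0.02):
    """Apply two first-order differential beamformers with opposite look directions.

    X1, X2: complex STFTs of the two channels, shape (frames, bins).
    freqs: center frequency of each bin in Hz, shape (bins,).
    mic_spacing: assumed inter-microphone distance in meters (illustrative value).
    Returns the forward- and backward-facing beamformer outputs.
    """
    # Phase factor for the inter-microphone propagation delay d / c.
    delay = np.exp(-1j * 2.0 * np.pi * freqs * mic_spacing / SOUND_SPEED)
    # Delay-and-subtract: the forward beam places a null toward the back, and vice versa.
    y_front = X1 - delay * X2
    y_back = X2 - delay * X1
    return y_front, y_back

def beamformed_feature(y_front, y_back, eps=1e-8):
    """Illustrative feature: log-power ratio of the two opposite beams.

    Large values indicate time-frequency bins dominated by the front (target) direction.
    """
    return np.log(np.abs(y_front) ** 2 + eps) - np.log(np.abs(y_back) ** 2 + eps)
```

Such a feature could then be stacked with per-channel spectral features as input to a learning-based separation network; the exact feature definition used in the paper may differ.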
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets providing an auditory perception; Electric tinnitus maskers providing an auditory perception
- H04R25/40—Arrangements for obtaining a desired directivity characteristic
- H04R25/407—Circuits for combining signals of a plurality of transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
Similar Documents
| Publication | Title |
|---|---|
| Wang et al. | Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR |
| Taherian et al. | Robust speaker recognition based on single-channel and multi-channel speech enhancement |
| Zhang et al. | Deep learning based binaural speech separation in reverberant environments |
| Chazan et al. | Multi-microphone speaker separation based on deep DOA estimation |
| Kinoshita et al. | A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research |
| Han et al. | Learning spectral mapping for speech dereverberation and denoising |
| CN114078481B (en) | Voice enhancement method and device based on two-channel neural network time-frequency masking and hearing aid equipment |
| CN110970053A (en) | Multichannel speaker-independent voice separation method based on deep clustering |
| Wang et al. | On spatial features for supervised speech separation and its application to beamforming and robust ASR |
| Xiao et al. | The NTU-ADSC systems for reverberation challenge 2014 |
| Richard et al. | Audio signal processing in the 21st century: The important outcomes of the past 25 years |
| Nakatani et al. | Dominance based integration of spatial and spectral features for speech enhancement |
| Dadvar et al. | Robust binaural speech separation in adverse conditions based on deep neural network with modified spatial features and training target |
| Liu et al. | Inplace gated convolutional recurrent neural network for dual-channel speech enhancement |
| Tu et al. | An information fusion framework with multi-channel feature concatenation and multi-perspective system combination for the deep-learning-based robust recognition of microphone array speech |
| Priyanka et al. | Multi-channel speech enhancement using early and late fusion convolutional neural networks |
| Kovalyov et al. | Dsenet: Directional signal extraction network for hearing improvement on edge devices |
| Hsu et al. | Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features |
| Tu et al. | LSTM-based iterative mask estimation and post-processing for multi-channel speech enhancement |
| WO2020064089A1 (en) | Determining a room response of a desired source in a reverberant environment |
| Fan et al. | A regression approach to binaural speech segregation via deep neural network |
| Xiang et al. | Distributed microphones speech separation by learning spatial information with recurrent neural network |
| Li et al. | Beamformed feature for learning-based dual-channel speech separation |
| Wen et al. | Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario |
| Aroudi et al. | DBNET: DOA-driven beamforming network for end-to-end farfield sound source separation |