Taherian et al., 2019 - Google Patents
Deep learning based multi-channel speaker recognition in noisy and reverberant environments
- Document ID
- 5892504555854448721
- Author
- Taherian H
- Wang Z
- Wang D
- Publication year
- 2019
- Publication venue
- Interspeech
Snippet
Despite successful applications of multi-channel signal processing in robust automatic speech recognition (ASR), relatively little research has been conducted on the effectiveness of such techniques in the robust speaker recognition domain. This paper introduces time …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets providing an auditory perception; Electric tinnitus maskers providing an auditory perception
- H04R25/40—Arrangements for obtaining a desired directivity characteristic
- H04R25/407—Circuits for combining signals of a plurality of transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/301—Automatic calibration of stereophonic sound system, e.g. with test microphone
Similar Documents
| Publication | Title |
|---|---|
| Taherian et al. | Deep learning based multi-channel speaker recognition in noisy and reverberant environments |
| Taherian et al. | Robust speaker recognition based on single-channel and multi-channel speech enhancement |
| Tan et al. | Neural spectrospatial filtering |
| Wang et al. | Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR |
| Yoshioka et al. | Multi-microphone neural speech separation for far-field multi-talker speech recognition |
| Kinoshita et al. | A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research |
| Wang et al. | All-neural multi-channel speech enhancement |
| Wang et al. | Sequential multi-frame neural beamforming for speech separation and enhancement |
| Ren et al. | A causal U-Net based neural beamforming network for real-time multi-channel speech enhancement |
| US20110096915A1 (en) | Audio spatialization for conference calls with multiple and moving talkers |
| CN114078481B (en) | Voice enhancement method and device based on two-channel neural network time-frequency masking and hearing aid equipment |
| US20100217590A1 (en) | Speaker localization system and method |
| Gu et al. | ReZero: Region-customizable sound extraction |
| Xiao et al. | The NTU-ADSC systems for reverberation challenge 2014 |
| Araki et al. | Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition |
| Koldovský et al. | Semi-blind noise extraction using partially known position of the target source |
| Habets | Speech dereverberation using statistical reverberation models |
| Zhang et al. | A deep learning approach to multi-channel and multi-microphone acoustic echo cancellation |
| Kim | Hearing aid speech enhancement using phase difference-controlled dual-microphone generalized sidelobe canceller |
| Koldovský et al. | Noise reduction in dual-microphone mobile phones using a bank of pre-measured target-cancellation filters |
| Kovalyov et al. | DSENet: Directional signal extraction network for hearing improvement on edge devices |
| Sivasankaran et al. | Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition |
| Song et al. | An integrated multi-channel approach for joint noise reduction and dereverberation |
| Bohlender et al. | Neural networks using full-band and subband spatial features for mask based source separation |
| Masuyama et al. | Exploring the integration of speech separation and recognition with self-supervised learning representation |