Taherian et al., 2019 - Google Patents

Deep learning based multi-channel speaker recognition in noisy and reverberant environments

Document ID: 5892504555854448721
Authors: Taherian H; Wang Z; Wang D
Publication year: 2019
Publication venue: Interspeech

Snippet

Despite successful applications of multi-channel signal processing in robust automatic speech recognition (ASR), relatively little research has been conducted on the effectiveness of such techniques in the robust speaker recognition domain. This paper introduces time …
Full text available at par.nsf.gov (PDF).

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 - Noise filtering
    • G10L21/0216 - Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 - Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 - Microphone arrays; Beamforming
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 - Voice signal separating
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification
    • G10L17/04 - Training, enrolment or model building
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04R - LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 - Circuits for transducers, loudspeakers or microphones
    • H04R3/005 - Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/08 - Speech classification or search
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 - Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04R - LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00 - Deaf-aid sets providing an auditory perception; Electric tinnitus maskers providing an auditory perception
    • H04R25/40 - Arrangements for obtaining a desired directivity characteristic
    • H04R25/407 - Circuits for combining signals of a plurality of transducers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04S - STEREOPHONIC SYSTEMS
    • H04S7/00 - Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 - Control circuits for electronic adaptation of the sound field
    • H04S7/301 - Automatic calibration of stereophonic sound system, e.g. with test microphone

Similar Documents

Publication / Title
Taherian et al. Deep learning based multi-channel speaker recognition in noisy and reverberant environments
Taherian et al. Robust speaker recognition based on single-channel and multi-channel speech enhancement
Tan et al. Neural spectrospatial filtering
Wang et al. Complex spectral mapping for single-and multi-channel speech enhancement and robust ASR
Yoshioka et al. Multi-microphone neural speech separation for far-field multi-talker speech recognition
Kinoshita et al. A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
Wang et al. All-Neural Multi-Channel Speech Enhancement.
Wang et al. Sequential multi-frame neural beamforming for speech separation and enhancement
Ren et al. A Causal U-Net Based Neural Beamforming Network for Real-Time Multi-Channel Speech Enhancement.
US20110096915A1 (en) Audio spatialization for conference calls with multiple and moving talkers
CN114078481B (en) Voice enhancement method and device based on two-channel neural network time-frequency masking and hearing aid equipment
US20100217590A1 (en) Speaker localization system and method
Gu et al. Rezero: Region-customizable sound extraction
Xiao et al. The NTU-ADSC systems for reverberation challenge 2014
Araki et al. Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition
Koldovský et al. Semi-blind noise extraction using partially known position of the target source
Habets Speech dereverberation using statistical reverberation models
Zhang et al. A Deep Learning Approach to Multi-Channel and Multi-Microphone Acoustic Echo Cancellation.
Kim Hearing aid speech enhancement using phase difference-controlled dual-microphone generalized sidelobe canceller
Koldovský et al. Noise reduction in dual-microphone mobile phones using a bank of pre-measured target-cancellation filters
Kovalyov et al. Dsenet: Directional signal extraction network for hearing improvement on edge devices
Sivasankaran et al. Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
Song et al. An integrated multi-channel approach for joint noise reduction and dereverberation
Bohlender et al. Neural networks using full-band and subband spatial features for mask based source separation
Masuyama et al. Exploring the integration of speech separation and recognition with self-supervised learning representation