Richard et al., 2023 - Google Patents

Audio signal processing in the 21st century: The important outcomes of the past 25 years

Document ID: 15133692059261478305
Author: Richard G; Smaragdis P; Gannot S; Naylor P; Makino S; Kellermann W; Sugiyama A
Publication year: 2023
Publication venue: IEEE Signal Processing Magazine

Snippet

Audio signal processing has passed many landmarks in its development as a research topic. Many are well known, such as the development of the phonograph in the second half of the 19th century and technology associated with digital telephony that burgeoned in the …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field

Similar Documents

Publication	Publication Date	Title
Richard et al.	2023	Audio signal processing in the 21st century: The important outcomes of the past 25 years
Haeb-Umbach et al.	2019	Speech processing for digital home assistants: Combining signal processing with deep-learning techniques
EP3707716B1 (en)	2021-12-01	Multi-channel speech separation
Zhang et al.	2017	Deep learning based binaural speech separation in reverberant environments
Wang et al.	2021	Sequential multi-frame neural beamforming for speech separation and enhancement
Chatterjee et al.	2022	ClearBuds: wireless binaural earbuds for learning-based speech enhancement
EP3363017A1 (en)	2018-08-22	Distributed audio capture and mixing
Wang et al.	2018	On spatial features for supervised speech separation and its application to beamforming and robust ASR
Guizzo et al.	2021	L3DAS21 challenge: Machine learning for 3D audio signal processing
Tesch et al.	2023	Multi-channel speech separation using spatially selective deep non-linear filters
Gogate et al.	2020	Deep neural network driven binaural audio visual speech separation
Yu et al.	2021	Audio-visual multi-channel integration and recognition of overlapped speech
He	2017	Spatial audio reproduction with primary ambient extraction
Ren et al.	2021	A neural beamforming network for b-format 3d speech enhancement and recognition
Dadvar et al.	2019	Robust binaural speech separation in adverse conditions based on deep neural network with modified spatial features and training target
Sun et al.	2019	A speaker-dependent approach to separation of far-field multi-talker microphone array speech for front-end processing in the CHiME-5 challenge
Kovalyov et al.	2023	Dsenet: Directional signal extraction network for hearing improvement on edge devices
Corey	2019	Microphone array processing for augmented listening
Hsu et al.	2022	Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features
Pandey et al.	2024	All neural low-latency directional speech extraction
Okuno et al.	2011	Robot audition: Missing feature theory approach and active audition
Kindt et al.	2022	Improved separation of closely-spaced speakers by exploiting auxiliary direction of arrival information within a U-Net architecture
Spille et al.	2013	Using binaural processing for automatic speech recognition in multi-talker scenes
Li et al.	2020	Beamformed feature for learning-based dual-channel speech separation
Mandel	2010	Binaural model-based source separation and localization