Using Taigi dramas with Mandarin Chinese subtitles to improve Taigi speech recognition
Chen et al., 2020
 - Document ID
 - 7052415600376253625
 - Author
 - Chen P
 - Wu C
 - Lee H
 - Tsao S
 - Ko M
 - Wang H
 - Publication year
 - 2020
 - Publication venue
 - 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)
 
Snippet
An obvious problem with automatic speech recognition (ASR) for Taigi is that the amount of training data is far from enough to build a practical ASR system. Collecting speech data with reliable transcripts for training the acoustic model (AM) is feasible but expensive. Moreover …
 
Classifications
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/08—Speech classification or search
 - G10L15/18—Speech classification or search using natural language modelling
 - G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
 - G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/08—Speech classification or search
 - G10L15/18—Speech classification or search using natural language modelling
 - G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
 - G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
 - G10L15/197—Probabilistic grammars, e.g. word n-grams
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
 - G10L15/065—Adaptation
 - G10L15/07—Adaptation to the speaker
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/08—Speech classification or search
 - G10L2015/088—Word spotting
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
 - G10L15/063—Training
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/08—Speech classification or search
 - G10L2015/085—Methods for reducing search complexity, pruning
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/28—Constructional details of speech recognition systems
 - G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/08—Speech classification or search
 - G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
 - G10L15/142—Hidden Markov Models [HMMs]
 - G10L15/144—Training of HMMs
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/26—Speech to text systems
 - G10L15/265—Speech recognisers specially adapted for particular applications
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
 - G10L2015/226—Taking into account non-speech characteristics
 - G10L2015/228—Taking into account non-speech characteristics of application context
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/005—Language recognition
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L15/00—Speech recognition
 - G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L13/00—Speech synthesis; Text to speech systems
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L17/00—Speaker identification or verification
 
- G—PHYSICS
 - G10—MUSICAL INSTRUMENTS; ACOUSTICS
 - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 - G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
 
 
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US7590533B2 (en) | | New-word pronunciation learning using a pronunciation graph |
| Li et al. | | Code-switch language model with inversion constraints for mixed language speech recognition |
| Arisoy et al. | | Turkish broadcast news transcription and retrieval |
| Stolcke et al. | | Recent innovations in speech-to-text transcription at SRI-ICSI-UW |
| Lyu et al. | | SEAME: a Mandarin-English code-switching speech corpus in South-East Asia |
| Kumar et al. | | A large-vocabulary continuous speech recognition system for Hindi |
| US7412387B2 (en) | | Automatic improvement of spoken language |
| JP2002287787A (en) | | Disambiguation language model |
| Biadsy et al. | | Google's cross-dialect Arabic voice search |
| Nouza et al. | | ASR for South Slavic Languages Developed in Almost Automated Way |
| US8170865B2 (en) | | Speech recognition device and method thereof |
| Cucu et al. | | SMT-based ASR domain adaptation methods for under-resourced languages: Application to Romanian |
| Avram et al. | | Towards a Romanian end-to-end automatic speech recognition based on DeepSpeech2 |
| Messaoudi et al. | | Arabic broadcast news transcription using a one million word vocalized vocabulary |
| Ayan et al. | | “Can you give me another word for hyperbaric?”: Improving speech translation using targeted clarification questions |
| Masmoudi et al. | | Phonetic tool for the Tunisian Arabic |
| Reddy et al. | | Integration of statistical models for dictation of document translations in a machine-aided human translation task |
| Goel et al. | | Approaches to automatic lexicon learning with limited training examples |
| Chen et al. | | Using Taigi dramas with Mandarin Chinese subtitles to improve Taigi speech recognition |
| Maison et al. | | Pronunciation modeling for names of foreign origin |
| Raux | | Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition |
| Wang et al. | | Multi-scale-audio indexing for translingual spoken document retrieval |
| Huang et al. | | Exploring retraining-free speech recognition for intra-sentential code-switching |
| Narayanan et al. | | Speech recognition engineering issues in speech to speech translation system design for low resource languages and domains |
| Laurent et al. | | Unsupervised acoustic model training for the Korean language |