WO2008105263A1 - Weight coefficient learning system and audio recognition system - Google Patents

Info

Publication number
WO2008105263A1
WO2008105263A1 PCT/JP2008/052721 JP2008052721W WO2008105263A1 WO 2008105263 A1 WO2008105263 A1 WO 2008105263A1 JP 2008052721 W JP2008052721 W JP 2008052721W WO 2008105263 A1 WO2008105263 A1 WO 2008105263A1
Authority
WO
WIPO (PCT)
Prior art keywords
weight coefficient
score
update
audio recognition
coefficient learning
Prior art date
Application number
PCT/JP2008/052721
Other languages
English (en)
French (fr)
Inventor
Tadashi Emori
Yoshifumi Onishi
Original Assignee
NEC Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation
Priority to EP08711545A (EP2133868A4, en)
Priority to US12/528,864 (US8494847B2, en)
Priority to JP2009501184A (JP5294086B2, ja)
Publication of WO2008105263A1 (ja)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

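 The weight coefficient learning system comprises: a speech recognition means that recognizes training speech data and outputs the recognition result; a weight coefficient updating means that, for scores obtained from an acoustic model and a language model, updates the weight coefficients applied to those scores so that the difference between the score of the correct answer, computed from the correct-answer text of the training speech data, and the score of the recognition result becomes larger; a convergence determination means that uses the updated scores to decide whether to return to the weight coefficient updating means and update the weight coefficients again; and a weight coefficient convergence determination means that uses the updated scores to decide whether to return to the speech recognition means, redo its processing, and update the weight coefficients again with the weight coefficient updating means.
PCT/JP2008/052721 2007-02-28 2008-02-19 Weight coefficient learning system and audio recognition system WO2008105263A1 (ja)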
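
The two nested loops described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy data, the top-n re-ranking that stands in for the recognizer, the log-posterior objective, the gradient step, the tolerance, and every function name here are assumptions chosen only to show the control flow (recognize, update weights, inner convergence check, outer convergence check).

```python
import math

def total(weights, feats):
    """Weighted sum of the (acoustic, language) model scores."""
    return sum(w * f for w, f in zip(weights, feats))

def recognize(weights, utt, n_best=2):
    """Stand-in recognizer (assumption): top-n candidates under the current weights."""
    return sorted(utt["pool"], key=lambda f: total(weights, f), reverse=True)[:n_best]

def objective_and_grad(weights, data, results):
    """Log posterior of the reference against the recognition results, with its gradient.
    Raising it widens the gap between the correct score and the competing scores."""
    obj, grad = 0.0, [0.0] * len(weights)
    for utt, hyps in zip(data, results):
        # Toy simplification: hypotheses are identified by their score tuples.
        cands = [utt["ref"]] + [h for h in hyps if h != utt["ref"]]
        scores = [total(weights, c) for c in cands]
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        obj += scores[0] - log_z
        for i, f in enumerate(utt["ref"]):
            grad[i] += f
        for c, s in zip(cands, scores):
            p = math.exp(s - log_z)
            for i, f in enumerate(c):
                grad[i] -= p * f
    return obj, grad

def learn_weights(data, weights=(1.0, 1.0), lr=0.02, tol=1e-4,
                  max_inner=100, max_outer=10):
    weights = list(weights)
    for _ in range(max_outer):
        results = [recognize(weights, u) for u in data]   # speech recognition step
        start = list(weights)
        prev, _ = objective_and_grad(weights, data, results)
        for _ in range(max_inner):                        # weight coefficient update
            _, grad = objective_and_grad(weights, data, results)
            weights = [w + lr * g for w, g in zip(weights, grad)]
            obj, _ = objective_and_grad(weights, data, results)
            converged = abs(obj - prev) < tol             # inner convergence check
            prev = obj
            if converged:
                break
        # Outer check (assumed criterion): stop once re-recognition no longer moves the weights.
        if max(abs(a - b) for a, b in zip(weights, start)) < tol:
            break
    return weights

# Hypothetical utterance: the reference loses under the initial weights (1.0, 0.5).
data = [{"ref": (-10.0, -2.0),
         "pool": [(-8.0, -5.0), (-10.0, -2.0), (-14.0, -4.0)]}]
learned = learn_weights(data, weights=(1.0, 0.5))
```

In this toy setup the learned weights shift emphasis toward the language model score, so the reference hypothesis overtakes the candidate that initially won. The iteration caps keep the sketch bounded even when the tolerance is never reached.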

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP08711545A EP2133868A4 (en) 2007-02-28 2008-02-19 WEIGHT COEFFICIENT LEARNING SYSTEM AND AUDIO RECOGNITION SYSTEM
US12/528,864 US8494847B2 (en) 2007-02-28 2008-02-19 Weighting factor learning system and audio recognition system
JP2009501184A JP5294086B2 (ja) 2007-02-28 2008-02-19 Weight coefficient learning system and audio recognition system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2007-049975 2007-02-28
JP2007049975 2007-02-28

Publications (1)

Publication Number Publication Date
WO2008105263A1 (ja) 2008-09-04

Family

ID=39721099

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2008/052721 WO2008105263A1 (ja) 2007-02-28 2008-02-19 Weight coefficient learning system and audio recognition system

Country Status (4)

Country Link
US (1) US8494847B2 (ja)
EP (1) EP2133868A4 (ja)
JP (1) JP5294086B2 (ja)
WO (1) WO2008105263A1 (ja)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
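JP2011039468A (ja) * 2009-08-14 2011-02-24 Korea Electronics Telecommun Word search device using speech recognition in an electronic dictionary, and method thereof
WO2012105231A1 (ja) * 2011-02-03 2012-08-09 NEC Corporation Model adaptation device, model adaptation method, and program for model adaptation
CN106463114A (zh) * 2015-03-31 2017-02-22 Sony Corporation Information processing device, control method, and program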

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101390433B1 (ko) * 2009-03-31 2014-04-29 Huawei Technologies Co., Ltd. Signal denoising method, signal denoising apparatus, and audio decoding system
US20110313762A1 (en) * 2010-06-20 2011-12-22 International Business Machines Corporation Speech output with confidence indication
US8688454B2 (en) * 2011-07-06 2014-04-01 Sri International Method and apparatus for adapting a language model in response to error correction
US9117194B2 (en) 2011-12-06 2015-08-25 Nuance Communications, Inc. Method and apparatus for operating a frequently asked questions (FAQ)-based system
US9015097B2 (en) * 2012-12-19 2015-04-21 Nuance Communications, Inc. System and method for learning answers to frequently asked questions from a semi-structured data source
US9064001B2 (en) 2013-03-15 2015-06-23 Nuance Communications, Inc. Method and apparatus for a frequently-asked questions portal workflow
US10395555B2 (en) * 2015-03-30 2019-08-27 Toyota Motor Engineering & Manufacturing North America, Inc. System and method for providing optimal braille output based on spoken and sign language
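CN111583906B (zh) * 2019-02-18 2023-08-15 Research Institute of China Mobile Communication Co., Ltd. Role recognition method, device, and terminal for voice conversations
CN110853669B (zh) * 2019-11-08 2023-05-16 Tencent Technology (Shenzhen) Co., Ltd. Audio recognition method, apparatus, and device
CN112185363B (zh) * 2020-10-21 2024-02-13 Beijing Yuanli Weilai Science and Technology Co., Ltd. Audio processing method and apparatus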

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07152394A (ja) * 1993-07-22 1995-06-16 At & T Corp Minimum error rate training of combined string models
JPH0981182A (ja) * 1995-09-11 1997-03-28 Atr Onsei Honyaku Tsushin Kenkyusho:Kk Hidden Markov model learning device and speech recognition device
JPH1185186A (ja) * 1997-09-08 1999-03-30 Atr Onsei Honyaku Tsushin Kenkyusho:Kk Speaker-independent acoustic model generation device and speech recognition device
JP2000352993A (ja) * 1999-06-14 2000-12-19 Oki Electric Ind Co Ltd Speech recognition system and hidden Markov model learning method
JP2002278578A (ja) * 2001-02-13 2002-09-27 Koninkl Philips Electronics Nv Speech recognition system, training device, and method for iteratively calculating free parameters of a maximum-entropy speech model
JP2004333738A (ja) * 2003-05-06 2004-11-25 Nec Corp Speech recognition device and method using video information
JP2007017548A (ja) * 2005-07-05 2007-01-25 Advanced Telecommunication Research Institute International Device for verifying speech recognition results, and computer program
JP2007049975A (ja) 2005-08-15 2007-03-01 Mikio Kuzuu Method for producing jam

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5710866A (en) * 1995-05-26 1998-01-20 Microsoft Corporation System and method for speech recognition using dynamically adjusted confidence measure
JPH09258786A (ja) 1996-03-21 1997-10-03 Fuji Xerox Co Ltd Speech recognition device having an adjustment function
JPH09274498A (ja) 1996-04-04 1997-10-21 Fuji Xerox Co Ltd Speech recognition device
US5819220A (en) * 1996-09-30 1998-10-06 Hewlett-Packard Company Web triggered word set boosting for speech interfaces to the world wide web
US6490555B1 (en) * 1997-03-14 2002-12-03 Scansoft, Inc. Discriminatively trained mixture models in continuous speech recognition
US6865528B1 (en) * 2000-06-01 2005-03-08 Microsoft Corporation Use of a unified language model
US6671669B1 (en) * 2000-07-18 2003-12-30 Qualcomm Incorporated combined engine system and method for voice recognition
US6754625B2 (en) * 2000-12-26 2004-06-22 International Business Machines Corporation Augmentation of alternate word lists by acoustic confusability criterion
US7395205B2 (en) * 2001-02-13 2008-07-01 International Business Machines Corporation Dynamic language model mixtures with history-based buckets
US7627473B2 (en) 2004-10-15 2009-12-01 Microsoft Corporation Hidden conditional random field models for phonetic classification and speech recognition
US8898052B2 (en) * 2006-05-22 2014-11-25 Facebook, Inc. Systems and methods for training statistical speech translation systems from speech utilizing a universal speech recognizer
US7716049B2 (en) * 2006-06-30 2010-05-11 Nokia Corporation Method, apparatus and computer program product for providing adaptive language model scaling
US20080154600A1 (en) * 2006-12-21 2008-06-26 Nokia Corporation System, Method, Apparatus and Computer Program Product for Providing Dynamic Vocabulary Prediction for Speech Recognition

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
F.J. OCH: "Discriminative Training and Maximum Entropy Models for Statistical Machine Translation", PROC. ACL, July 2002 (2002-07-01), pages 295 - 302, XP002524733
KITA: "Language model and calculation 4: Probabilistic language model", 1999, UNIVERSITY OF TOKYO PRESS
LAFFERTY: "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data", PROC. OF ICML, 2001, pages 282 - 289
S. YOUNG: "The HTK Book for HTK", April 2005, CAMBRIDGE UNIVERSITY ENGINEERING DEPARTMENT, pages: 1 - 345
See also references of EP2133868A4

Also Published As

Publication number Publication date
JPWO2008105263A1 (ja) 2010-06-03
EP2133868A1 (en) 2009-12-16
US20100094629A1 (en) 2010-04-15
EP2133868A4 (en) 2013-01-16
US8494847B2 (en) 2013-07-23
JP5294086B2 (ja) 2013-09-18

Similar Documents

Publication Publication Date Title
WO2008105263A1 (ja) Weight coefficient learning system and audio recognition system
WO2015009586A3 (en) Performing an operation relative to tabular data based upon voice input
MX2017001121A (es) Acoustic and domain based speech recognition for vehicles
WO2013134641A3 (en) Recognizing speech in multiple languages
EP4038519A4 (en) FEDERATED LEARNING USING HETEROGENEOUS MODEL TYPES AND ARCHITECTURES
PH12014500482A1 (en) Systems and methods for language learning
EP4214640A4 (en) METHODS AND SYSTEMS FOR UPDATING MACHINE LEARNING MODELS
WO2007120418A3 (en) Electronic multilingual numeric and language learning tool
WO2007103520A3 (en) Codebook-less speech conversion method and system
WO2012169737A3 (en) Display apparatus and method for executing link and method for recognizing voice thereof
EP3353766A4 (en) METHODS FOR AUTOMATED GENERATION OF VOICE SAMPLE ASSET PRODUCTION NOTES FOR USERS OF DISTRIBUTED LANGUAGE LEARNING SYSTEM, AUTOMATED RECOGNITION AND QUANTIFICATION OF ACCENT AND ENHANCED SPEECH RECOGNITION
WO2012135229A3 (en) Conversational dialog learning and correction
WO2012064408A3 (en) Method for tone/intonation recognition using auditory attention cues
WO2016033291A3 (en) Virtual assistant development system
WO2011084998A3 (en) Word-level correction of speech input
WO2012036424A3 (en) Method and apparatus for performing microphone beamforming
WO2010086447A3 (en) A method and system for developing language and speech
WO2012039938A3 (en) Full-sequence training of deep structures for speech recognition
WO2008015571A3 (en) Simulation-assisted search
WO2009016631A3 (en) Automatic context sensitive language correction and enhancement using an internet corpus
WO2014066106A3 (en) Techniques for input method editor language models using spatial input models
EP2026327A4 (en) LANGUAGE MODEL LEARNING, LANGUAGE MODEL LEARNING AND LANGUAGE MODEL LEARNING PROGRAM
WO2009025356A1 (ja) Speech recognition device and speech recognition method
WO2012134877A3 (en) Computer-implemented systems and methods evaluating prosodic features of speech
ATE442641T1 (de) Speech recognition method and system adapted to the characteristics of non-native speakers

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 08711545; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2009501184; Country of ref document: JP; Kind code of ref document: A
WWE WIPO information: entry into national phase. Ref document number: 12528864; Country of ref document: US
NENP Non-entry into the national phase. Ref country code: DE
WWE WIPO information: entry into national phase. Ref document number: 2008711545; Country of ref document: EP
