WO1998002862A1 - Apparatus for interactive language training - Google Patents

Apparatus for interactive language training

Info

Publication number
WO1998002862A1
PCT/IL1997/000143
Authority
WO
WIPO (PCT)
Prior art keywords
expected
audio
user
responses
speech
Prior art date
1996-07-11
Application number
PCT/IL1997/000143
Other languages
English (en)
Inventor
Zeev Shpiro
Original Assignee
Digispeech (Israel) Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1996-07-11
Filing date
1997-05-04
Publication date
1998-01-22
Application filed by Digispeech (Israel) Ltd. filed Critical Digispeech (Israel) Ltd.
Priority to IL12355697A (IL123556A0)
Priority to AU24032/97A (AU2403297A)
Priority to EP97919627A (EP0852782A4)
Priority to JP10505803A (JPH11513144A)
Priority to BR9702341-8A (BR9702341A)
Publication of WO1998002862A1

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 5/00 Electrically-operated educational appliances
    • G09B 5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B 5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G09B 7/00 Electrically-operated teaching apparatus or devices working with questions and answers
    • G09B 19/00 Teaching not covered by other main groups of this subclass
    • G09B 19/04 Speaking
    • G09B 19/06 Foreign languages
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 Training
    • G10L 2015/0631 Creating reference templates; Clustering
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/225 Feedback of the input speech

Definitions

  • the present invention relates to speech recognition systems having application, inter alia, in educational systems, and more particularly to computerized systems providing phoneme-based speech recognition and for teaching language.
  • a product having substantially the same feature is commercially available from The Learning Company under the trade name "Learn to Speak English".
  • IBM VoiceType Simply Speaking, for students, home users and small businesses.
  • IBM VoiceType, for professional and business use.
  • the ASR-1500, marketed by Lernout & Hauspie Speech Products N.V. of Ieper, Belgium.
  • the present invention seeks to provide a further improved computerized system for teaching language which provides an indication to the user of the type of pronunciation error or errors that the user is making.
  • apparatus for interactive language training comprising: a trigger generator for eliciting expected audio responses by a user; an expected audio response reference library containing a multiplicity of reference expected responses, the multiplicity of reference expected responses including a first plurality of reference expected responses having acceptable pronunciation and for each of the first plurality of reference expected responses having acceptable pronunciation, a second plurality of reference expected responses each having different pronunciation errors; an audio response scorer which indicates the relationship between the expected audio response provided by the user and the reference expected responses; and a user feedback interface which indicates to the user the pronunciation errors in the expected audio responses provided by the user.
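As an illustration only, the bullet above can be mapped onto a small data model: each phrase is stored with one acceptably pronounced reference template plus a set of templates labeled with specific pronunciation errors. The Python below is a minimal sketch under that reading; all names and types are hypothetical and are not taken from the patent's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ReferenceResponse:
    """One reference expected response (a template, not raw audio)."""
    phrase: str                     # e.g. "tomato"
    features: list[float]           # extracted speech parameters
    error_label: str | None = None  # None means acceptable pronunciation

@dataclass
class ExpectedResponseLibrary:
    """For each phrase: one acceptable template plus error variants."""
    entries: dict[str, list[ReferenceResponse]] = field(default_factory=dict)

    def add(self, response: ReferenceResponse) -> None:
        self.entries.setdefault(response.phrase, []).append(response)

    def variants(self, phrase: str) -> list[ReferenceResponse]:
        """All reference responses (correct and erroneous) for a phrase."""
        return self.entries.get(phrase, [])
```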
  • the user feedback interface also provides instruction to the user on how to overcome the pronunciation errors.
  • the user feedback interface indicates to the user each pronunciation error immediately following each expected audio response.
  • the feedback interface provides audio and visual indications of the pronunciation errors.
  • the audio specimen generator is operative such that the expected audio response is a repetition of the audio specimen.
  • the audio specimen generator is operative such that the expected audio response is other than a repetition of the audio specimen.
  • the audio specimen generator is operative such that the expected audio response is an audio specimen which may be chosen from among more than one possible expected audio response.
  • the trigger generator comprises an audio specimen generator for playing audio specimens to a user.
  • the trigger generator comprises a visual trigger generator for providing a visual trigger output to a user.
  • the expected audio response library comprises an expected audio response reference database.
  • the expected audio response reference database comprises a multiplicity of templates and is speaker independent.
  • a method for interactive language training comprising: eliciting expected audio responses by a user; providing an expected audio response reference library containing a multiplicity of reference expected responses, the multiplicity of reference expected responses including a first plurality of reference expected responses having acceptable pronunciation and, for each of the first plurality of reference expected responses having acceptable pronunciation, a second plurality of reference expected responses each having different pronunciation errors; indicating the relationship between the expected audio response provided by the user and the reference expected responses; and indicating to the user the pronunciation errors in the expected audio responses provided by the user.
  • the method also includes providing instruction to the user on how to overcome the pronunciation errors. Still further in accordance with a preferred embodiment of the present invention, the method also includes indicating to the user each pronunciation error immediately following each expected audio response.
  • the method also includes providing audio and visual indications of said pronunciation errors to said user.
  • in the method, the expected audio response is a repetition of said audio specimen.
  • in the method, the expected audio response is other than a repetition of said audio specimen.
  • the expected audio response is an audio specimen which may be chosen from among more than one possible expected audio response.
  • the step of eliciting audio responses includes playing audio specimens to a user.
  • the step of eliciting comprises providing a visual trigger output to a user.
  • a speech recognition apparatus including at least one data base containing speech elements of at least first and second languages, a receiver, receiving spoken speech to be recognized, and a comparator, comparing features of the spoken speech with a combination of features of the speech elements of at least first and second languages. It is appreciated that in certain cases a combination of features of the speech elements may include the features of a single speech element. A feature of the speech element may include the speech element signal.
  • a language teaching system including a trigger generator for eliciting expected audio responses by a user, a speech recognizer receiving the expected audio responses spoken by a user, the speech recognizer including at least one data base containing speech elements of at least first and second languages, a receiver, receiving spoken speech to be recognized, and a comparator, comparing features of said spoken speech with a combination of features of said speech elements of at least first and second languages, and a user feedback interface which indicates to the user errors in the expected audio responses spoken by the user.
  • a combination of the features of the speech elements may include the features of a single speech element.
  • a feature of the speech element may include the speech element signal.
  • the speech elements include at least one of phonemes, diphones and transitions between phonemes.
  • the language teaching system also includes a template generator operative to generate phrase templates.
  • the language teaching system also includes a feature extractor operative to extract features of spoken speech received by the receiver.
  • a method for speech recognition including providing at least one data base containing speech elements of at least first and second languages, receiving spoken speech to be recognized, and comparing features of the spoken speech with a combination of features of the speech elements of at least first and second languages. It is appreciated that in certain cases a combination of the features of the speech elements may include the features of a single speech element. A feature of the speech element may include the speech element signal.
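A minimal sketch may help picture this method: pool the speech-element templates of two languages and return whichever element lies closest to the input features. The toy feature vectors, the Euclidean distance and all names below are assumptions for illustration; the patent leaves the comparator unspecified.

```python
import math

# Hypothetical per-phoneme feature templates for two languages; a real
# system would use much richer features than these 2-D toy vectors.
english_phonemes = {"L": [1.0, 0.2], "R": [0.4, 0.9]}
japanese_phonemes = {"R(ja)": [0.7, 0.6]}

def best_speech_element(features, *databases):
    """Compare input features against the pooled speech elements of
    several language databases and return the closest element."""
    pool = {name: vec for db in databases for name, vec in db.items()}
    return min(pool, key=lambda name: math.dist(features, pool[name]))

# An utterance whose features sit nearest the Japanese "R" is
# recognized as that element, not forced onto an English phoneme.
print(best_speech_element([0.65, 0.55], english_phonemes, japanese_phonemes))
```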
  • the spoken speech is spoken in a first language by a user who is a native speaker of a second language and wherein the at least one data base contains speech elements of both the first and the second languages.
  • the at least first and second languages include different national languages.
  • the at least first and second languages include different dialects of a single national language.
  • Fig. 1 is a generalized pictorial illustration of an interactive language teaching system constructed and operative in accordance with a preferred embodiment of the present invention;
  • Fig. 2 is a generalized functional block diagram of the operation of the system of Fig. 1 during language teaching;
  • Fig. 3 is a generalized functional block diagram of the operation of the system of Fig. 1 during audio reference library generation in accordance with one embodiment of the present invention;
  • Fig. 4 is a generalized functional block diagram of the operation of the system of Fig. 1 during audio reference library generation in accordance with another embodiment of the present invention;
  • Figs. 5A and 5B together constitute a generalized flow chart illustrating operation of the system during language teaching in accordance with the generalized functional block diagram of Fig. 2;
  • Figs. 6A, 6B and 6C together constitute a generalized flow chart illustrating one method of operation of the system during audio reference library generation for language teaching in accordance with the generalized functional block diagram of Fig. 3;
  • Fig. 7 is a generalized flow chart illustrating operation of the system during audio reference library generation for language teaching in accordance with the generalized functional block diagram of Fig. 4;
  • Fig. 8 is a simplified illustration of the creation of a phonetic template database of the type employed in Fig. 4;
  • Fig. 9 is a simplified illustration of a labeled speech waveform;
  • Fig. 10 illustrates the creation of a multiple language phonetic database in accordance with a preferred embodiment of the present invention;
  • Fig. 11 is an illustration of speech recognition employing phonemes; and
  • Fig. 12 is an illustration of speech recognition employing phonemes of various languages.
  • reference is now made to Fig. 1, which is a generalized pictorial illustration of an interactive language teaching system constructed and operative in accordance with a preferred embodiment of the present invention, and to Fig. 2, which is a generalized functional block diagram of the operation of the system of Fig. 1 during language teaching.
  • the system of the present invention differs from that of U.S. Patent 5,487,671 in that it operates with reference expected responses each having different pronunciation errors and includes an audio response scorer which indicates the relationship between the expected audio response provided by the user and the reference expected responses having pronunciation errors.
  • the system of Figs. 1 and 2 incorporates speech recognition functionality in accordance with a preferred embodiment of the present invention.
  • the system of Figs. 1 and 2 is preferably based on a conventional personal computer 10, such as an IBM PC or compatible, using an Intel 80486 CPU running at 33 MHz or higher, with at least 8 MB of memory and running a DOS rev. 6.0 or above operating system.
  • the personal computer 10 is preferably equipped with an auxiliary audio module 12.
  • a suitable audio module 12 is the Digispeech Plus audio adapter (DS311) manufactured by Digispeech, Inc. and distributed in the USA by DSP SOLUTIONS, Inc., Mountain View, CA.
  • a headset 14 is preferably associated with audio module 12.
  • the personal computer 10 and audio module 12 are supplied with suitable software so as to provide the following functionalities: a trigger generator for eliciting expected audio responses by the user.
  • the trigger generator preferably comprises an audio specimen generator for playing audio specimens to a user, but may additionally or alternatively comprise a visual trigger generator for providing a visual trigger output to a user; an expected audio response reference library containing a multiplicity of reference expected responses, the multiplicity of reference expected responses including a first plurality of reference expected responses having acceptable pronunciation and, for each of the first plurality of reference expected responses having acceptable pronunciation, a second plurality of reference expected responses each having different pronunciation errors.
  • the second plurality of reference expected responses may include responses constructed from phonemes of various languages and may have application in speech recognition generally; an audio response scorer which indicates the relationship between the expected audio response provided by the user and the reference expected responses; and a user feedback interface which indicates to the user the pronunciation errors, if any, in the expected audio responses provided by the user.
  • the user feedback interface preferably provides audio feedback via the audio module 12 and headset 14. Additionally, as seen in Figs. 1 and 2, a display 16 is preferably provided to indicate pronunciation errors to the user in a visible manner, as illustrated, for example, in Fig. 1.
  • A. Interim Audio Specimens Database: This database is generated by recording a plurality of native speakers, including a distribution of various geographical origins, various ages and both genders.
  • the plurality of native speakers may include speakers speaking various different languages.
  • Each speaker pronounces a plurality of predetermined phrases. For each of the plurality of predetermined phrases, each speaker pronounces the phrase correctly and also repeats the phrase incorrectly a few times, each time with a different one of a plurality of predetermined pronunciation errors.
  • this database includes plural recordings of each of the above pronounced phrases for each speaker, so as to provide an enhanced statistical base.
  • B. Expected Audio Response Reference Database: This is a database containing templates rather than recorded speech.
  • various types of templates may be provided.
  • one type of template, useful in word-based speech recognition, may be derived from Database A in a manner described hereinbelow.
  • another type of template, useful in phoneme-based speech recognition, comprises various combinations of features of speech elements which together represent a phrase.
  • Templates useful in word-based speech recognition may be derived from the Interim Audio Specimens Database A, by extracting the speech parameters of each of the pronounced phrases and combining them statistically so as to represent the pronunciation of the plurality of native speakers referred to hereinabove.
  • each template represents a statistical combination of the pronunciations of a group of native speakers.
  • a single template may be produced to cover all of the native speakers whose pronunciation is recorded in the Interim Audio Specimens Database A, or plural templates may be used, when a single template does not accurately represent the entire range of native speakers.
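One plausible reading of this statistical combination, offered purely as a sketch and not as the patent's method, is a per-feature mean and spread computed over each speaker group:

```python
import statistics

def build_template(speaker_params: list[list[float]]) -> dict:
    """Statistically combine the pronunciations of a group of native
    speakers into one template: per-feature mean and spread."""
    columns = list(zip(*speaker_params))
    return {
        "mean": [statistics.fmean(col) for col in columns],
        "stdev": [statistics.pstdev(col) for col in columns],
    }

# Separate templates when one cannot represent all speakers,
# e.g. one for male and one for female voices.
male_template = build_template([[1.1, 0.8], [1.0, 0.9]])
female_template = build_template([[1.6, 1.2], [1.5, 1.3]])
```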
  • one template may represent males and the other females.
  • separate templates may each contain phonemes of a different language.
  • the Expected Audio Response Reference Database B constitutes the expected audio response reference library, referred to above. This is a speaker-independent database.
  • various types of templates may be provided.
  • one type of template, useful in word-based speech recognition, may be derived from Database A in a manner described hereinabove.
  • another type of template, useful in phoneme-based speech recognition, comprises various combinations of features of speech elements which together represent a phrase.
  • C. Phonetic Database: This is a commercially available database of speech parameters of phonemes for a given language. Such databases are available, for example, from AT&T, Speech Systems Incorporated of Boulder, Colorado, U.S.A., and Lernout & Hauspie Speech Products N.V. of Ieper, Belgium. Multiple phonetic databases, each containing speech parameters of phonemes of a different language, may be provided and are collectively referred to as a Phonetic Database.
  • E. Expected Audio Specimens Database: This is a collection of recordings of a single trained speaker pronouncing each of the plurality of phrases correctly.
  • F. Reference Audio Specimens Database: This is a collection of recordings of a single trained speaker pronouncing each of the plurality of phrases incorrectly a few times, each with a different one of a plurality of predetermined pronunciation errors.
  • Fig. 2 is a generalized functional block diagram of the operation of the system of Fig. 1 during language teaching.
  • Audio specimens stored in Expected Audio Specimen Database E are played to the user via the audio module 12 (Fig. 1) in order to elicit expected audio responses by the user.
  • a microphone 20, normally part of headset 14, is employed to record the user's audio responses, which are stored in User Followup Database D.
  • the audio specimens typically include spoken phrases.
  • the phrases may include one or more words.
  • alternatively or additionally, a visual trigger generator may be employed for providing a visual trigger output to a user for eliciting expected audio responses by the user.
  • Spoken phrase parameters are extracted from the user's audio responses and are compared with reference phrase parameters to measure the likelihood of a match between the spoken phrase parameters of the user's audio response and the reference phrase parameters of a corresponding correct or incorrect phrase stored in the Expected Audio Response Reference Database B.
  • the reference phrase parameters need not necessarily comprise words or combinations of words. Instead the reference phrase parameters may comprise various combinations of features of speech elements, particularly when phoneme based speech recognition is being carried out.
  • the result of the likelihood measurement is selection of a phrase which is closest to the user's audio response or an indication of failure to make any match.
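A hedged sketch of that likelihood measurement: score the user's features against every reference variant of the phrase and keep the best only if it clears a threshold, otherwise report a failed match. The quadratic distance, threshold value and template layout are illustrative assumptions, not the scorer the patent actually specifies.

```python
def match_response(features, templates, threshold=2.0):
    """Pick the reference phrase variant (correct or mispronounced)
    closest to the user's response, or report failure to match.

    `templates` is a hypothetical list of dicts, each holding the
    reference feature means and an error label (None if correct)."""
    best, best_score = None, float("inf")
    for t in templates:
        score = sum((f - m) ** 2 for f, m in zip(features, t["mean"]))
        if score < best_score:
            best, best_score = t, score
    return best if best_score <= threshold else None  # None: no match

result = match_response(
    [0.5, 0.4],
    [{"mean": [0.5, 0.5], "label": None},
     {"mean": [0.9, 0.1], "label": "first 'o' pronounced as 'U'"}],
)
print(result["label"] if result else "no match")  # -> None (correct match)
```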
  • An audio and preferably also visible feedback indication is provided to the user, identifying the matched phrase and indicating whether it is correct or incorrect.
  • the user response may include a word, several words, a sentence or a number of sentences, out of which only one or several phrases are matched during the teaching process. Additional teaching information as to how to overcome indicated errors is preferably also provided in an audio-visual manner. Headphones 22, preferably forming part of headset 14 (Fig. 1), and display 16 are preferably employed for this purpose.
  • Fig. 3 is a generalized functional block diagram of the operation of the system of Fig. 1 during generation of the Expected Audio Response Reference Database B in accordance with one embodiment of the present invention.
  • a microphone 30 is used to record phrases spoken by a plurality of native speakers, including a distribution of various geographical origins, various ages and both genders.
  • Each speaker pronounces a plurality of predetermined phrases. For each of the plurality of predetermined phrases, each speaker pronounces the phrase correctly and also repeats the phrase incorrectly a few times, each time with a different one of a plurality of predetermined pronunciation errors.
  • the recordings are retained in the Interim Audio Specimens database A.
  • this database includes plural recordings of each of the above pronounced phrases for each speaker, so as to provide an enhanced statistical base.
  • each phrase is recorded correctly N times by each of a plurality of M speakers. It is additionally recorded N times by each of M speakers in L different forms each containing a different pronunciation error.
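So each phrase accumulates M × N × (L + 1) recordings: N correct repetitions from each of M speakers, plus N repetitions of each of the L erroneous forms. With purely illustrative numbers:

```python
M, N, L = 20, 3, 5            # speakers, repetitions, error forms (illustrative)
recordings_per_phrase = M * N * (1 + L)
print(recordings_per_phrase)  # 360 recordings behind a single phrase
```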
  • Fig. 4 is a generalized functional block diagram of the operation of the system of Fig. 1 during audio reference library generation in accordance with another embodiment of the present invention.
  • the Expected Audio Response Reference Database B is computer generated by generating text and phonetic language files which are employed to produce phonetic language records.
  • the phonetic language record is employed together with Phonetic Database C to generate phrase templates which together constitute the Expected Audio Response Reference Database B.
  • phrase templates are typically not words or combinations of words but rather combinations of features of speech elements, such as phonemes, diphones and transitions between phonemes.
  • features of speech to be recognized are compared with these combinations in order to find a best match.
  • reference is now made to Figs. 5A and 5B, which together constitute a generalized flow chart illustrating operation of the system during language teaching in accordance with the generalized functional block diagram of Fig. 2.
  • the user's response is recorded and compared with reference expected responses contained in the Expected Audio Response Reference Database B, by the Student Response Specimen Recorder as described in US Patent No. 5,487,671, the disclosure of which is hereby incorporated by reference.
  • the mispronounced phrase is played to the user from the Reference Audio Specimens Database F.
  • a User followup database D may be employed to play back the latest or earlier user responses for indicating user progress, to be included in the system feedback, or other purposes.
  • reference is now made to Figs. 6A, 6B and 6C, which together constitute a generalized flow chart illustrating operation of the system during audio reference library generation for language teaching in accordance with the generalized functional block diagram of Fig. 3.
  • the trained speaker speaks the correct phrase and a plurality of incorrect phrases, whose pronunciation is similar to the correct phrase but for one or more errors in pronunciation, to provide reference expected responses each having different pronunciation errors.
  • Each such set of correct and incorrect phrases is recorded.
  • the Interim Audio Specimens Database A contains the various recordings. Database A is employed, as described above with reference to Fig. 3, to produce the Expected Audio Response Reference Database B (Fig. 6C) for use in word-based speech recognition.
  • Fig. 7 is a generalized flow chart illustrating operation of the system during audio reference library generation for language teaching in accordance with the generalized functional block diagram of Fig. 4.
  • a computer is employed to enter plain text and a phonetic language and to convert the text to the indicated phonetic language.
  • a phrase template is generated.
  • the phrase template is then stored in the Expected Audio Response Reference Database B.
  • this described process is carried out for each phrase template being used by the system.
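Read as a pipeline, this amounts to looking the text up in a phonetic lexicon and concatenating the corresponding per-phoneme templates from Phonetic Database C into a phrase template. The one-word lexicon and one-dimensional features below are hypothetical, kept tiny for illustration:

```python
# Hypothetical grapheme-to-phoneme lexicon and phonetic database.
lexicon = {"tomato": ["T", "OW", "M", "EY", "T", "OW"]}
phonetic_db = {"T": [0.1], "OW": [0.5], "M": [0.3], "EY": [0.7]}

def phrase_template(text: str) -> list[list[float]]:
    """Convert plain text to its phonetic form, then assemble a
    phrase template from per-phoneme feature templates."""
    template = []
    for word in text.lower().split():
        for phoneme in lexicon[word]:
            template.append(phonetic_db[phoneme])
    return template

expected_db_b = {"tomato": phrase_template("tomato")}  # stored in Database B
```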
  • the phrase templates are typically not words or combinations of words but rather combinations of features of speech elements, such as phonemes, diphones and transitions between phonemes.
  • features of speech to be recognized are compared with these combinations in order to find a best match.
  • Figs. 8 and 9 illustrate the creation of the Phonetic Database C of the type employed in Figs. 4 and 7 in accordance with a preferred embodiment of the present invention.
  • a database 50 of labeled speech can be obtained from the TIMIT Acoustic-Phonetic Continuous Speech Corpus, available from the Linguistic Data Consortium at the University of Pennsylvania (e-mail: online-service@ldc.upenn.edu).
  • a template builder 52, typically embodied in commercially available software such as HTK (Hidden Markov Model Toolkit), available from Entropic Cambridge Research Laboratories, Ltd. (e-mail: sales@entropic.com), operates on the database 50 and provides the Phonetic Database C.
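HTK trains hidden Markov models from such labeled speech; as a greatly simplified stand-in, the sketch below merely pools the labeled segments of a TIMIT-style corpus per phoneme and keeps each phoneme's mean feature vector. It conveys the role of template builder 52 without pretending to reproduce real HMM training.

```python
from collections import defaultdict
import statistics

def build_phonetic_database(labeled_segments):
    """labeled_segments: iterable of (phoneme_label, feature_vector)
    pairs taken from a labeled speech corpus such as TIMIT.
    Returns one mean feature vector per phoneme, a crude stand-in
    for the HMMs an HTK-style template builder would train."""
    pooled = defaultdict(list)
    for label, vector in labeled_segments:
        pooled[label].append(vector)
    return {
        label: [statistics.fmean(col) for col in zip(*vectors)]
        for label, vectors in pooled.items()
    }

phonetic_db_c = build_phonetic_database(
    [("T", [0.1, 0.3]), ("T", [0.2, 0.4]), ("OW", [0.8, 0.6])]
)
# roughly {"T": [0.15, 0.35], "OW": [0.8, 0.6]}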
  • the Phonetic Database C is realized by combining plural phonetic databases 54, 56, as illustrated in Fig. 10. It is a particular feature of the invention that phonetic databases 54 and 56, including phonemes of a language being learned or spoken as well as the native language of the user, may thus be combined to provide enhanced speech recognition.
  • Fig. 11 is an illustration of speech recognition employing phonemes.
  • the expected word is "tomato”.
  • a net of expected alternative pronunciations is created.
  • the speaker can pronounce the first “o” as “O”, “OW” or “U”, the "O” pronunciation being considered to be correct.
  • Fig. 11 is characterized that all of the phonemes being used for speech recognition belong to a single language.
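Such a pronunciation net is straightforward to sketch: each slot lists its expected alternatives with the correct phoneme first, so the alternative the recognizer actually matched directly names the error. The slot inventory below is a hypothetical rendering of the "tomato" example, not the patent's actual net.

```python
# Hypothetical pronunciation net for "tomato": each slot lists the
# expected alternatives, the first being the correct phoneme.
net = [["T"], ["O", "OW", "U"], ["M"], ["EY"], ["T"], ["OW"]]

def diagnose(recognized: list[str]) -> list[tuple[int, str]]:
    """Return (position, phoneme) for every slot where the matched
    alternative is not the correct (first-listed) phoneme."""
    return [
        (i, ph)
        for i, (ph, slot) in enumerate(zip(recognized, net))
        if ph != slot[0]
    ]

print(diagnose(["T", "U", "M", "EY", "T", "OW"]))  # [(1, 'U')]
```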
  • Fig. 12 is an illustration of speech recognition employing phonemes of various languages.
  • the present example is designed for recognizing English spoken by a native Japanese speaker.
  • the expected word is "Los” as in “Los Angeles”. It is seen that here, the speaker can pronounce the "L” as “L” (circled “L”), an English “R.” (circled “R”) or a Japanese “R” (boxed “R”).
  • Fig. 12 is characterized that not all of the phonemes being used for speech recognition belong to a single language.
  • some of the phonemes are English language phonemes (circled letters) and some of the phonemes are Japanese language phonemes (boxed letters).
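Tagging each alternative with its source language extends the same net to the Fig. 12 situation: matching an alternative tagged with the learner's native language identifies the cross-language substitution outright. The tags and values below are again illustrative assumptions.

```python
# Hypothetical mixed-language net for "Los": alternatives are tagged
# with their source language so feedback can name the substitution.
los_net = [
    [("L", "en"), ("R", "en"), ("R", "ja")],  # correct "L" listed first
    [("O", "en")],
    [("S", "en")],
]

matched = [("R", "ja"), ("O", "en"), ("S", "en")]
for (ph, lang), slot in zip(matched, los_net):
    if (ph, lang) != slot[0]:
        print(f"substituted {lang} '{ph}' for expected '{slot[0][0]}'")
# -> substituted ja 'R' for expected 'L'
```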

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Technology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The present invention concerns interactive language training apparatus including a trigger generator for eliciting expected audio responses by a user. Use is made of an expected audio response reference library containing a multiplicity of reference expected responses, the multiplicity of reference expected responses including a first plurality of reference expected responses having acceptable pronunciation. For each of the first plurality of reference expected responses having acceptable pronunciation there is a corresponding second plurality of reference expected responses containing different pronunciation errors. An audio response scorer indicates the relationship between the expected audio response provided by the user and the reference expected responses. A user feedback interface (12, 14, 16) indicates to the user the pronunciation errors in the expected audio responses provided by the user. The present invention also describes speech recognition apparatus including at least one database containing speech elements of at least first and second languages, a receiver receiving spoken speech to be recognized, and a comparator comparing features of the spoken speech with a combination of features of said speech elements of at least first and second languages. It is appreciated that in certain cases a combination of speech elements may include only a single speech element. The invention also describes a speech recognition method.
PCT/IL1997/000143 1996-07-11 1997-05-04 Apparatus for interactive language training WO1998002862A1 (fr)

Priority Applications (5)

Application Number Priority Date Filing Date Title
IL12355697A IL123556A0 (en) 1996-07-11 1997-05-04 Apparatus for interactive language training
AU24032/97A AU2403297A (en) 1996-07-11 1997-05-04 Apparatus for interactive language training
EP97919627A EP0852782A4 (fr) 1996-07-11 1997-05-04 Apparatus for interactive language training
JP10505803A JPH11513144A (ja) 1996-07-11 1997-05-04 Interactive language training apparatus
BR9702341-8A BR9702341A (pt) 1996-07-11 1997-05-04 Apparatus for interactive language training

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/678,229 1996-07-11
US08/678,229 US5766015A (en) 1996-07-11 1996-07-11 Apparatus for interactive language training

Publications (1)

Publication Number Publication Date
WO1998002862A1 (fr) 1998-01-22

Family

ID=24721939

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL1997/000143 WO1998002862A1 (fr) 1996-07-11 1997-05-04 Apparatus for interactive language training

Country Status (9)

Country Link
US (1) US5766015A (fr)
EP (1) EP0852782A4 (fr)
JP (1) JPH11513144A (fr)
KR (1) KR19990044575A (fr)
CN (1) CN1197525A (fr)
AU (1) AU2403297A (fr)
BR (1) BR9702341A (fr)
IL (1) IL123556A0 (fr)
WO (1) WO1998002862A1 (fr)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999040556A1 (fr) * 1998-02-09 1999-08-12 Syracuse Language Systems, Inc. Speech recognition apparatus and method for learning
EP1091336A1 (fr) * 1999-10-06 2001-04-11 Ascom AG Method and device for detecting and correcting errors in spoken language
WO2001067227A1 (fr) * 2000-03-10 2001-09-13 Seungheon Baek Apparatus and method for displaying sentences for studying foreign languages
US6358054B1 (en) 1995-05-24 2002-03-19 Syracuse Language Systems Method and apparatus for teaching prosodic features of speech
DE10010232B4 (de) * 1999-03-05 2004-08-05 Auralog Method and apparatus for speech recognition
EP0917129B1 (fr) * 1997-11-17 2004-12-15 International Business Machines Corporation Method and device for speech recognition
US6961700B2 (en) 1996-09-24 2005-11-01 Allvoice Computing Plc Method and apparatus for processing the output of a speech recognition engine
CN1311423C (zh) * 2003-08-11 2007-04-18 Sony Electronics Inc. System and method for performing speech recognition using a multilingual dictionary
EP1482469A3 (fr) * 2003-05-29 2007-04-18 Robert Bosch Gmbh System, method and device for language education through a voice portal
GB2458461A (en) * 2008-03-17 2009-09-23 Kai Yu Spoken language learning system
EP2924676A1 (fr) * 2014-03-25 2015-09-30 Oticon A/s Hearing-based adaptive learning systems

Families Citing this family (69)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6283760B1 (en) * 1994-10-21 2001-09-04 Carl Wakamoto Learning and entertainment device, method and system and storage media therefor
US6022221A (en) 1997-03-21 2000-02-08 Boon; John F. Method and system for short- to long-term memory bridge
US20040219494A1 (en) * 1997-03-21 2004-11-04 Boon John F. Authoring tool and method of use
US6017219A (en) * 1997-06-18 2000-01-25 International Business Machines Corporation System and method for interactive reading and language instruction
US5927988A (en) * 1997-12-17 1999-07-27 Jenkins; William M. Method and apparatus for training of sensory and perceptual systems in LLI subjects
US6019607A (en) * 1997-12-17 2000-02-01 Jenkins; William M. Method and apparatus for training of sensory and perceptual systems in LLI systems
US7203649B1 (en) * 1998-04-15 2007-04-10 Unisys Corporation Aphasia therapy system
US6077080A (en) * 1998-10-06 2000-06-20 Rai; Shogen Alphabet image reading method
US6468084B1 (en) * 1999-08-13 2002-10-22 Beacon Literacy, Llc System and method for literacy development
WO2001024139A1 (fr) * 1999-09-27 2001-04-05 Kojima Co., Ltd. Pronunciation evaluation system
US6302695B1 (en) * 1999-11-09 2001-10-16 Minds And Technologies, Inc. Method and apparatus for language training
JP3520022B2 (ja) 2000-01-14 2004-04-19 ATR Advanced Telecommunications Research Institute International Foreign language learning apparatus, foreign language learning method and medium
US7280964B2 (en) * 2000-04-21 2007-10-09 Lessac Technologies, Inc. Method of recognizing spoken language with recognition of language color
US6963841B2 (en) * 2000-04-21 2005-11-08 Lessac Technology, Inc. Speech training method with alternative proper pronunciation database
US6847931B2 (en) 2002-01-29 2005-01-25 Lessac Technology, Inc. Expressive parsing in computerized conversion of text to speech
US6865533B2 (en) * 2000-04-21 2005-03-08 Lessac Technology Inc. Text to speech
US6705869B2 (en) 2000-06-02 2004-03-16 Darren Schwartz Method and system for interactive communication skill training
US7203840B2 (en) * 2000-12-18 2007-04-10 Burlingtonspeech Limited Access control for interactive learning system
US7996321B2 (en) * 2000-12-18 2011-08-09 Burlington English Ltd. Method and apparatus for access control to language learning system
AU2002239627A1 (en) * 2000-12-18 2002-07-01 Digispeech Marketing Ltd. Spoken language teaching system based on language unit segmentation
AU2002231045A1 (en) * 2000-12-18 2002-07-01 Digispeech Marketing Ltd. Method of providing language instruction and a language instruction system
US6435876B1 (en) * 2001-01-02 2002-08-20 Intel Corporation Interactive learning of a foreign language
US20020115044A1 (en) * 2001-01-10 2002-08-22 Zeev Shpiro System and method for computer-assisted language instruction
US6882707B2 (en) * 2001-02-21 2005-04-19 Ultratec, Inc. Method and apparatus for training a call assistant for relay re-voicing
US7881441B2 (en) * 2005-06-29 2011-02-01 Ultratec, Inc. Device independent text captioned telephone service
US8416925B2 (en) 2005-06-29 2013-04-09 Ultratec, Inc. Device independent text captioned telephone service
US6953343B2 (en) * 2002-02-06 2005-10-11 Ordinate Corporation Automatic reading system and methods
TW556152B (en) * 2002-05-29 2003-10-01 Labs Inc L Interface of automatically labeling phonic symbols for correcting user's pronunciation, and systems and methods
US7219059B2 (en) * 2002-07-03 2007-05-15 Lucent Technologies Inc. Automatic pronunciation scoring for language learning
JP2004053652A (ja) * 2002-07-16 2004-02-19 Asahi Kasei Corp Pronunciation judgment system, system management server and program
US7752045B2 (en) * 2002-10-07 2010-07-06 Carnegie Mellon University Systems and methods for comparing speech elements
US20040176960A1 (en) * 2002-12-31 2004-09-09 Zeev Shpiro Comprehensive spoken language learning system
US7524191B2 (en) * 2003-09-02 2009-04-28 Rosetta Stone Ltd. System and method for language instruction
GB2435373B (en) * 2004-02-18 2009-04-01 Ultratec Inc Captioned telephone service
US8515024B2 (en) 2010-01-13 2013-08-20 Ultratec, Inc. Captioned telephone service
WO2005091247A1 (fr) * 2004-03-22 2005-09-29 Lava Consulting Pty Ltd Teaching method
US7903084B2 (en) * 2004-03-23 2011-03-08 Fujitsu Limited Selective engagement of motion input modes
US20050212753A1 (en) * 2004-03-23 2005-09-29 Marvit David L Motion controlled remote controller
US20050212760A1 (en) 2004-03-23 2005-09-29 Marvit David L Gesture based user interface supporting preexisting symbols
US7301526B2 (en) 2004-03-23 2007-11-27 Fujitsu Limited Dynamic adaptation of gestures for motion controlled handheld devices
US7365736B2 (en) * 2004-03-23 2008-04-29 Fujitsu Limited Customizable gesture mappings for motion controlled handheld devices
US7301529B2 (en) * 2004-03-23 2007-11-27 Fujitsu Limited Context dependent gesture response
US7301528B2 (en) * 2004-03-23 2007-11-27 Fujitsu Limited Distinguishing tilt and translation motion components in handheld devices
US7365737B2 (en) * 2004-03-23 2008-04-29 Fujitsu Limited Non-uniform gesture precision
US7280096B2 (en) * 2004-03-23 2007-10-09 Fujitsu Limited Motion sensor engagement for a handheld device
US7365735B2 (en) * 2004-03-23 2008-04-29 Fujitsu Limited Translation controlled cursor
US7301527B2 (en) * 2004-03-23 2007-11-27 Fujitsu Limited Feedback based user interface for motion controlled handheld devices
US20060008781A1 (en) * 2004-07-06 2006-01-12 Ordinate Corporation System and method for measuring reading skills
NZ534092A (en) * 2004-07-12 2007-03-30 Kings College Trustees Computer generated interactive environment with characters for learning a language
US20100099065A1 (en) * 2004-12-23 2010-04-22 Carl Isamu Wakamoto Interactive cinematic system for bonus features for movies, tv contents, anime and cartoons, music videos, language training, entertainment and social networking
US11258900B2 (en) 2005-06-29 2022-02-22 Ultratec, Inc. Device independent text captioned telephone service
CN101223565B (zh) * 2005-07-15 2013-02-27 Richard A. Mo Speech pronunciation training apparatus and speech pronunciation training method
JP2009525492A (ja) * 2005-08-01 2009-07-09 一秋 上川 Method of expressing English sounds and other European language sounds, and system of pronunciation techniques
US7657221B2 (en) * 2005-09-12 2010-02-02 Northwest Educational Software, Inc. Virtual oral recitation examination apparatus, system and method
JP2009128675A (ja) * 2007-11-26 2009-06-11 Toshiba Corp Apparatus, method and program for recognizing speech
US8340968B1 (en) * 2008-01-09 2012-12-25 Lockheed Martin Corporation System and method for training diction
US8064817B1 (en) * 2008-06-02 2011-11-22 Jakob Ziv-El Multimode recording and transmitting apparatus and its use in an interactive group response system
TW201019288A (en) * 2008-11-13 2010-05-16 Ind Tech Res Inst System and method for conversation practice in simulated situations
CN101510423B (zh) * 2009-03-31 2011-06-15 Beijing Zhicheng Zhuosheng Technology Development Co., Ltd. Hierarchical, interactive pronunciation quality evaluation and diagnosis system
US20110189646A1 (en) * 2010-02-01 2011-08-04 Amos Benninga Pedagogical system method and apparatus
US10019995B1 (en) * 2011-03-01 2018-07-10 Alice J. Stiebel Methods and systems for language learning based on a series of pitch patterns
US8805673B1 (en) * 2011-07-14 2014-08-12 Globalenglish Corporation System and method for sharing region specific pronunciations of phrases
JP6267636B2 (ja) * 2012-06-18 2018-01-24 ADC Technology Inc. Voice response device
US10026329B2 (en) 2012-11-26 2018-07-17 ISSLA Enterprises, LLC Intralingual supertitling in language acquisition
CN104880683B (zh) * 2014-02-28 2018-02-13 Siemens (Shenzhen) Magnetic Resonance Ltd. Detection apparatus, method and system for shim pieces of a magnetic resonance imaging system
CN105825853A (zh) * 2015-01-07 2016-08-03 ZTE Corporation Speech switching method and device for speech recognition equipment
CN107945621A (zh) * 2017-11-13 2018-04-20 Dong Guoyu Mathematical formula memorization device that facilitates communication
CN108877808B (zh) * 2018-07-24 2020-12-25 Guangdong Genius Technology Co., Ltd. False-touch-proof voice wake-up method and home tutoring device
CN113920803B (zh) * 2020-07-10 2024-05-10 Shanghai Liulishuo Information Technology Co., Ltd. Error feedback method, apparatus, device and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5393236A (en) * 1992-09-25 1995-02-28 Northeastern University Interactive speech pronunciation apparatus and method
US5487671A (en) * 1993-01-21 1996-01-30 Dsp Solutions (International) Computerized system for teaching speech
US5503560A (en) * 1988-07-25 1996-04-02 British Telecommunications Language training

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9223066D0 (en) * 1992-11-04 1992-12-16 Secr Defence Children's speech training aid
US5428707A (en) * 1992-11-13 1995-06-27 Dragon Systems, Inc. Apparatus and methods for training speech recognition systems and their users and otherwise improving speech recognition performance

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5503560A (en) * 1988-07-25 1996-04-02 British Telecommunications Language training
US5393236A (en) * 1992-09-25 1995-02-28 Northeastern University Interactive speech pronunciation apparatus and method
US5487671A (en) * 1993-01-21 1996-01-30 Dsp Solutions (International) Computerized system for teaching speech

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP0852782A4 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6358054B1 (en) 1995-05-24 2002-03-19 Syracuse Language Systems Method and apparatus for teaching prosodic features of speech
US6358055B1 (en) 1995-05-24 2002-03-19 Syracuse Language System Method and apparatus for teaching prosodic features of speech
US6961700B2 (en) 1996-09-24 2005-11-01 Allvoice Computing Plc Method and apparatus for processing the output of a speech recognition engine
EP0917129B1 (fr) * 1997-11-17 2004-12-15 International Business Machines Corporation Method and device for speech recognition
US6134529A (en) * 1998-02-09 2000-10-17 Syracuse Language Systems, Inc. Speech recognition apparatus and method for learning
WO1999040556A1 (fr) * 1998-02-09 1999-08-12 Syracuse Language Systems, Inc. Speech recognition apparatus and method for learning
DE10010232B4 (de) * 1999-03-05 2004-08-05 Auralog Method and apparatus for speech recognition
EP1091336A1 (fr) * 1999-10-06 2001-04-11 Ascom AG Method and device for detecting and correcting errors in spoken language
WO2001067227A1 (fr) * 2000-03-10 2001-09-13 Seungheon Baek Apparatus and method for displaying sentences for studying foreign languages
EP1482469A3 (fr) * 2003-05-29 2007-04-18 Robert Bosch Gmbh System, method and device for language education through a voice portal
US8371857B2 (en) 2003-05-29 2013-02-12 Robert Bosch Gmbh System, method and device for language education through a voice portal
CN1311423C (zh) * 2003-08-11 2007-04-18 Sony Electronics Inc. System and method for performing speech recognition using a multilingual dictionary
GB2458461A (en) * 2008-03-17 2009-09-23 Kai Yu Spoken language learning system
EP2924676A1 (fr) * 2014-03-25 2015-09-30 Oticon A/s Hearing-based adaptive learning systems

Also Published As

Publication number Publication date
JPH11513144A (ja) 1999-11-09
KR19990044575A (ko) 1999-06-25
EP0852782A1 (fr) 1998-07-15
CN1197525A (zh) 1998-10-28
AU2403297A (en) 1998-02-09
BR9702341A (pt) 2000-10-24
IL123556A0 (en) 1998-10-30
US5766015A (en) 1998-06-16
EP0852782A4 (fr) 1998-12-23

Similar Documents

Publication Publication Date Title
US5766015A (en) Apparatus for interactive language training
Lamel et al. BREF, a large vocabulary spoken corpus for French
US7280964B2 (en) Method of recognizing spoken language with recognition of language color
US6424935B1 (en) Two-way speech recognition and dialect system
Paul et al. The design for the Wall Street Journal-based CSR corpus
US7143033B2 (en) Automatic multi-language phonetic transcribing system
Kasuriya et al. Thai speech corpus for Thai speech recognition
WO2004063902B1 (fr) Speech training method with color instruction
Zechner et al. Towards automatic scoring of non-native spontaneous speech
Proença et al. The LetsRead corpus of Portuguese children reading aloud for performance evaluation
Comerford et al. The voice of the computer is heard in the land (and it listens too!)[speech recognition]
Samudravijaya Computer recognition of spoken Hindi
Janyoi et al. An Isarn dialect HMM-based text-to-speech system
Precoda Non-mainstream languages and speech recognition: Some challenges
Ha et al. Common Errors in Pronunciation of Non-English Majored Students at the University of Transport and Communication Ho Chi Minh Campus
Marasek et al. Multi-level annotation in SpeeCon Polish speech database
Yong et al. Low footprint high intelligibility Malay speech synthesizer based on statistical data
Szymański et al. First evaluation of Polish LVCSR acoustic models obtained from the JURISDIC database
Kirschning et al. Verification of correct pronunciation of Mexican Spanish using speech technology
Demenko et al. LVCSR speech database-JURISDIC
Catanghal et al. Computer Discriminative Acoustic Tool for Reading Enhancement and Diagnostic: Development and Pilot Test
Kane et al. Introducing difficulty-levels in pronunciation learning.
Alsulaiman et al. Development and Analysis of a Versatile Dataset of Speech, Real and Synthesized, of Arabic Learners
Kiissel et al. Estonian isolated-word text-to-speech synthesiser
Lenzo et al. Rapid-deployment text-to-speech in the DIPLOMAT system.

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 97190882.6

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AT AU AZ BA BB BG BR BY CA CH CN CU CZ CZ DE DE DK DK EE EE ES FI FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SK TJ TM TR TT UA UG US UZ VN YU AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE

ENP Entry into the national phase

Ref document number: 1998 505803

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 1019980701824

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 1997919627

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 09043043

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 1997919627

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1019980701824

Country of ref document: KR

WWW Wipo information: withdrawn in national office

Ref document number: 1997919627

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: CA

WWW Wipo information: withdrawn in national office

Ref document number: 1019980701824

Country of ref document: KR
