WO2009036800A1 - Text translation method and associated electronic device
- Publication number
- WO2009036800A1 (application PCT/EP2007/059883)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- phrase
- translation
- translating
- accordance
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/1444—Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
- G06V30/1456—Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields based on user interactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- the present invention relates to a method of text translation comprising capturing text, performing character recognition on the text and presenting a corresponding translation.
- the present invention also relates to an electronic device capable of such text translation and a computer program comprising software instructions that, when executed, performs such a text translation.
- a telephone in a GSM, GPRS, EDGE, UMTS or CDMA2000 type of system is capable of recording, conveying and displaying both still images and moving images, i.e. video streams, in addition to audio data such as speech or music.
- optical character recognition (OCR)
- a method of text translation comprising the steps of: capturing a text image input, performing character recognition on the text image input, such that a recognized text is provided, selecting an extensive mode or a target mode, the extensive mode comprising the sub-steps of: performing linguistic analysis in respect to the recognized text, translating the recognized text, and selecting a phrase, and the target mode comprising the sub-steps of: selecting a phrase from the recognized text, performing linguistic analysis in respect to the selected phrase, translating the selected phrase, executing the selected mode, and presenting the corresponding translation of the selected phrase.
- the invention likewise concerns an electronic device capable of such text translation.
- an electronic device capable of text translation comprising: means for capturing a text image input, means for performing character recognition on the text image input, such that a recognized text is provided, extensive mode translating means for: performing linguistic analysis in respect to the recognized text, translating the recognized text, and selecting a phrase, target mode translating means for: selecting a phrase from the recognized text, performing linguistic analysis in respect to the selected phrase, translating the selected phrase, means for selecting one of said extensive mode translating means and said target mode translating means to effect translation, and means for presenting the corresponding translation of the selected phrase.
- the invention also concerns a computer program; according to a third aspect of the invention there is thus provided a computer program comprising software instructions that, when executed, perform text translation of the kind defined above.
- the extensive mode is suitable for users confronted with text in a foreign language to which they are essentially unfamiliar, and for which text they therefore need to have an extensive translation in order to grasp an understanding of the content.
- in extensive mode the captured text is, after character recognition, linguistically analysed, and each phrase of the recognized text is then translated.
- the target mode is suitable for users confronted with text in a foreign language to which they are essentially familiar, and for which text they therefore do not need to have an extensive translation in order to grasp an understanding of the content.
- the user is most likely interested in translating a target word or phrase, which he/she is unfamiliar with, and not the translation of the entire recognized text, which is the case in the extensive mode as described above.
- in target mode, the captured text is not initially translated after character recognition.
- a phrase is first selected, and not until after the selection is this phrase linguistically analysed and translated.
- the processing subsequently involves linguistic analysis only in respect to the selected phrase, not the entire recognized text, and thereby the processing time is shorter in target mode than in extensive mode.
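The difference in workload between the two modes can be illustrated with a minimal Python sketch. It is not the patented implementation: `analyse`, `extensive_mode`, `target_mode` and the dictionary-based lexicon are hypothetical stand-ins, and whitespace splitting merely takes the place of real linguistic analysis.

```python
from typing import Callable, Dict, List

Lexicon = Dict[str, List[str]]


def analyse(text: str) -> List[str]:
    """Toy stand-in for linguistic analysis: split text into phrases on spaces."""
    return text.split()


def extensive_mode(recognized_text: str, lexicon: Lexicon) -> Callable[[str], List[str]]:
    """Analyse and translate the whole recognized text before any selection."""
    table = {p: lexicon.get(p, []) for p in analyse(recognized_text)}
    # Selecting a phrase afterwards is only a lookup in the prepared table.
    return lambda selected: table.get(selected, [])


def target_mode(lexicon: Lexicon) -> Callable[[str], List[str]]:
    """Defer all work: analyse and translate only the phrase that is selected."""
    def present(selected: str) -> List[str]:
        return [t for p in analyse(selected) for t in lexicon.get(p, [])]
    return present
```

In this sketch the extensive mode pays the analysis and translation cost once for the full text, while the target mode pays a smaller cost for each selection.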
- the ability to select different text translation modes in accordance with the present invention thus provides text translation suitable for use cases having different expectations with regards to translation of a text.
- the recognized text, the selected phrase, and the corresponding translation of the selected phrase are preferably, although not necessarily, presented simultaneously.
- capturing a text image input comprises acquiring an image with a camera comprised in a mobile communication terminal.
- the linguistic analysis and/or the translating may likewise be performed remotely from said mobile communication terminal.
- the analysis and/or the translating may for instance be executed in a server or any other communication entity connected to the terminal via a communication network.
- a translation of the sentence comprising the selected phrase may be requested, not just the phrase. Consequently, the target mode may further comprise the sub-steps of performing linguistic analysis in respect to the recognized text, and translating a sentence comprising the selected phrase.
- the text translation method may further comprise the step of presenting the translation corresponding to a sentence comprising the selected phrase.
- a language model is preferably, but not necessarily, utilized for interpretation, in order to optimize the translation on sentence level.
- the language model reforms phrases into an understandable sentence, which may then be presented.
- the linguistic analysis may comprise either or both maximum backward and forward phrase matching. Performing both maximum backward and forward matching enables coverage of all possible phrases, as the resulting phrases are combined.
- the selected phrase may comprise one of a single word, a plurality of consecutive words, a single character or a plurality of consecutive characters.
- Figure 1A illustrates a bulletin board of which a user of a mobile communication terminal capable of text translation in accordance with the present invention, takes a snapshot.
- Figure 1B schematically illustrates a functional block diagram of the exemplifying mobile communication terminal of Figure 1A.
- Figure 2 illustrates a display of a mobile communication terminal capable of text translation in accordance with the present invention, showing recognized text in an extensive mode in accordance with a first embodiment.
- Figure 3 illustrates the display of the mobile communication terminal in accordance with the first embodiment in Figure 2, showing recognized text in a target mode.
- Figure 4 is a flowchart illustrating the operation of text translation in accordance with the first embodiment of the present invention, describing the operation of extensive mode as well as target mode.
- Figure 1A illustrates a user 2, who with the use of his/her mobile communication device 3 capable of text translation in accordance with the present invention, is able to acquire foreign text, which can be fully or partly translated.
- although a mobile communication terminal represents the electronic device 3 capable of text translation in this example, the invention is not restricted thereto. Any suitable electronic device 3 can likewise be utilized.
- a user 2 is not required to carry out the inventive concept.
- the user's 2 actions described hereinafter may likewise be carried out by other means than by a human, based on criteria defined by the designer.
- the mobile communication terminal 3 of Figure 1A further comprises a camera 8, with which the user 2 can take a snapshot of a bulletin board 1 comprising text in a language that is foreign to the user 2.
- the bulletin board 1 is merely exemplifying, and any medium 1 showing text, which the user 2 can acquire with the use of an electronic device 3 according to the present invention, can be utilized with the inventive concept.
- Other examples are newspapers, magazines, signs and so on.
- the present invention is further not restricted to the use of a camera 8 for capturing the text to be translated, although this solution is preferred.
- the captured text is after character recognition presented as recognized text, which is further described in Figures 2 and 3, showing two different selectable modes, and the steps of Figure 4, hereinafter.
- Figure 1B shows a block diagram of an electronic device 3 in accordance with the present invention, in the form of the exemplifying mobile communication terminal 3 in Figure 1A.
- the terminal 3 comprises a processing unit 9 connected to an antenna 10 via a transceiver 11, a memory unit 12, a microphone 13, a keyboard and/or joystick and/or pointer 14, a speaker 15 and a camera 8.
- the processing unit 9 controls the overall function of the functional blocks in that it is capable of receiving input from the keyboard/joystick/pointer 14, audio information via the microphone 13, text image input via the camera 8, and suitably encoded and modulated data via the antenna 10 and transceiver 11.
- Each block is realized in software and/or hardware.
- the terminal 3 is typically in connection with a communication network 16.
- the network 16 illustrated in Figure 1B may represent any one or more interconnected networks, including mobile, fixed and data communication networks such as the Internet.
- a "generic" communication entity 18 is shown as being connected to the network 16. This is to illustrate that the terminal 3 may be communicating with any entity, including other electronic devices and data servers that are connected to the network 16.
- the generic communication entity 18 comprises a processing unit 19 connected to an antenna 20 via a transceiver 21, and a memory unit 22, for support of remote processing.
- Figure 2 illustrates a display 4 of a mobile communication terminal 3 capable of text translation in accordance with the present invention, showing recognized text 5 in an extensive mode in accordance with a first embodiment.
- the use of a display 4 for presentation is preferred, but the invention is not restricted thereto.
- voice presentation can likewise be utilized through the speaker 15.
- the recognized text 5 in this example comprises Chinese characters, but any characters forming a language are likewise supported by the present invention.
- a selected phrase 26 is highlighted, pointed out by a hand shaped cursor or in any other manner indicated in the recognized text 5, and is additionally shown in the left corner of the display 4. Further, a translation 27 of the selected phrase 26 is shown at the bottom of the display 4. Preferably, but not necessarily, the recognized text 5, the selected phrase 26 and the translation 27 are all presented simultaneously. Their location on the display 4 in relation to one another in Figure 2 is merely exemplifying.
- Figure 3 illustrates the display 4 of the mobile communication terminal 3 shown in Figure 2, showing the recognized text 5 in a target mode.
- a selected phrase 36 is highlighted, pointed out by a hand-shaped cursor or in any other manner indicated in the recognized text 5, and is additionally shown in the left corner of the display 4. Further, a translation 37 of the selected phrase 36 is shown at the bottom of the display 4.
- the operations of the extensive mode and the target mode will be further discussed with reference to the flowchart of Figure 4, which describes the operation of text translation in accordance with the first embodiment of the present invention.
- the operation is preferably implemented as software steps stored in a memory and executed in a CPU, for instance the memory 12 and CPU 9 of the terminal 3 in Figure 1B.
- some steps, as described below, may likewise be performed remotely, for instance by the memory 22 and CPU 19 in the server 18 connected to the electronic device 3 via the communication network 16 as shown in Figure 1B.
- in a first step 401, a text image input is captured, i.e. text to be translated is acquired through the use of, for instance, the camera 8 of the mobile communication terminal 3 as shown in Figure 1.
- in a second step 402, character recognition is performed on the captured text, in order to provide a recognized text 5 that can be divided into phrases, as shown in Figures 2 and 3.
- a phrase can comprise one of a single word, a plurality of consecutive words, a single character or a plurality of consecutive characters.
- the user 2 is in step 403 prompted to select the extensive mode or the target mode. Which mode to choose is for the user 2 to decide depending on his/her needs with regards to translation of the recognized text 5. Selection is realized by for instance the user 2 maneuvering the keyboard, pointer or joystick 14, or perhaps by voice commands using the microphone 13.
- the extensive mode is described with reference to the embodiment shown in Figure 2. This mode is aimed at users 2 confronted with text in a foreign language to which they are essentially unfamiliar, and for which text they therefore need to have an extensive translation in order to grasp an understanding of the content.
- a linguistic analysis is, in step 501, performed in respect to the recognized text 5.
- the analysis in step 501 may comprise backward or forward maximum matching or a combination of both.
- in backward phrase matching, text is analyzed from right to left.
- in forward phrase matching, text is analyzed from left to right.
- depending on the direction of matching, the meaning of the resulting phrases may differ. This is particularly, although not exclusively, noticeable for text comprising Chinese characters, where the interpretation of one possible combination of characters may differ from the interpretation resulting from another combination.
- both backward and forward maximum matching are preferably performed, whereby the resulting phrase sets are combined and the redundant phrases in the matching results are removed.
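Forward and backward maximum matching, and the merging of their results, can be sketched as follows. This is a minimal dictionary-based illustration, not the patent's own code: the function names, the `max_len` limit, the single-character fallback and the four-entry toy lexicon are all assumptions made for the example.

```python
from typing import List, Set, Tuple

Phrase = Tuple[int, str]  # (start offset in the text, phrase)


def forward_max_match(text: str, lexicon: Set[str], max_len: int = 4) -> List[Phrase]:
    """Scan left to right, always taking the longest lexicon entry available."""
    phrases, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:  # single characters always match
                phrases.append((i, candidate))
                i += size
                break
    return phrases


def backward_max_match(text: str, lexicon: Set[str], max_len: int = 4) -> List[Phrase]:
    """Scan right to left, always taking the longest lexicon entry available."""
    phrases, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in lexicon:
                phrases.append((j - size, candidate))
                j -= size
                break
    phrases.reverse()
    return phrases


def combined_phrases(text: str, lexicon: Set[str], max_len: int = 4) -> List[Phrase]:
    """Union of both passes; the set removes phrases found by both directions."""
    merged = set(forward_max_match(text, lexicon, max_len))
    merged |= set(backward_max_match(text, lexicon, max_len))
    return sorted(merged)


LEXICON = {"研究", "研究生", "生命", "起源"}
TEXT = "研究生命起源"
# Forward pass:  研究生 / 命 / 起源
# Backward pass: 研究 / 生命 / 起源
```

For this toy input the two directions segment the same characters differently, and the combined result keeps both readings while storing the shared phrase 起源 only once.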
- step 501 may, but is not required to, comprise any combination of different procedures and considerations such as rule-based word association and automatic object extraction of the text to be analyzed.
- rule-based word association supports finding the possible combination of the concurrent characters using context sensing and linguistic rules.
- in target mode, for instance, rule-based word association may be utilized in identifying the valid combination of characters whose position is nearest to the hand-shaped cursor on the display 4, which is then recognized as the selected phrase 36.
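Picking the valid character combination nearest to the cursor can be sketched as below. The sketch assumes that each candidate combination already has a display position and that "valid" simply means present in a lexicon; `Candidate` and `nearest_valid_phrase` are hypothetical names, and real rule-based word association would apply richer linguistic rules.

```python
import math
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Candidate(NamedTuple):
    text: str
    x: float  # horizontal centre of the candidate on the display
    y: float  # vertical centre of the candidate on the display


def nearest_valid_phrase(
    candidates: Iterable[Candidate],
    cursor: Tuple[float, float],
    lexicon: Set[str],
) -> Optional[Candidate]:
    """Return the lexicon-valid candidate closest to the cursor, if any."""
    valid = [c for c in candidates if c.text in lexicon]
    if not valid:
        return None
    cx, cy = cursor
    # Euclidean distance between the cursor and each candidate centre.
    return min(valid, key=lambda c: math.hypot(c.x - cx, c.y - cy))
```

A combination that lies closer to the cursor but is not a valid phrase is skipped in favour of the nearest valid one.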
- Automatic object detection further supports extraction of the selected phrase 26, 36 to be translated.
- in target mode, for instance, the hand-shaped cursor gives knowledge about the position of the selected phrase 36, and a revised connected-component-based algorithm may be applied for object detection and segmentation.
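The text does not spell out the revised algorithm, so the following is only a plain connected-component sketch: starting from the pixel under the cursor, it grows a 4-connected region in a binarized image and returns its bounding box. The function name `component_at` and the list-of-lists image format are assumptions.

```python
from collections import deque
from typing import List, Optional, Tuple

BBox = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def component_at(image: List[List[int]], seed: Tuple[int, int]) -> Optional[BBox]:
    """Bounding box of the 4-connected foreground region containing `seed`."""
    rows, cols = len(image), len(image[0])
    r0, c0 = seed
    if not image[r0][c0]:
        return None  # the cursor is on background
    seen = {seed}
    queue = deque([seed])
    top = bottom = r0
    left = right = c0
    while queue:
        r, c = queue.popleft()
        top, bottom = min(top, r), max(bottom, r)
        left, right = min(left, c), max(right, c)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and image[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return (top, left, bottom, right)
```

The returned box could then be passed to character recognition as the region of the selected phrase.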
- in step 502, the recognized text 5 is translated into the language of choice.
- the translation is performed on a phrase level, i.e. each phrase is individually translated. If several translation options are feasible for a phrase, they may all be provided.
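Phrase-level translation with several options per phrase can be pictured as a lookup that keeps every candidate translation. The helper name `translate_phrases` and the dictionary-of-lists lexicon are illustrative assumptions, not part of the described device.

```python
from typing import Dict, List, Tuple


def translate_phrases(
    phrases: List[str], lexicon: Dict[str, List[str]]
) -> List[Tuple[str, List[str]]]:
    """Translate each phrase on its own and keep all feasible options."""
    # An unknown phrase yields an empty option list instead of an error.
    return [(phrase, list(lexicon.get(phrase, []))) for phrase in phrases]
```

Keeping the options as a list lets the presentation step show all of them for the selected phrase.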
- during the processing of step 501, the user 2 is preferably given notice to await its completion.
- the user 2 selects, in step 503, a phrase whose translation he/she is interested in knowing, defined as the selected phrase 26.
- Selection of a phrase is implemented with, for instance, a joystick 14 comprised in the electronic device 3, with which the user 2 can navigate among the phrases of the recognized text 5.
- in step 404, the corresponding translation of the selected phrase 26 is presented, and in Figure 2 this is implemented by showing the translation 27 on the display 4. If instead of the extensive mode, the target mode is selected in step
- the target mode is described with reference to Figure 3.
- This mode is aimed at users 2 confronted with text in a foreign language to which they are essentially familiar, and for which text they therefore do not need to have an extensive translation in order to grasp an understanding of the content.
- the user 2 is most likely interested in translating a target word or phrase with which he/she is unfamiliar, and does not need to be bothered with the translation of the entire recognized text 5, which is the case in the extensive mode as described in the foregoing. Consequently, the user 2 selects, in step 601, a phrase whose translation he/she is interested in knowing, defined as the selected phrase 36.
- Selection of a phrase is implemented with, for instance, a pointer 14 comprised in the electronic device 3, with which the user 2 can navigate among the phrases of the recognized text 5.
- a linguistic analysis can, in step 602, be performed in respect to the selected phrase 36.
- the analysis in step 602 may, similarly to the analysis of step 501, comprise backward or forward maximum matching or a combination of both and, if preferred, any combination of different procedures and considerations such as rule-based word association and automatic object extraction of the text to be analyzed.
- in step 603, the selected phrase 36 is translated into the language of choice. If several translation options are feasible for the selected phrase 36, they may all be provided.
- the linguistic analysis in step 602 and the translation in step 603 are performed only in respect to the selected phrase 36.
- since the linguistic analysis 501, 602 is the most time-consuming step in the text translation procedure, less time is required for processing in target mode, as the processing involves only the selected phrase 36, not the entire recognized text 5 as in extensive mode.
- the operation of the specific steps 601 to 603 in target mode has thus been described, and the procedure proceeds to step 404.
- the corresponding translation of the selected phrase 36 is presented, and in Figure 3 this is implemented by showing the translation 37 on the display 4.
- regardless of which mode the user 2 has selected in step 403, he/she is able to navigate the phrases of the recognized text 5.
- the user 2 can subsequently alternate between the phrases, selecting a new selected phrase 26, 36, as shown in step 405.
- the user 2 is continuously given the corresponding applicable translation options 27, 37 of the current selected phrase 26, 36.
- in step 406, if the user 2 has selected a new phrase 26, 36 in step 405, it is determined whether extensive mode or target mode was selected in step 403.
- in extensive mode, the flow returns to step 404, presenting the corresponding translation 27 of the new selected phrase 26.
- in target mode, the flow returns to step 602 for linguistic analysis in respect to the selected phrase 36 and translation of the selected phrase 36 in step 603, before presentation of the corresponding translation 37 of the selected phrase 36.
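The step numbering of Figure 4 can be traced with a small simulation. It is a sketch under stated assumptions: capture and character recognition (steps 401 and 402) are only logged, `analyse` and `translate` are caller-supplied placeholders, and `run_flow` is a hypothetical name.

```python
from typing import Callable, List, Tuple


def run_flow(
    mode: str,
    recognized_text: str,
    selections: List[str],
    analyse: Callable[[str], List[str]],
    translate: Callable[[str], List[str]],
) -> Tuple[List[str], List[List[str]]]:
    """Return the visited step numbers and the translation shown per selection."""
    if mode not in ("extensive", "target"):
        raise ValueError("mode must be 'extensive' or 'target'")
    trace = ["401", "402", "403"]  # capture, character recognition, mode choice
    shown: List[List[str]] = []
    table = {}
    if mode == "extensive":
        trace += ["501", "502"]  # analyse and translate the whole text once
        table = {p: translate(p) for p in analyse(recognized_text)}
    for n, phrase in enumerate(selections):
        if n == 0:
            trace.append("503" if mode == "extensive" else "601")  # first selection
        else:
            trace += ["405", "406"]  # new selection, then mode check
        if mode == "extensive":
            result = table.get(phrase, [])  # already translated, only a lookup
        else:
            trace += ["602", "603"]  # analyse and translate the selection only
            result = [t for p in analyse(phrase) for t in translate(p)]
        trace.append("404")  # present the translation
        shown.append(result)
    return trace, shown
```

The trace makes the loop visible: in extensive mode a new selection goes straight back to step 404, whereas in target mode it passes through steps 602 and 603 again.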
- the steps of the extensive mode as well as the target mode are in the example performed locally within the electronic device 3.
- the present invention is however not restricted thereto; the linguistic analysis step 501, 602 and/or the translating step 502, 603 may likewise be performed remotely, for instance in a server 18 connected to the electronic device 3 via the communication network 16, as shown in Figure 1B.
- the translation can be presented on a sentence level.
- a translation of the sentence comprising the selected phrase 26, 36 will be presented if presentation on a sentence level is chosen by the user 2.
- the linguistic analysis in step 602 needs to be performed in respect to the recognized text 5, and translation in step 603 needs to be performed on the sentence comprising the selected phrase 26.
- a language model is preferably utilized for interpretation.
- the language model reforms phrases into an understandable sentence, and a translation of the sentence comprising the selected phrase 26, 36 can be presented.
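Before a sentence can be translated, the sentence containing the selected phrase has to be located in the recognized text. The sketch below does this with punctuation-based boundaries only; the function name `sentence_containing` and the terminator set are assumptions, and it does not implement the language model mentioned above.

```python
def sentence_containing(text: str, start: int, end: int,
                        terminators: str = "。！？.!?") -> str:
    """Return the sentence of `text` that contains the span text[start:end]."""
    # Last terminator before the phrase; rfind gives -1 when there is none.
    left = max(text.rfind(t, 0, start) for t in terminators) + 1
    # First terminator at or after the end of the phrase.
    hits = [i for i in (text.find(t, end) for t in terminators) if i != -1]
    right = min(hits) + 1 if hits else len(text)
    return text[left:right].strip()
```

The resulting sentence would then be analysed and translated as a whole instead of the phrase alone.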
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Telephone Function (AREA)
Abstract
The present invention relates to a method of text translation comprising the steps of: capturing (401) a text image input, performing character recognition (402) on said text image input, such that a recognized text (5) is provided, selecting an extensive mode or a target mode (403), said extensive mode comprising the sub-steps of: performing linguistic analysis (501) in respect to said recognized text, translating (502) said recognized text, and selecting (503) a phrase, and said target mode comprising the sub-steps of: selecting (601) a phrase from said recognized text, performing linguistic analysis (602) in respect to said selected phrase, translating (603) said selected phrase (26, 36), executing the selected mode, and presenting (404) the corresponding translation of said selected phrase. The present invention also relates to an electronic device capable of such text translation and a computer program comprising software instructions that, when executed, perform such a text translation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2007/059883 WO2009036800A1 (fr) | 2007-09-19 | 2007-09-19 | Text translation method and associated electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2007/059883 WO2009036800A1 (fr) | 2007-09-19 | 2007-09-19 | Text translation method and associated electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009036800A1 (fr) | 2009-03-26 |
Family
ID=39020325
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2007/059883 WO2009036800A1 (fr) | Text translation method and associated electronic device | 2007-09-19 | 2007-09-19 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2009036800A1 (fr) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010056352A1 (en) * | 2000-04-24 | 2001-12-27 | Endong Xun | Computer-aided reading system and method with cross-language reading wizard |
WO2003017229A1 (fr) * | 2001-08-13 | 2003-02-27 | Ogilvie John W L | Tools and techniques for reader-guided incremental immersion in a foreign language text |
-
2007
- 2007-09-19 WO PCT/EP2007/059883 patent/WO2009036800A1/fr active Application Filing
Non-Patent Citations (2)
Title |
---|
DOERMANN D ET AL: "Progress in camera-based document image analysis", DOCUMENT ANALYSIS AND RECOGNITION, 2003. PROCEEDINGS. SEVENTH INTERNATIONAL CONFERENCE ON AUG. 3-6, 2003, PISCATAWAY, NJ, USA,IEEE, 3 August 2003 (2003-08-03), pages 606 - 616, XP010656833, ISBN: 0-7695-1960-1 * |
WATANABE Y ET AL: "Translation camera on mobile phone", MULTIMEDIA AND EXPO, 2003. PROCEEDINGS. 2003 INTERNATIONAL CONFERENCE ON 6-9 JULY 2003, PISCATAWAY, NJ, USA,IEEE, vol. 2, 6 July 2003 (2003-07-06), pages 177 - 180, XP010650689, ISBN: 0-7803-7965-9 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 07820327 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 07820327 Country of ref document: EP Kind code of ref document: A1 |