CN108595443A

CN108595443A - Simultaneous interpreting method, device, intelligent vehicle mounted terminal and storage medium

Info

Publication number: CN108595443A
Application number: CN201810286936.3A
Authority: CN
Inventors: 张鸿鸽; 徐钧
Original assignee: Zhejiang Geely Holding Group Co Ltd
Current assignee: Zhejiang Geely Holding Group Co Ltd
Priority date: 2018-03-30
Filing date: 2018-03-30
Publication date: 2018-09-28

Abstract

The present invention relates to the technical field of smart cars and provides a simultaneous translation method and device, an intelligent vehicle-mounted terminal, and a storage medium. The method includes: obtaining speech to be translated based on a simultaneous translation request; performing speech recognition on the speech to be translated to obtain text to be translated; performing language identification on the text to be translated to obtain the language to be translated corresponding to the text; and, according to a preset correspondence between the language to be translated and a target language, translating the text to be translated into target-language text and outputting it as speech. By automatically identifying the language of the speech to be translated, the invention achieves automatic two-way translation between two languages, so that people in the vehicle who do not share a language can communicate normally.

Description

Simultaneous translation method, device, intelligent vehicle-mounted terminal and storage medium

Technical Field

The present invention relates to the technical field of smart cars, and in particular to a simultaneous translation method, a simultaneous translation device, an intelligent vehicle-mounted terminal, and a storage medium.

Background

Cars today are no longer merely a means of transport. As new technologies continue to be developed, cars are becoming increasingly intelligent. As more Chinese people travel abroad and more foreigners come to China for tourism and business, drivers of taxis and business vehicles often find, when picking up passengers or hosting business guests, that they cannot communicate normally because neither party understands the other's language.

Summary of the Invention

The purpose of the embodiments of the present invention is to provide a simultaneous translation method, a device, an intelligent vehicle-mounted terminal, and a storage medium, so as to solve the problem that drivers and passengers cannot communicate normally because of a language barrier.

To achieve the above purpose, the embodiments of the present invention adopt the following technical solutions:

In a first aspect, an embodiment of the present invention provides a simultaneous translation method, the method comprising: obtaining speech to be translated based on a simultaneous translation request; performing speech recognition on the speech to be translated to obtain text to be translated; performing language identification on the text to be translated to obtain the language to be translated corresponding to the text; and, according to a preset correspondence between the language to be translated and a target language, translating the text to be translated into target-language text and outputting it as speech.

In a second aspect, an embodiment of the present invention further provides a simultaneous translation device comprising an acquisition module, a speech recognition module, a language identification module, and a translation module. The acquisition module is configured to obtain speech to be translated based on a simultaneous translation request. The speech recognition module is configured to perform speech recognition on the speech to be translated to obtain text to be translated. The language identification module is configured to perform language identification on the text to be translated to obtain the language to be translated corresponding to the text. The translation module is configured to translate the text to be translated into target-language text according to a preset correspondence between the language to be translated and a target language, and to output it as speech.
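The way the four modules hand data to one another can be sketched as follows. This is only an illustrative outline: the class name `SimultaneousTranslator`, its field names, the language codes, and the toy stand-in functions are all assumptions introduced here, and speech synthesis of the result is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SimultaneousTranslator:
    """Chains the four modules; every field is a hypothetical stand-in callable."""
    acquire: Callable[[], bytes]                # acquisition module
    recognize_speech: Callable[[bytes], str]    # speech recognition module
    identify_language: Callable[[str], str]     # language identification module
    translate: Callable[[str, str, str], str]   # translation module
    language_pairs: Dict[str, str]              # preset source -> target correspondence

    def run_once(self) -> str:
        speech = self.acquire()
        text = self.recognize_speech(speech)
        source = self.identify_language(text)
        target = self.language_pairs[source]
        return self.translate(text, source, target)


# Toy stand-ins so the sketch runs end to end; real modules would wrap
# actual recognition and translation engines.
demo = SimultaneousTranslator(
    acquire=lambda: b"fake-audio",
    recognize_speech=lambda audio: "你好",
    identify_language=lambda text: "zh",
    translate=lambda text, src, tgt: f"[{src}->{tgt}] {text}",
    language_pairs={"zh": "en", "en": "zh"},
)
print(demo.run_once())
```

Keeping the source-to-target correspondence as data separate from the modules mirrors the text, where the correspondence is preset and the detected language merely selects an entry from it.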

In a third aspect, an embodiment of the present invention further provides an intelligent vehicle-mounted terminal comprising a vehicle-mounted microphone and a vehicle-mounted sound generator. The intelligent vehicle-mounted terminal further comprises: a memory; a processor electrically connected to both the vehicle-mounted microphone and the vehicle-mounted sound generator; and a simultaneous translation device that is stored in the memory and includes one or more software function modules executed by the processor. The software function modules include: an acquisition module configured to obtain, based on a simultaneous translation request, the speech to be translated collected by the vehicle-mounted microphone; a speech recognition module configured to perform speech recognition on the speech to be translated to obtain text to be translated; a language identification module configured to perform language identification on the text to be translated to obtain the language to be translated corresponding to the text; and a translation module configured to translate the text to be translated into target-language text according to a preset correspondence between the language to be translated and a target language, and to output it as speech through the vehicle-mounted sound generator.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the above simultaneous translation method is implemented.

In the simultaneous translation method, device, intelligent vehicle-mounted terminal, and storage medium provided by the embodiments of the present invention, when a user needs simultaneous translation, the terminal responds to the user's simultaneous translation request and obtains the speech to be translated. It then performs speech recognition on that speech to obtain the text to be translated, performs language identification on the text to determine the language to be translated, and, according to the preset correspondence between the language to be translated and the target language, translates the text into target-language text and outputs it as speech. Compared with the prior art, the embodiments of the present invention automatically identify the language of the speech to be translated and thereby achieve automatic two-way simultaneous translation between two languages, so that people in the vehicle who do not share a language can communicate normally.

To make the above purposes, features, and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.

Brief Description of the Drawings

To describe the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention and therefore should not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from these drawings without creative effort.

Fig. 1 is a schematic block diagram of the intelligent vehicle-mounted terminal provided by an embodiment of the present invention.

Fig. 2 is a flowchart of the simultaneous translation method provided by an embodiment of the present invention.

Fig. 3 is a flowchart of the sub-steps of step S102 shown in Fig. 2.

Fig. 4 is a flowchart of the sub-steps of step S103 shown in Fig. 2.

Fig. 5 is a schematic block diagram of the simultaneous translation device provided by an embodiment of the present invention.

Reference numerals: 100-intelligent vehicle-mounted terminal; 101-memory; 102-memory controller; 103-processor; 104-peripheral interface; 105-vehicle-mounted microphone; 106-vehicle-mounted sound generator; 107-display device; 200-simultaneous translation device; 201-acquisition module; 202-speech recognition module; 203-language identification module; 204-translation module; 205-display module.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be defined or explained again in subsequent drawings. In the description of the present invention, the terms "first", "second", and the like are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance.

Referring to Fig. 1, Fig. 1 shows a schematic block diagram of the intelligent vehicle-mounted terminal 100 provided by an embodiment of the present invention. The intelligent vehicle-mounted terminal 100 can be used for two-way simultaneous translation between different languages, and may be a smartphone, an on-board computer, a vehicle's instrument cluster, a multimedia head unit, or the like. The intelligent vehicle-mounted terminal 100 includes a memory 101, a memory controller 102, a processor 103, a peripheral interface 104, a vehicle-mounted microphone 105, a vehicle-mounted sound generator 106, and a display device 107.

The memory 101, the memory controller 102, and the processor 103 are electrically connected to one another, directly or indirectly, to enable data transmission or interaction. For example, these elements may be electrically connected through one or more communication buses or signal lines. The simultaneous translation device 200 includes at least one software function module that may be stored in the memory 101 in the form of software or firmware, or built into the operating system (OS) of the intelligent vehicle-mounted terminal 100. The processor 103 is configured to execute the executable modules stored in the memory 101, such as the software function modules and computer programs included in the simultaneous translation device 200.

The memory 101 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or the like. The memory 101 is used to store a program, and the processor 103 executes the program after receiving an execution instruction.

The processor 103 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a speech processor, a video processor, and the like; it may also be a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor 103 may be any conventional processor.

The vehicle-mounted microphone 105 is used to collect the speech of the simultaneous translation request and the speech to be translated, and to send the speech to be translated to the intelligent vehicle-mounted terminal 100, so that the intelligent vehicle-mounted terminal 100 enables the simultaneous translation function and performs two-way simultaneous translation between two different languages. The vehicle-mounted microphone 105 is a microphone for automotive use, that is, a transducer that converts a sound signal into an electrical signal. It may be any of the devices variously referred to as an in-vehicle microphone, mic, or sound pickup; in the embodiments of the present invention, the vehicle-mounted microphone 105 may be an in-vehicle microphone.

The vehicle-mounted sound generator 106 is a sound generator for automotive use, that is, a device that converts electrical energy into sound. It is used to output the spoken response to a simultaneous translation request and the translated speech. The vehicle-mounted sound generator 106 may be a car horn, a car loudspeaker, or the like; in the embodiments of the present invention, it may be a car loudspeaker.

The display device 107 is a human-machine interface device in the vehicle that displays the user interface of the intelligent vehicle-mounted terminal 100. It can also receive the user's simultaneous translation request by touch, and display both the text to be translated corresponding to the speech to be translated and the translated target-language text. The display device 107 may be a touch screen.

First Embodiment

Referring to Fig. 2, Fig. 2 shows a flowchart of the simultaneous translation method provided by an embodiment of the present invention. The simultaneous translation method of the first embodiment is applied to the intelligent vehicle-mounted terminal 100 and includes the following steps:

Step S101: obtain the speech to be translated based on a simultaneous translation request.

In the embodiments of the present invention, the simultaneous translation request is a command request triggered by the user to start the simultaneous translation function implemented by the simultaneous translation method. The request may be a voice command issued by the user through the vehicle-mounted microphone 105, an operation command issued through the user interface of the intelligent vehicle-mounted terminal 100, or a key command issued through a button on the steering wheel that is electrically connected to the intelligent vehicle-mounted terminal 100. After responding to the simultaneous translation request, the intelligent vehicle-mounted terminal 100 enters the simultaneous translation function and sends a prompt to the user. The prompt may be a voice prompt issued through the vehicle-mounted sound generator 106 or a visual prompt shown on the display device 107; for example, the vehicle-mounted sound generator 106 announces "Please input speech", or the display device 107 displays "Please input speech".

After entering the simultaneous translation function, the intelligent vehicle-mounted terminal 100 begins to obtain the speech to be translated sent by the vehicle-mounted microphone 105. The speech to be translated is the in-vehicle speech collected by the vehicle-mounted microphone 105; it may be the voice of an occupant speaking, or sound emitted inside the vehicle by a sound-producing device, for example an audio file played on a mobile phone.

Step S102: perform speech recognition on the speech to be translated to obtain the text to be translated.

In the embodiments of the present invention, the purpose of speech recognition is to convert the speech uttered by the user into text, that is, a character sequence readable by the intelligent vehicle-mounted terminal 100. The method of performing speech recognition on the speech to be translated may include the following.

First, the speech to be translated is preprocessed to remove the effect of noise, and acoustic features are extracted from it to obtain the audio data to be translated. Acoustic feature extraction both compresses the information in the speech to be translated and facilitates the subsequent speech recognition.

Second, the audio data to be translated is input into a pre-established audio recognition model and processed to obtain the text to be translated. The model first calculates the probability that the speech to be translated corresponds to each syllable, yielding syllable sequences for the speech; it then calculates, from the multiple syllable sequences, the probabilities of the corresponding word sequences; finally, it selects the word sequence for which both the syllable sequence probability and the word sequence probability are highest as the speech recognition result, that is, the text to be translated.

Referring to Fig. 3, step S102 may further include the following sub-steps:

Sub-step S1021: convert the speech to be translated into audio data to be translated.

In the embodiments of the present invention, speech enhancement is first applied to the speech to be translated by removing noise and channel distortion. The enhanced speech is then divided into frames, and a Fourier transform is applied to each frame to extract that frame's feature vector. The feature vectors of all frames of the speech to be translated constitute the audio data to be translated. For example, if the speech to be translated is "你好" ("hello") and framing yields two frames, "你" and "好", the audio data to be translated may include the feature vector of "你" and the feature vector of "好".
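The framing and Fourier-transform step can be illustrated with a small pure-Python sketch. Several things here are assumptions rather than part of the source: the function names, the use of fixed-length overlapping frames (the example above frames one syllable per frame), the frame length and hop, and the synthetic test tone. Speech enhancement is not shown.

```python
import cmath
import math


def split_into_frames(samples, frame_len, hop):
    """Cut a sample sequence into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive O(n^2) discrete Fourier transform; magnitudes of bins 0..n/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


# Synthetic input: a 1 kHz tone sampled at 8 kHz, standing in for microphone audio.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]

frames = split_into_frames(signal, frame_len=16, hop=8)
features = [magnitude_spectrum(f) for f in frames]  # one feature vector per frame
print(len(frames), len(features[0]))
```

With a 16-sample frame at 8 kHz, each bin spans 500 Hz, so the 1 kHz tone shows up as a peak in bin 2 of every frame's feature vector.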

Sub-step S1022: input the audio data to be translated into the pre-established audio recognition model to obtain the text to be translated corresponding to the audio data.

In the embodiments of the present invention, the audio recognition model includes an acoustic model, a language model, and a search space. The acoustic model is obtained by statistically modeling the acoustic features of a large number of speech samples, and is used to calculate the probability that the speech to be translated corresponds to each syllable. The language model is obtained by training on a large number of text samples and using probabilistic and statistical methods to model the statistical regularities of words; it is used to calculate the probabilities of the word sequences corresponding to multiple syllable sequences.

The search space is a syllable-level network whose nodes are syllables. It is built as follows. First, a word-level network is formed whose nodes are the words that may appear in the text to be translated. These words can be determined in advance from the application scenario of the speech to be translated; for example, if the scenario is business reception, common words include hotels, countries, cities, and so on. Then, the word-level network is expanded into syllables to obtain the corresponding syllable-level network, which is the search space. For example, if the word-level network contains "hotel", the expanded syllable-level network may contain the syllable corresponding to "ho" and the syllable corresponding to "tel".
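The expansion from a word-level network to a syllable-level network can be sketched with a lookup table. The lexicon below, its syllable splits, and the function name are illustrative assumptions; the source specifies only the "hotel" to "ho" + "tel" example, and a real search space would be a graph rather than a flat dictionary.

```python
# Hypothetical pronunciation lexicon: word -> syllables. Entries are illustrative only.
LEXICON = {
    "hotel": ["ho", "tel"],
    "city": ["ci", "ty"],
    "country": ["coun", "try"],
}


def expand_to_syllable_network(word_network, lexicon):
    """Replace each word node by its chain of syllable nodes."""
    return {word: list(lexicon[word]) for word in word_network}


# Word-level network for an assumed business-reception scenario.
search_space = expand_to_syllable_network(["hotel", "city"], LEXICON)
print(search_space)
```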

Speech recognition with the pre-established audio recognition model proceeds as follows: the audio data to be translated is input into the search space; the word sequence with the highest probability is determined in the search space according to the acoustic model and the language model; and that word sequence is taken as the text to be translated corresponding to the audio data.
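One common way to pick the "highest-probability" word sequence is to combine the two models' scores by multiplying them (adding their logarithms). The source does not state how the scores are combined, so that rule is an assumption here, as are the candidate sequences and the made-up probabilities.

```python
import math

# Hypothetical scores: acoustic-model probability of the audio given each
# candidate word sequence, and language-model probability of the sequence.
acoustic_prob = {("book", "hotel"): 0.30, ("book", "hot", "tell"): 0.35}
language_prob = {("book", "hotel"): 0.20, ("book", "hot", "tell"): 0.01}


def best_word_sequence(candidates):
    """Return the candidate maximising log P_acoustic + log P_language."""
    return max(candidates,
               key=lambda seq: math.log(acoustic_prob[seq]) + math.log(language_prob[seq]))


text = " ".join(best_word_sequence(list(acoustic_prob)))
print(text)
```

In this toy case the second candidate fits the audio slightly better, but the language model considers it far less plausible, so the first candidate wins overall.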

Step S103: perform language identification on the text to be translated to obtain the language to be translated corresponding to the text.

In the embodiments of the present invention, once the text to be translated has been obtained, language features are first extracted from it. A language feature is a distinctive marker that sets one language apart from others; such features typically include distinctive letters, distinctive letter combinations, and the types and counts of diacritics. Feature extraction picks out the distinctive letters, letter combinations, and diacritic types and counts that occur repeatedly and frequently in the text to be translated. The intelligent vehicle-mounted terminal 100 stores a language database in advance, which holds multiple language templates and the template language corresponding to each. After the language features of the text to be translated have been obtained, they are matched one by one against the language templates in the language database to obtain the language to be translated corresponding to the text.
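As a rough illustration of the feature-extraction step, the sketch below counts one hypothetical kind of feature, doubled vowels, and keeps the most frequent ones. The choice of feature, the function name, the `top_n` parameter, and the sample text are assumptions; the source leaves the extraction procedure open and also mentions distinctive single letters and diacritics, which are not handled here.

```python
from collections import Counter


def extract_language_features(text, top_n=3):
    """Return the top_n most frequent doubled-vowel bigrams in the text."""
    vowels = "aeiou"
    letters = text.lower()
    counts = Counter(letters[i:i + 2] for i in range(len(letters) - 1)
                     if letters[i] == letters[i + 1] and letters[i] in vowels)
    return [bigram for bigram, _ in counts.most_common(top_n)]


sample = "alleen door de mens ook alleen"
print(extract_language_features(sample))
```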

Referring to Fig. 4, step S103 may further include the following sub-steps:

Sub-step S1031: perform feature extraction on the text to be translated to obtain its language features.

In the embodiments of the present invention, a language feature is a distinctive marker that sets one language apart from others; such features typically include distinctive letters, distinctive letter combinations, and the types and counts of diacritics. For example, if the text to be translated is "Een van de hoofdkenmerken van dit communicatiesysteem is dat het alleen door de mens kan worden voortgebracht en gebruikt en meestal ook alleen door de mens wordt begrepen", the extracted language features are "oo, ee, en".

Sub-step S1032: match the language features one by one against the multiple language templates in the language database to determine the language to be translated corresponding to the language features.

In the embodiments of the present invention, a language database is stored in advance on the intelligent vehicle-mounted terminal 100. The database includes multiple language templates and the template language corresponding to each; for example, the language template "oo, aa, uu, ee" corresponds to the template language Dutch. The language features obtained in sub-step S1031 are matched one by one against the language templates in the language database, and the template language of a target language template whose matching degree reaches a predetermined threshold is taken as the language to be translated.

For example, if the language features of the text to be translated are "oo, ee, uu" and the language template is "oo, aa, uu, ee", the matching degree is (3/4) × 100% = 75%, where 4 is the number of features in the language template and 3 is the number of language features of the text that coincide with features in the template. The predetermined threshold is an empirical value; for example, for the sake of accuracy it may be set to 70%.

Among the templates whose matching degree is greater than or equal to the predetermined threshold, the template language of the one with the highest matching degree is taken as the language to be translated. For example, if the predetermined threshold is 70%, the matching degree between the text to be translated and the Dutch template is 72%, and the matching degree with the Afrikaans template is 80%, then Afrikaans is taken as the language to be translated corresponding to the text.
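The matching-degree arithmetic and the threshold rule can be checked with a short sketch. The function names and the German template are invented for illustration; only the Dutch template, the "oo, ee, uu" features, and the 70% threshold come from the text.

```python
def match_degree(features, template):
    """Fraction of template features that also occur among the text's features."""
    return len(set(features) & set(template)) / len(template)


def identify_language(features, templates, threshold=0.70):
    """Pick the best-matching template language at or above the threshold, else None."""
    scored = {lang: match_degree(features, tpl) for lang, tpl in templates.items()}
    eligible = {lang: s for lang, s in scored.items() if s >= threshold}
    return max(eligible, key=eligible.get) if eligible else None


# The German entry is a made-up template used only to give the matcher a second option.
templates = {
    "Dutch": ["oo", "aa", "uu", "ee"],
    "German": ["sch", "ei", "ä", "ö"],
}
features = ["oo", "ee", "uu"]
print(match_degree(features, templates["Dutch"]))  # 0.75
print(identify_language(features, templates))      # Dutch
```

Returning `None` when no template reaches the threshold is a design choice of this sketch; the source does not say what happens in that case.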

In one implementation, the intelligent vehicle-mounted terminal 100 may instead be communicatively connected to a language database server on which the language database is stored. The intelligent vehicle-mounted terminal 100 sends the language features of the text to be translated to the language database server, which performs the language feature matching and returns the matching result to the intelligent vehicle-mounted terminal 100.

Step S104: according to the preset correspondence between the language to be translated and the target language, translate the text to be translated into target-language text and output it as speech.

In the embodiments of the present invention, the user sets the correspondence between the language to be translated and the target language in advance through the operation interface of the intelligent vehicle-mounted terminal 100. The target language of the target-language text is then obtained from the language to be translated determined in step S103 and this preset correspondence. For example, suppose the user sets the correspondence to two-way Chinese-English translation. When the language to be translated is Chinese, the target language is English, and the text to be translated must be translated into English text; when the language to be translated is English, the target language is Chinese, and the text to be translated must be translated into Chinese text.
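A two-way correspondence of this kind can be held in a small lookup table. The function names and the language codes "zh" and "en" are assumptions made for the sketch.

```python
def build_mutual_pairs(lang_a, lang_b):
    """A two-way correspondence: each language maps to the other."""
    return {lang_a: lang_b, lang_b: lang_a}


def target_language_for(source, pairs):
    """Look up the target language; raise if the detected language is not configured."""
    if source not in pairs:
        raise KeyError(f"no target language configured for {source!r}")
    return pairs[source]


pairs = build_mutual_pairs("zh", "en")  # user-configured Chinese-English translation
print(target_language_for("zh", pairs), target_language_for("en", pairs))
```

Because the direction is chosen from the detected language, neither speaker has to switch modes between turns.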

In this embodiment of the present invention, translating the text to be translated into target-language text may be implemented by inputting the text to be translated into a pre-established recurrent neural network model, which yields the corresponding target-language text. The recurrent neural network model has two stages, encoding and decoding. First, the text to be translated is input into the model and encoded to obtain the corresponding vector sequence. Then the vector sequence is decoded: the probability of each word of the target-language text is computed, and the target-language text is finally obtained.

As one implementation, the process of inputting the text to be translated into the pre-established recurrent neural network model to obtain the corresponding target-language text may be as follows:

First, the text to be translated is encoded to obtain its corresponding vector sequence. Each word of the text to be translated is input in turn into the recurrent neural network model and converted into a word vector, and the word vector of each word also serves as an input when generating the word vector of the next word. As a result, the word vector of the last word contains the word-vector information of all the words before it, so the word vector of the last word is used as the vector sequence corresponding to the text to be translated.

Second, the vector sequence corresponding to the text to be translated is decoded: the probability of each word of the target-language text is computed one word at a time, and the target-language text is finally obtained. Each decoding step produces one word of the target-language text. When a word is decoded, the previously decoded word is taken as input, the probability of each word in the target-language text is determined, and the word with the highest probability is selected as the next word of the target-language text.
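The encoding and decoding stages can be sketched with a toy recurrent network in plain Python. This is a minimal sketch under stated assumptions, not the patent's model: the class `ToySeq2Seq`, the tanh recurrence, the softmax output, the `<s>` and `</s>` marker words, and the random untrained weights are all illustrative choices. Picking the most probable word at every step, as the text describes, is commonly called greedy decoding.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_xh, w_hh, b_h, x, h):
    """One recurrent step: new hidden state from current input and previous state."""
    a, b = matvec(w_xh, x), matvec(w_hh, h)
    return [math.tanh(ai + bi + ci) for ai, bi, ci in zip(a, b, b_h)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToySeq2Seq:
    """Untrained encoder-decoder; weights are random, so outputs are arbitrary."""

    def __init__(self, src_vocab, tgt_vocab, embed=4, hidden=6, seed=0):
        rng = random.Random(seed)

        def vec(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        def mat(rows, cols):
            return [vec(cols) for _ in range(rows)]

        self.tgt_vocab = list(tgt_vocab)  # must contain "<s>" and "</s>"
        self.hidden = hidden
        self.src_emb = {w: vec(embed) for w in src_vocab}
        self.tgt_emb = {w: vec(embed) for w in self.tgt_vocab}
        self.enc = (mat(hidden, embed), mat(hidden, hidden), [0.0] * hidden)
        self.dec = (mat(hidden, embed), mat(hidden, hidden), [0.0] * hidden)
        self.w_hy = mat(len(self.tgt_vocab), hidden)
        self.b_y = [0.0] * len(self.tgt_vocab)

    def encode(self, words):
        """Feed words in order; the final state summarises all of them."""
        h = [0.0] * self.hidden
        for word in words:
            h = rnn_step(*self.enc, self.src_emb[word], h)
        return h

    def decode(self, context, max_len=10):
        """Emit one word per step, always taking the most probable one."""
        h, previous, output = context, "<s>", []
        for _ in range(max_len):
            h = rnn_step(*self.dec, self.tgt_emb[previous], h)
            scores = [s + b for s, b in zip(matvec(self.w_hy, h), self.b_y)]
            probabilities = softmax(scores)
            previous = self.tgt_vocab[probabilities.index(max(probabilities))]
            if previous == "</s>":
                break
            output.append(previous)
        return output
```

Calling `encode` on a list of source words and passing the result to `decode` mirrors the two stages; with untrained weights the emitted words carry no meaning, so the sketch shows only the data flow.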

It should be noted that the recurrent neural network model in step S104 is established in advance through training. The training process is as follows:

First, the input parameters of the recurrent neural network model are initialized. The input parameters include the embedding size, encoder input direction, encoding depth, decoding depth, neural unit type, and number of iterations. The embedding size is the length of the vector that represents a word, that is, the word-vector length. The encoder input direction is the order in which the input sequence of the text to be translated is fed in, and may be forward, reverse, or forward-and-reverse: forward inputs the text in its original order, reverse inputs it in reversed order, and forward-and-reverse inputs it once in its original order and once in reversed order. The encoding depth is the number of layers of the encoding neural network, and the decoding depth is the number of layers of the decoding neural network. Common neural unit types are the simple recurrent neural network, the long short-term memory (LSTM) network, and the gated recurrent unit (GRU). Once the neural unit type is determined, the set size of the corresponding neural network, the number of neurons in each layer, the activation function of each layer, and the initial values of the gradient weights must also be initialized. The number of iterations mainly affects the training loss of the neural network.
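The input parameters listed above could be grouped into a single configuration object, for example as follows. The names `Seq2SeqConfig`, `InputDirection`, `UnitType`, and `order_inputs`, and all default values, are assumptions made for illustration; the patent names the parameters but gives no values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence


class InputDirection(Enum):
    FORWARD = "forward"              # original word order
    REVERSE = "reverse"              # reversed word order
    BIDIRECTIONAL = "bidirectional"  # both orders ("forward-and-reverse")


class UnitType(Enum):
    SIMPLE_RNN = "rnn"
    LSTM = "lstm"
    GRU = "gru"


@dataclass(frozen=True)
class Seq2SeqConfig:
    # All defaults are illustrative placeholders, not values from the patent.
    embedding_size: int = 256
    input_direction: InputDirection = InputDirection.FORWARD
    encoder_depth: int = 2
    decoder_depth: int = 2
    unit_type: UnitType = UnitType.LSTM
    iterations: int = 10_000

    def __post_init__(self) -> None:
        for name in ("embedding_size", "encoder_depth", "decoder_depth", "iterations"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")


def order_inputs(words: Sequence[str], direction: InputDirection) -> List[List[str]]:
    """Return the input sequence(s) the encoder would read for a given direction."""
    forward = list(words)
    if direction is InputDirection.FORWARD:
        return [forward]
    if direction is InputDirection.REVERSE:
        return [forward[::-1]]
    return [forward, forward[::-1]]
```

Making the configuration immutable keeps a training run reproducible: the same object can be logged alongside the trained weights.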

Second, training samples are input into the recurrent neural network model for training, and the weight parameters and offsets are updated continually. Each training sample consists of a sample text to be translated and the corresponding target-language sample text. For example, {(你要去哪里？), (Where are you going?)} is one training sample, in which the sample text to be translated is (你要去哪里？) and the corresponding target-language sample text is (Where are you going?). Every recurrent neural network includes at least one input layer, one hidden layer, and one output layer. The weight parameters comprise the input-to-hidden, hidden-to-hidden, and hidden-to-output weight parameters, and the offsets comprise the offset of the hidden layer and the offset of the output layer. The training samples are input into the recurrent neural network model, the model is trained with the backpropagation through time (BPTT) algorithm, and the weight parameters and offsets are updated by gradient descent. The resulting weight and offset values define a recurrent neural network model that can translate the text to be translated into target-language text.
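For reference, the three groups of weight parameters and the two offsets can be written in the usual notation for a simple recurrent network. The symbols below, the tanh and softmax functions, the cross-entropy loss, and the learning rate η are assumptions added for illustration; the patent names only the parameters, BPTT, and gradient descent.

```latex
\begin{aligned}
h_t &= \tanh\!\left(W_{xh}\,x_t + W_{hh}\,h_{t-1} + b_h\right)
      && \text{input-to-hidden, hidden-to-hidden, hidden offset} \\
\hat{y}_t &= \operatorname{softmax}\!\left(W_{hy}\,h_t + b_y\right)
      && \text{hidden-to-output, output offset} \\
\mathcal{L} &= -\sum_{t} \log \hat{y}_t\!\left[w_t\right]
      && \text{loss over the reference words } w_t \\
\theta &\leftarrow \theta - \eta\,\frac{\partial \mathcal{L}}{\partial \theta},
      \qquad \theta \in \{W_{xh},\, W_{hh},\, W_{hy},\, b_h,\, b_y\}
      && \text{gradient-descent update}
\end{aligned}
```

BPTT computes each gradient by unrolling the recurrence over the time steps of a sample and summing the contributions of every step that reuses the same parameter.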

In this embodiment of the present invention, after the smart vehicle-mounted terminal 100 obtains the target-language text, it can output that text as speech. That is, the smart vehicle-mounted terminal 100 first converts the target-language text into target-language audio and then outputs the audio through the vehicle-mounted sounder 106. In addition, the user may fail to hear the target-language audio in time because of noise or similar causes. To improve the user experience, the text to be translated and the translated target-language text can be shown on the display device 107. The embodiment of the present invention may therefore also include step S105.

Step S105: display both the text to be translated and the target-language text corresponding to the text to be translated.

In this embodiment of the present invention, in addition to converting the target-language text into target-language audio and outputting it through the vehicle-mounted sounder 106, the smart vehicle-mounted terminal 100 can also display both the text to be translated and the corresponding target-language text on the vehicle-mounted display device 107. A user who does not hear the target-language audio because of noise or similar causes can then read the target-language text on the display device 107, which improves the user experience.

Compared with the prior art, the embodiments of the present invention have the following beneficial effects:

First, the speech to be translated is obtained in response to the user's simultaneous translation request. The request may be a voice command, an on-screen operation, or a key press, and this variety of forms makes it convenient for users to initiate a request in different scenarios.

Second, speech recognition converts the speech to be translated into the text to be translated, and automatic language identification is then performed on that text. This realizes automatic simultaneous mutual translation between two languages, so that people in the vehicle who do not share a language can communicate normally.

Third, the target-language text obtained by translating the speech to be translated is output as speech, and the text to be translated and its corresponding target-language text can also be shown on the display device 107. Providing both speech output and text display meets users' output needs in different scenarios and improves the user experience.

Second Embodiment

Referring to FIG. 5, FIG. 5 shows a schematic block diagram of the simultaneous translation device 200 provided by an embodiment of the present invention. The simultaneous translation device 200 is applied to the smart vehicle-mounted terminal 100 and includes an acquisition module 201, a speech recognition module 202, a language identification module 203, a translation module 204, and a display module 205.

The acquisition module 201 is configured to obtain the speech to be translated based on the simultaneous translation request.

The speech recognition module 202 is configured to perform speech recognition on the speech to be translated to obtain the text to be translated.

In this embodiment of the present invention, the speech recognition module 202 is specifically configured to convert the speech to be translated into audio data to be translated, and to input the audio data to be translated into a pre-established audio recognition model to obtain the text to be translated corresponding to the audio data.

The language identification module 203 is configured to perform language identification on the text to be translated to obtain the language to be translated corresponding to the text to be translated.

In this embodiment of the present invention, the language identification module 203 is specifically configured to perform feature extraction on the text to be translated to obtain its language features, and to match the language features one by one against the multiple language templates in the language database to determine the language to be translated corresponding to the language features.
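Feature extraction followed by template matching can be sketched as below. The feature chosen here (the share of CJK ideographs versus ASCII letters), the two templates, and the squared-distance comparison are assumptions for illustration only; this passage does not specify which language features or matching rule the module uses.

```python
from typing import Dict, Tuple

Features = Tuple[float, float]  # (share of CJK ideographs, share of ASCII letters)

# Hypothetical language templates: the feature values a typical text would have.
LANGUAGE_TEMPLATES: Dict[str, Features] = {"zh": (1.0, 0.0), "en": (0.0, 1.0)}


def extract_features(text: str) -> Features:
    """Count characters in the CJK Unified Ideographs block and ASCII letters."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    total = cjk + latin
    if total == 0:
        raise ValueError("text contains no recognisable letters")
    return (cjk / total, latin / total)


def identify_language(text: str, templates: Dict[str, Features] = LANGUAGE_TEMPLATES) -> str:
    """Compare the features with each template in turn and keep the closest one."""
    features = extract_features(text)
    best_language, best_distance = "", float("inf")
    for language, template in templates.items():
        distance = sum((f - t) ** 2 for f, t in zip(features, template))
        if distance < best_distance:
            best_language, best_distance = language, distance
    return best_language
```

A script-based feature separates Chinese from English well but would not tell apart languages that share an alphabet; supporting those would need richer features in the templates.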

The translation module 204 is configured to translate the text to be translated into target-language text according to the preset correspondence between the language to be translated and the target language, and to output the target-language text as speech.

The display module 205 is configured to display both the text to be translated and the target-language text corresponding to the text to be translated.
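One way to picture how modules 201 to 205 cooperate is as a chain of replaceable functions. The class `SimultaneousTranslator` and every field name below are hypothetical; the sketch shows only the order of the calls, and each function stands in for a real recognizer, identifier, translator, speech synthesizer, or display driver.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SimultaneousTranslator:
    recognize_speech: Callable[[bytes], str]    # role of module 202
    identify_language: Callable[[str], str]     # role of module 203
    translate: Callable[[str, str, str], str]   # role of module 204: (text, source, target)
    speak: Callable[[str, str], None]           # speech output: (text, language)
    display: Callable[[str, str], None]         # role of module 205: (source text, target text)
    language_pairs: Dict[str, str]              # preset source-to-target correspondence

    def handle(self, audio: bytes) -> str:
        """Process audio that the acquisition step (module 201) has already captured."""
        source_text = self.recognize_speech(audio)
        source_language = self.identify_language(source_text)
        target_language = self.language_pairs[source_language]
        target_text = self.translate(source_text, source_language, target_language)
        self.speak(target_text, target_language)
        self.display(source_text, target_text)
        return target_text
```

Passing the stages in as functions lets each one be swapped or tested on its own, for instance replacing the on-board language identifier with a call to a remote server.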

An embodiment of the present invention also discloses a computer-readable storage medium on which a computer program is stored. When the computer program is executed by the processor 103, the simultaneous translation method disclosed in the foregoing embodiments of the present invention is implemented.

In summary, the present invention provides a simultaneous translation method, a device, a smart vehicle-mounted terminal, and a storage medium. The method comprises: obtaining the speech to be translated based on a simultaneous translation request; performing speech recognition on the speech to be translated to obtain the text to be translated; performing language identification on the text to be translated to obtain the corresponding language to be translated; and translating the text to be translated into target-language text according to the preset correspondence between the language to be translated and the target language, and outputting the target-language text as speech. Compared with the prior art, the embodiments of the present invention automatically identify the language of the speech to be translated and thereby realize automatic simultaneous mutual translation between two languages, so that people in the vehicle who do not share a language can communicate normally.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may also be implemented in other ways. The device embodiments described above are only illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architecture, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and each combination of such blocks, can be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.

In addition, the functional modules in the embodiments of the present invention may be integrated together to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.

If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Furthermore, the terms "comprise" and "include", and any other variants of them, are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.

The above descriptions are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope. It should be noted that similar reference numerals and letters denote similar items in the drawings, so once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.

Claims

1. A simultaneous translation method, characterized in that it is applied to an intelligent vehicle-mounted terminal, and the method comprises:

obtaining speech to be translated based on a simultaneous translation request;

performing speech recognition on the speech to be translated to obtain the text to be translated;

performing language identification on the text to be translated to obtain the language to be translated corresponding to the text to be translated;

translating the text to be translated into target-language text according to the preset correspondence between the language to be translated and the target language, and outputting the target-language text as speech.

2. The method according to claim 1, wherein the step of performing speech recognition on the speech to be translated to obtain the text to be translated comprises:

converting the speech to be translated into audio data to be translated;

inputting the audio data to be translated into a pre-established audio recognition model to obtain the text to be translated corresponding to the audio data to be translated.

3. The method according to claim 1, wherein the intelligent vehicle-mounted terminal includes a language database in which a plurality of language templates are stored, and the step of performing language identification on the text to be translated to obtain the language to be translated corresponding to the text to be translated comprises:

performing feature extraction on the text to be translated to obtain language features of the text to be translated;

matching the language features one by one against the plurality of language templates in the language database to determine the language to be translated corresponding to the language features.

4. The method according to claim 1, wherein the step of translating the text to be translated into target-language text comprises:

inputting the text to be translated into a pre-established recurrent neural network model to obtain the target-language text corresponding to the text to be translated.

5. The method of claim 1, further comprising:

displaying both the text to be translated and the target-language text corresponding to the text to be translated.

6. A simultaneous translation device, characterized in that it is applied to an intelligent vehicle-mounted terminal, and the device comprises:

an acquisition module, configured to obtain speech to be translated based on a simultaneous translation request;

a speech recognition module, configured to perform speech recognition on the speech to be translated to obtain the text to be translated;

a language identification module, configured to perform language identification on the text to be translated to obtain the language to be translated corresponding to the text to be translated;

a translation module, configured to translate the text to be translated into target-language text according to the preset correspondence between the language to be translated and the target language, and to output the target-language text as speech.

7. The device according to claim 6, wherein the speech recognition module is specifically used for:

converting the speech to be translated into audio data to be translated;

8. The device according to claim 6, wherein the intelligent vehicle-mounted terminal includes a language database in which a plurality of language templates are stored, and the language identification module is specifically used for:

9. An intelligent vehicle-mounted terminal, characterized in that the intelligent vehicle-mounted terminal comprises a vehicle-mounted microphone and a vehicle-mounted sounder, and further comprises:

a memory;

a processor electrically connected to the vehicle-mounted microphone and the vehicle-mounted sounder; and

a simultaneous translation device stored in the memory and including one or more software function modules executed by the processor, the modules including:

an acquisition module, configured to obtain, based on a simultaneous translation request, the speech to be translated collected by the vehicle-mounted microphone;

a translation module, configured to translate the text to be translated into target-language text according to the preset correspondence between the language to be translated and the target language, and to output the target-language text as speech through the vehicle-mounted sounder.

10. A computer-readable storage medium, on which a computer program is stored, wherein, when the computer program is executed by a processor, the method according to any one of claims 1-5 is implemented.