Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it.
An embodiment of the present application provides a text recognition method. As shown in FIG. 1, the method may include:
S101, performing pattern recognition on the current image frame to obtain texture information corresponding to a text image in the current image frame.
The text recognition method provided by the embodiment of the application is suitable for scenes in which characters are recognized in a current image frame of a third-party software interface or a game User Interface (UI).
In the embodiment of the application, the terminal performing character recognition can be any device with processing and storage functions, such as a tablet computer, a mobile phone, a personal computer (Personal Computer, PC), a notebook computer, a wearable device, and the like.
In the embodiment of the application, the current image frame of a third-party software interface or a game UI interface in the terminal is acquired; the current image frame is then subjected to pattern recognition by using a preset graphics application programming interface (Application Programming Interface, API) instruction stream to obtain a pattern recognition result, and the texture information corresponding to the text image in the current image frame is determined from the pattern recognition result.
Optionally, the preset graphics APIs include, but are not limited to, OpenGL (Open Graphics Library), OpenGL ES (OpenGL for Embedded Systems), Vulkan (a "next generation" open graphics API), DirectX 11, DirectX 12, and Metal (a low-level rendering application programming interface), which may be selected according to practical situations; the embodiments of the present application are not specifically limited.
In the embodiment of the application, the preset graphics API instruction stream is used to perform pattern recognition on the current image frame, where the specific content to be recognized can be selected according to actual conditions; finally, a pattern recognition result containing the texture information corresponding to the text image is obtained.
Specifically, instructions and caches related to texture information can be extracted from the preset graphics API instruction stream to obtain a coded stream corresponding to the text image, where each text code in the coded stream contains the texture information corresponding to one text image.
It should be noted that the current image frame includes at least one image texture, and each image texture may include at least one text image, so each text image in the current image frame is located at a preset position of one image texture. Therefore, to obtain the specific position of each text image in the current image frame, the texture identification information and texture coordinate information corresponding to the text image need to be determined. Accordingly, the texture information includes the texture identification information and texture coordinate information corresponding to the text image.
For example, as shown in FIG. 2, the current image frame comprises two image textures, namely image texture 1 and image texture 2, where image texture 1 comprises the four characters "game", "play", "lung", and "inflammation", and image texture 2 comprises the four characters "fight", "hit", "add", and "addiction". Then, the texture identification information corresponding to "fight" is 2 and its texture coordinate information is (0, 0.25, 0.25, 1); the texture identification information corresponding to "hit" is 2 and its texture coordinate information is (0.25, 0.25, 1, 1); the texture identification information corresponding to "lung" is 1 and its texture coordinate information is (0, 0, 0.25, 0.25); the texture identification information corresponding to "inflammation" is 1 and its texture coordinate information is (0.25, 0, 1, 0.25).
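The coordinate tuples above can be read as normalized (u0, v0, u1, v1) rectangles inside a texture. As an illustrative sketch, not part of the embodiment (the helper name and the 256x256 atlas size are hypothetical), converting a glyph's pixel rectangle in a texture atlas to normalized coordinates might look like:

```python
def normalized_coords(x, y, w, h, atlas_w, atlas_h):
    """Convert a glyph's pixel rectangle inside a texture atlas
    to normalized (u0, v0, u1, v1) texture coordinates."""
    return (x / atlas_w, y / atlas_h, (x + w) / atlas_w, (y + h) / atlas_h)

# A 256x256 atlas holding a glyph in the top-left 64x64 cell:
print(normalized_coords(0, 0, 64, 64, 256, 256))  # (0.0, 0.0, 0.25, 0.25)
```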
S102, searching for preset text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process.
In the embodiment of the application, after the texture information corresponding to the text image is identified from the current image frame, the preset text information corresponding to the texture information is searched for in the correspondence between preset texture information and text information.
It should be noted that the correspondence between preset texture information and text information records the text recognition results of previous frames in the current text recognition scene. If the current image frame corresponds to the first round of character recognition in the current scene, the correspondence is in a blank initial state.
It should be noted that the correspondence between preset texture information and text information is stored in the form of <key, value> key-value pairs, where the preset texture information is the key and the preset text information is the value.
In the embodiment of the application, first preset texture information matching the texture information is searched for in the correspondence between preset texture information and text information; if the first preset texture information is found, the first preset text information corresponding to it is then looked up in the correspondence.
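A minimal sketch of this <key, value> lookup, assuming the key combines texture identification information with texture coordinate information (the Python dict and the sample entries are illustrative, not part of the embodiment):

```python
# Hypothetical correspondence between preset texture information (key)
# and preset text information (value).
correspondence = {
    (1, (0.0, 0.0, 0.25, 0.25)): "lung",   # texture 1, one glyph region
    (2, (0.0, 0.25, 0.25, 1.0)): "fight",  # texture 2, one glyph region
}

# Looking up the first preset text information for a piece of texture
# information; dict.get() returns None when no matching key exists.
texture_info = (1, (0.0, 0.0, 0.25, 0.25))
print(correspondence.get(texture_info))  # lung
```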
In the embodiment of the application, the text information can include Chinese characters, other East Asian characters, and other text forms, selected according to actual conditions; the embodiment of the application is not specifically limited.
S103, if first preset text information corresponding to the texture information is found in the correspondence between preset texture information and text information, determining the first preset text information as the text information corresponding to the texture information.
In the embodiment of the application, if first preset texture information matching the texture information, together with its corresponding first preset text information, is found in the correspondence between preset texture information and text information, this indicates that the first preset text information corresponding to the texture information has been found; at this point, the first preset text information is determined to be the text information corresponding to the texture information.
It can be understood that, because the first preset text information corresponding to the texture information is found in the correspondence between preset texture information and text information, the text information corresponding to the texture information can be determined rapidly; there is no need to run image recognition on every text image in the current image frame, so the text recognition speed is greatly improved.
Further, if no preset texture information matching the texture information is found in the correspondence between preset texture information and text information, the terminal determines the image data corresponding to the texture information from the current image frame, and then performs text recognition on the image data to obtain the text information corresponding to the image data.
In an embodiment of the present application, the image data is located in the current image frame based on the texture identification information and the texture coordinate information.
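For illustration, assuming the frame is available as a 2-D pixel array and the texture coordinates are normalized (u0, v0, u1, v1) values, locating the image data could be sketched as follows (the function and the pixel representation are hypothetical, not from the embodiment):

```python
def locate_image_data(frame, coords):
    """Crop the pixel region addressed by normalized (u0, v0, u1, v1)
    texture coordinates out of a frame given as a 2-D list of pixels."""
    height, width = len(frame), len(frame[0])
    u0, v0, u1, v1 = coords
    x0, y0 = int(u0 * width), int(v0 * height)
    x1, y1 = int(u1 * width), int(v1 * height)
    return [row[x0:x1] for row in frame[y0:y1]]

# A 4x4 frame of (row, col) pixels; the top-left quarter corresponds
# to texture coordinates (0, 0, 0.5, 0.5).
frame = [[(r, c) for c in range(4)] for r in range(4)]
print(locate_image_data(frame, (0.0, 0.0, 0.5, 0.5)))
```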
In the embodiment of the application, the method for performing text recognition on the image data can be a local deep-learning method, a cloud recognition method, or the like, selected according to actual conditions; the embodiment of the application is not specifically limited.
In the embodiment of the application, after character recognition is performed on the image data to obtain the character information corresponding to the image data, the mapping between the texture information and the character information can be added to the correspondence between preset texture information and text information, so that it can be used in the next round of the current text recognition scene.
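The lookup-then-update behavior described above is essentially memoization; a minimal sketch, in which the function name and the recognize callback are hypothetical stand-ins for the actual recognition step:

```python
def lookup_or_recognize(texture_info, correspondence, recognize):
    """Return cached text for texture_info if a previous frame already
    recognized it; otherwise run the expensive recognizer once and store
    the result for the next round of text recognition."""
    text = correspondence.get(texture_info)
    if text is None:
        text = recognize(texture_info)       # e.g. local or cloud OCR
        correspondence[texture_info] = text  # update the preset correspondence
    return text
```

On later frames containing the same texture information the recognizer is never invoked again, which is the source of the claimed speed-up.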
For example, the four characters "lung", "inflammation", "fight", and "hit" are added to the correspondence between preset texture information and text information, as shown in Table 1.
Table 1: Correspondence between preset texture information and text information

Texture identification information | Texture coordinate information | Text information
1 | (0, 0, 0.25, 0.25) | lung
1 | (0.25, 0, 1, 0.25) | inflammation
2 | (0, 0.25, 0.25, 1) | fight
2 | (0.25, 0.25, 1, 1) | hit
Based on the above embodiment, the scheme of the application can be realized through a pattern recognition module, a text management module, and an image recognition module, where the text management module stores the correspondence between preset texture information and text information. The specific implementation process, shown in FIG. 3, is as follows:
1. The graphics API instruction stream is input into the pattern recognition module, and the pattern recognition module extracts the texture identification information and texture coordinate information corresponding to the text image from the graphics API instruction stream.
2. The text information corresponding to the texture identification information and texture coordinate information is searched for in the correspondence between preset texture information and text information held by the text management module.
3. If the first preset text information corresponding to the texture identification information and texture coordinate information is found, the flow ends.
4. If no text information corresponding to the texture identification information and texture coordinate information is found, the texture identification information and texture coordinate information are input into the image recognition module.
5. The image recognition module recognizes the text information corresponding to the texture identification information and texture coordinate information.
6. The image recognition module inputs the texture identification information and texture coordinate information, together with the corresponding text information, into the text management module.
7. The text management module updates the correspondence between preset texture information and text information with the texture identification information and texture coordinate information and the corresponding text information.
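The seven steps above can be sketched as three cooperating modules. The class names and the stub recognizer below are illustrative only; a real image recognition module would crop the frame at the texture coordinates and run local or cloud OCR:

```python
class TextManagementModule:
    """Holds the correspondence between preset texture information and text."""
    def __init__(self):
        self.table = {}  # (texture id, texture coords) -> text

    def find(self, texture_info):
        return self.table.get(texture_info)

    def update(self, texture_info, text):
        self.table[texture_info] = text


class ImageRecognitionModule:
    """Stand-in for the OCR step; `ocr` is any callable mapping texture
    information to text. Tracks how often recognition actually runs."""
    def __init__(self, ocr):
        self.ocr = ocr
        self.calls = 0

    def recognize(self, texture_info):
        self.calls += 1
        return self.ocr(texture_info)


def recognize_frame(texture_infos, text_mgmt, image_rec):
    """Steps 2-7: look each piece of texture information up; fall back to
    image recognition and write the new result back into the table."""
    results = {}
    for info in texture_infos:
        text = text_mgmt.find(info)            # step 2
        if text is None:                       # step 4
            text = image_rec.recognize(info)   # step 5
            text_mgmt.update(info, text)       # steps 6-7
        results[info] = text                   # step 3
    return results
```

Processing a second frame containing the same texture information hits the text management module's table and skips the image recognition module entirely.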
It can be understood that, in the text recognition process, the terminal stores the text recognition results of previous frames. After obtaining the texture information corresponding to the text image in the current image frame, the terminal searches for the corresponding text information directly in the correspondence between preset texture information and text information. If the first preset text information corresponding to the texture information is found, it is directly determined as the text information corresponding to the texture information; no text recognition needs to be performed on each text image, so the text recognition speed is greatly improved.
Embodiment Two
The embodiment of the application provides a terminal. As shown in FIG. 4, the terminal 1 includes:
The pattern recognition module 10 is configured to perform pattern recognition on a current image frame to obtain texture information corresponding to a text image in the current image frame;
The searching module 11 is configured to search for text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process;
The determining module 12 is configured to determine, if first preset text information corresponding to the texture information is found in the correspondence between preset texture information and text information, the first preset text information as the text information corresponding to the texture information.
Optionally, the terminal further comprises a text recognition module;
The determining module 12 is further configured to determine, if no text information corresponding to the texture information is found in the correspondence between preset texture information and text information, image data corresponding to the texture information from the current image frame;
The text recognition module is configured to perform text recognition on the image data to obtain the text information corresponding to the image data.
Optionally, the terminal updates the mapping between the texture information and the text information into the correspondence between preset texture information and text information.
Optionally, the current image frame includes at least one image texture, where each image texture corresponds to one piece of texture identification information;
The texture information comprises the texture identification information and texture coordinate information corresponding to the text image.
Optionally, the terminal further comprises a positioning module;
The positioning module is configured to locate the image data in the current image frame based on the texture identification information and the texture coordinate information.
Optionally, the pattern recognition module 10 is further configured to perform pattern recognition on the current image frame by using a preset graphics API instruction stream to obtain a pattern recognition result;
The determining module 12 is further configured to determine the texture information corresponding to a text image in the current image frame from the pattern recognition result.
The terminal provided by the embodiment of the application performs pattern recognition on a current image frame to obtain texture information corresponding to a text image in the current image frame, and searches for text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process. If first preset text information corresponding to the texture information is found in the correspondence, the first preset text information is determined as the text information corresponding to the texture information. Thus, the terminal of this embodiment stores the text recognition results of previous frames; after obtaining the texture information corresponding to the text image in the current image frame, it looks up the corresponding text information directly, and when a match is found, no text recognition needs to be performed on each text image, so the text recognition speed is greatly improved.
FIG. 5 is a schematic diagram of a second composition structure of the terminal 1 according to the embodiment of the present application. In practical application, based on the same inventive concept as the above embodiment, as shown in FIG. 5, the terminal 1 of this embodiment includes a processor 13, a memory 14, and a communication bus 15.
In a specific embodiment, the pattern recognition module 10, the searching module 11, the determining module 12, the text recognition module, and the positioning module may be implemented by the processor 13 located on the terminal 1, where the processor 13 may be at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a CPU, a controller, a microcontroller, and a microprocessor. It will be appreciated that the electronics used to implement the above processor functions may differ for different devices, and the present embodiment is not specifically limited.
In the embodiment of the present application, the communication bus 15 is used to implement connection communication between the processor 13 and the memory 14, and the processor 13 implements the following text recognition method when executing the running program stored in the memory 14:
The method comprises: performing pattern recognition on a current image frame to obtain texture information corresponding to a text image in the current image frame; searching for text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process; and, if first preset text information corresponding to the texture information is found in the correspondence, determining the first preset text information as the text information corresponding to the texture information.
Further, the processor 13 is further configured to determine image data corresponding to the texture information from the current image frame if text information corresponding to the texture information is not found from a preset correspondence between the texture information and the text information, and perform text recognition on the image data to obtain text information corresponding to the image data.
Further, the processor 13 is further configured to update the mapping relationship between the texture information and the text information to the corresponding relationship between the preset texture information and the text information.
Further, the current image frame comprises at least one image texture, wherein each image texture corresponds to one texture identification information, and the texture information comprises texture identification information and texture coordinate information corresponding to the text image.
Further, the above processor 13 is further configured to locate the image data from the current image frame based on the texture identification information and the texture coordinate information.
Further, the processor 13 is further configured to perform pattern recognition on the current image frame by using a preset graphics API instruction stream to obtain a pattern recognition result, and determine the texture information corresponding to a text image in the current image frame from the pattern recognition result.
An embodiment of the present application provides a computer-readable storage medium on which one or more programs are stored; the one or more programs are executable by one or more processors and applied to a terminal, and when executed, implement the text recognition method described above.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described method may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware, though in many cases the former is the preferred embodiment. Based on such understanding, the technical solution of the present disclosure may be embodied, essentially or in the part contributing to the related art, in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), including several instructions for causing a device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the method described in the embodiments of the present disclosure.
The foregoing description is only of the preferred embodiments of the present application, and is not intended to limit the scope of the present application.