Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it.
An embodiment of the present application provides a text recognition method. As shown in FIG. 1, the method may include:
S101, performing pattern recognition on the current image frame to obtain texture information corresponding to a text image in the current image frame.
The text recognition method provided by the embodiment of the application is suitable for scenes in which characters are recognized in a current image frame of a third-party software interface or a game User Interface (UI).
In the embodiment of the application, the terminal performing character recognition can be any device with processing and storage functions, such as a tablet computer, a mobile phone, a personal computer (Personal Computer, PC), a notebook computer, a wearable device, and the like.
In the embodiment of the application, the current image frame of a third-party software interface or a game UI interface in the terminal is acquired; the current image frame is then subjected to pattern recognition by using a preset graphics application programming interface (Application Programming Interface, API) instruction stream to obtain a pattern recognition result, and the texture information corresponding to the text image in the current image frame is determined from the pattern recognition result.
Optionally, the preset graphics APIs include, but are not limited to, OpenGL (Open Graphics Library), OpenGL ES (OpenGL for Embedded Systems), Vulkan (a "next generation" open graphics API), DirectX 11, DirectX 12, and Metal (a low-level rendering application programming interface), which may be selected according to practical situations; the embodiments of the present application are not specifically limited.
In the embodiment of the application, the preset graphics API instruction stream is used to perform pattern recognition on the current image frame, where the specific content to be recognized can be selected according to actual conditions; finally, a pattern recognition result containing the texture information corresponding to the text image is obtained.
Specifically, instructions and caches related to texture information can be extracted from the preset graphics API instruction stream to obtain a coded stream corresponding to the text image, where each text code in the coded stream contains the texture information corresponding to one text image.
It should be noted that the current image frame includes at least one image texture, and each image texture may include at least one text image, so each text image in the current image frame is located at a preset position of one image texture. Therefore, to obtain the specific position of each text image in the current image frame, the texture identification information and texture coordinate information corresponding to the text image need to be determined. Accordingly, the texture information includes the texture identification information and texture coordinate information corresponding to the text image.
For example, as shown in FIG. 2, the current image frame comprises two image textures, namely image texture 1 and image texture 2, where image texture 1 comprises the four characters "game", "play", "lung", and "inflammation", and image texture 2 comprises the four characters "fight", "hit", "add", and "addiction". Then, the texture identification information corresponding to "fight" is 2 and its texture coordinate information is (0, 0.25, 0.25, 1); the texture identification information corresponding to "hit" is 2 and its texture coordinate information is (0.25, 0.25, 1, 1); the texture identification information corresponding to "lung" is 1 and its texture coordinate information is (0, 0, 0.25, 0.25); the texture identification information corresponding to "inflammation" is 1 and its texture coordinate information is (0.25, 0, 1, 0.25).
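The coordinate tuples above can be read as normalized (u0, v0, u1, v1) rectangles inside a texture. As an illustrative sketch, not part of the embodiment (the helper name and the 256x256 atlas size are hypothetical), converting a glyph's pixel rectangle in a texture atlas to normalized coordinates might look like:

```python
def normalized_coords(x, y, w, h, atlas_w, atlas_h):
    """Convert a glyph's pixel rectangle inside a texture atlas
    to normalized (u0, v0, u1, v1) texture coordinates."""
    return (x / atlas_w, y / atlas_h, (x + w) / atlas_w, (y + h) / atlas_h)

# A 256x256 atlas holding a glyph in the top-left 64x64 cell:
print(normalized_coords(0, 0, 64, 64, 256, 256))  # (0.0, 0.0, 0.25, 0.25)
```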
S102, searching for preset text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process.
In the embodiment of the application, after the texture information corresponding to the text image is identified from the current image frame, the preset text information corresponding to the texture information is searched for in the correspondence between preset texture information and text information.
It should be noted that the correspondence between preset texture information and text information records the text recognition results of previous frames in the current text recognition scene. If the current image frame corresponds to the first round of character recognition in the current scene, the correspondence is in a blank initial state.
It should be noted that the correspondence between preset texture information and text information is stored in the form of <key, value> key-value pairs, where the preset texture information is the key and the preset text information is the value.
In the embodiment of the application, first preset texture information matching the texture information is searched for in the correspondence between preset texture information and text information; if the first preset texture information is found, the first preset text information corresponding to it is then looked up in the correspondence.
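A minimal sketch of this <key, value> lookup, assuming the key combines texture identification information with texture coordinate information (the Python dict and the sample entries are illustrative, not part of the embodiment):

```python
# Hypothetical correspondence between preset texture information (key)
# and preset text information (value).
correspondence = {
    (1, (0.0, 0.0, 0.25, 0.25)): "lung",   # texture 1, one glyph region
    (2, (0.0, 0.25, 0.25, 1.0)): "fight",  # texture 2, one glyph region
}

# Looking up the first preset text information for a piece of texture
# information; dict.get() returns None when no matching key exists.
texture_info = (1, (0.0, 0.0, 0.25, 0.25))
print(correspondence.get(texture_info))  # lung
```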
In the embodiment of the application, the text information can include Chinese characters, other East Asian characters, and other text forms, selected according to actual conditions; the embodiment of the application is not specifically limited.
S103, if first preset text information corresponding to the texture information is found in the correspondence between preset texture information and text information, determining the first preset text information as the text information corresponding to the texture information.
In the embodiment of the application, if first preset texture information matching the texture information, together with its corresponding first preset text information, is found in the correspondence between preset texture information and text information, this indicates that the first preset text information corresponding to the texture information has been found; at this point, the first preset text information is determined to be the text information corresponding to the texture information.
It can be understood that, because the first preset text information corresponding to the texture information is found in the correspondence between preset texture information and text information, the text information corresponding to the texture information can be determined rapidly; there is no need to run image recognition on every text image in the current image frame, so the text recognition speed is greatly improved.
Further, if no preset texture information matching the texture information is found in the correspondence between preset texture information and text information, the terminal determines the image data corresponding to the texture information from the current image frame, and then performs text recognition on the image data to obtain the text information corresponding to the image data.
In an embodiment of the present application, the image data is located in the current image frame based on the texture identification information and the texture coordinate information.
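For illustration, assuming the frame is available as a 2-D pixel array and the texture coordinates are normalized (u0, v0, u1, v1) values, locating the image data could be sketched as follows (the function and the pixel representation are hypothetical, not from the embodiment):

```python
def locate_image_data(frame, coords):
    """Crop the pixel region addressed by normalized (u0, v0, u1, v1)
    texture coordinates out of a frame given as a 2-D list of pixels."""
    height, width = len(frame), len(frame[0])
    u0, v0, u1, v1 = coords
    x0, y0 = int(u0 * width), int(v0 * height)
    x1, y1 = int(u1 * width), int(v1 * height)
    return [row[x0:x1] for row in frame[y0:y1]]

# A 4x4 frame of (row, col) pixels; the top-left quarter corresponds
# to texture coordinates (0, 0, 0.5, 0.5).
frame = [[(r, c) for c in range(4)] for r in range(4)]
print(locate_image_data(frame, (0.0, 0.0, 0.5, 0.5)))
```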
In the embodiment of the application, the method for performing text recognition on the image data can be a local deep-learning method, a cloud recognition method, or the like, selected according to actual conditions; the embodiment of the application is not specifically limited.
In the embodiment of the application, after character recognition is performed on the image data to obtain the character information corresponding to the image data, the mapping between the texture information and the character information can be added to the correspondence between preset texture information and text information, so that it can be used in the next round of the current text recognition scene.
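The lookup-then-update behavior described above is essentially memoization; a minimal sketch, in which the function name and the recognize callback are hypothetical stand-ins for the actual recognition step:

```python
def lookup_or_recognize(texture_info, correspondence, recognize):
    """Return cached text for texture_info if a previous frame already
    recognized it; otherwise run the expensive recognizer once and store
    the result for the next round of text recognition."""
    text = correspondence.get(texture_info)
    if text is None:
        text = recognize(texture_info)       # e.g. local or cloud OCR
        correspondence[texture_info] = text  # update the preset correspondence
    return text
```

On later frames containing the same texture information the recognizer is never invoked again, which is the source of the claimed speed-up.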
For example, the four characters "lung", "inflammation", "fight", and "hit" are added to the correspondence between preset texture information and text information, as shown in Table 1.
Table 1: Correspondence between preset texture information and text information

Texture identification information | Texture coordinate information | Text information
1 | (0, 0, 0.25, 0.25) | lung
1 | (0.25, 0, 1, 0.25) | inflammation
2 | (0, 0.25, 0.25, 1) | fight
2 | (0.25, 0.25, 1, 1) | hit
Based on the above embodiment, the scheme of the application can be realized through a pattern recognition module, a text management module, and an image recognition module, where the text management module stores the correspondence between preset texture information and text information. The specific implementation process, shown in FIG. 3, is as follows:
1. The graphics API instruction stream is input into the pattern recognition module, and the pattern recognition module extracts the texture identification information and texture coordinate information corresponding to the text image from the graphics API instruction stream.
2. The text information corresponding to the texture identification information and texture coordinate information is searched for in the correspondence between preset texture information and text information held by the text management module.
3. If the first preset text information corresponding to the texture identification information and texture coordinate information is found, the flow ends.
4. If no text information corresponding to the texture identification information and texture coordinate information is found, the texture identification information and texture coordinate information are input into the image recognition module.
5. The image recognition module recognizes the text information corresponding to the texture identification information and texture coordinate information.
6. The image recognition module inputs the texture identification information and texture coordinate information, together with the corresponding text information, into the text management module.
7. The text management module updates the correspondence between preset texture information and text information with the texture identification information and texture coordinate information and the corresponding text information.
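The seven steps above can be sketched as three cooperating modules. The class names and the stub recognizer below are illustrative only; a real image recognition module would crop the frame at the texture coordinates and run local or cloud OCR:

```python
class TextManagementModule:
    """Holds the correspondence between preset texture information and text."""
    def __init__(self):
        self.table = {}  # (texture id, texture coords) -> text

    def find(self, texture_info):
        return self.table.get(texture_info)

    def update(self, texture_info, text):
        self.table[texture_info] = text


class ImageRecognitionModule:
    """Stand-in for the OCR step; `ocr` is any callable mapping texture
    information to text. Tracks how often recognition actually runs."""
    def __init__(self, ocr):
        self.ocr = ocr
        self.calls = 0

    def recognize(self, texture_info):
        self.calls += 1
        return self.ocr(texture_info)


def recognize_frame(texture_infos, text_mgmt, image_rec):
    """Steps 2-7: look each piece of texture information up; fall back to
    image recognition and write the new result back into the table."""
    results = {}
    for info in texture_infos:
        text = text_mgmt.find(info)            # step 2
        if text is None:                       # step 4
            text = image_rec.recognize(info)   # step 5
            text_mgmt.update(info, text)       # steps 6-7
        results[info] = text                   # step 3
    return results
```

Processing a second frame containing the same texture information hits the text management module's table and skips the image recognition module entirely.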
It can be understood that, in the text recognition process, the terminal stores the text recognition results of previous frames. After obtaining the texture information corresponding to the text image in the current image frame, the terminal searches for the corresponding text information directly in the correspondence between preset texture information and text information. If the first preset text information corresponding to the texture information is found, it is directly determined as the text information corresponding to the texture information; no text recognition needs to be performed on each text image, so the text recognition speed is greatly improved.
Embodiment Two
The embodiment of the application provides a terminal. As shown in FIG. 4, the terminal 1 includes:
The pattern recognition module 10 is configured to perform pattern recognition on a current image frame to obtain texture information corresponding to a text image in the current image frame;
The searching module 11 is configured to search for text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process;
The determining module 12 is configured to determine, if first preset text information corresponding to the texture information is found in the correspondence between preset texture information and text information, the first preset text information as the text information corresponding to the texture information.
Optionally, the terminal further comprises a text recognition module;
The determining module 12 is further configured to determine, if no text information corresponding to the texture information is found in the correspondence between preset texture information and text information, image data corresponding to the texture information from the current image frame;
The text recognition module is configured to perform text recognition on the image data to obtain the text information corresponding to the image data.
Optionally, the terminal updates the mapping between the texture information and the text information into the correspondence between preset texture information and text information.
Optionally, the current image frame includes at least one image texture, where each image texture corresponds to one piece of texture identification information;
The texture information comprises the texture identification information and texture coordinate information corresponding to the text image.
Optionally, the terminal further comprises a positioning module;
The positioning module is configured to locate the image data in the current image frame based on the texture identification information and the texture coordinate information.
Optionally, the pattern recognition module 10 is further configured to perform pattern recognition on the current image frame by using a preset graphics API instruction stream to obtain a pattern recognition result;
The determining module 12 is further configured to determine the texture information corresponding to a text image in the current image frame from the pattern recognition result.
The terminal provided by the embodiment of the application performs pattern recognition on a current image frame to obtain texture information corresponding to a text image in the current image frame, and searches for text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process. If first preset text information corresponding to the texture information is found in the correspondence, the first preset text information is determined as the text information corresponding to the texture information. Thus, the terminal of this embodiment stores the text recognition results of previous frames; after obtaining the texture information corresponding to the text image in the current image frame, it looks up the corresponding text information directly, and when a match is found, no text recognition needs to be performed on each text image, so the text recognition speed is greatly improved.
FIG. 5 is a schematic diagram of a second composition structure of the terminal 1 according to the embodiment of the present application. In practical application, based on the same inventive concept as the above embodiment, as shown in FIG. 5, the terminal 1 of this embodiment includes a processor 13, a memory 14, and a communication bus 15.
In a specific embodiment, the pattern recognition module 10, the searching module 11, the determining module 12, the text recognition module, and the positioning module may be implemented by the processor 13 located on the terminal 1, where the processor 13 may be at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a CPU, a controller, a microcontroller, and a microprocessor. It will be appreciated that the electronics used to implement the above processor functions may differ for different devices, and the present embodiment is not specifically limited.
In the embodiment of the present application, the communication bus 15 is used to implement connection communication between the processor 13 and the memory 14, and the processor 13 implements the following text recognition method when executing the running program stored in the memory 14:
The method comprises: performing pattern recognition on a current image frame to obtain texture information corresponding to a text image in the current image frame; searching for text information corresponding to the texture information in the correspondence between preset texture information and text information, where the correspondence stores the text recognition results of previous frames in the text recognition process; and, if first preset text information corresponding to the texture information is found in the correspondence, determining the first preset text information as the text information corresponding to the texture information.
Further, the processor 13 is further configured to determine image data corresponding to the texture information from the current image frame if text information corresponding to the texture information is not found from a preset correspondence between the texture information and the text information, and perform text recognition on the image data to obtain text information corresponding to the image data.
Further, the processor 13 is further configured to update the mapping relationship between the texture information and the text information to the corresponding relationship between the preset texture information and the text information.
Further, the current image frame comprises at least one image texture, wherein each image texture corresponds to one texture identification information, and the texture information comprises texture identification information and texture coordinate information corresponding to the text image.
Further, the above processor 13 is further configured to locate the image data from the current image frame based on the texture identification information and the texture coordinate information.
Further, the processor 13 is further configured to perform pattern recognition on the current image frame by using a preset graphics API instruction stream to obtain a pattern recognition result, and determine the texture information corresponding to a text image in the current image frame from the pattern recognition result.
An embodiment of the present application provides a computer-readable storage medium on which one or more programs are stored; the one or more programs are executable by one or more processors and applied to a terminal, and when executed, implement the text recognition method described above.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described method may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware, though in many cases the former is the preferred embodiment. Based on such understanding, the technical solution of the present disclosure may be embodied, essentially or in the part contributing to the related art, in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), including several instructions for causing a device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the method described in the embodiments of the present disclosure.
The foregoing description is only of the preferred embodiments of the present application, and is not intended to limit the scope of the present application.