CN113468891A

CN113468891A - Text processing method and device

Info

Publication number: CN113468891A
Application number: CN202110853255.2A
Authority: CN
Inventors: 顾大中; 梁建增; 周梦迪; 王洪彬; 李楠; 乔建伟; 乔莉
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Current assignee: Ant Shengxin Shanghai Information Technology Co ltd
Priority date: 2021-07-27
Filing date: 2021-07-27
Publication date: 2021-10-01
Anticipated expiration: 2041-07-27
Also published as: CN113468891B; CN120524951A

Abstract

Embodiments of this specification provide a text processing method and a text processing device. The text processing method comprises: receiving a text to be processed, inputting the text to be processed into an entity recognition model, and obtaining a candidate text with an entity recognition tag; inputting the candidate text with the entity recognition tag into an entity discrimination model to obtain a candidate entity of the candidate text, and determining an alternative text based on the candidate entity; constructing a relation knowledge graph based on the candidate entities and the candidate texts, calculating the similarity between the nodes of the relation knowledge graph, and determining a target entity relationship; and determining a target entity text in a preset knowledge base based on the target entity relationship.

Description

Text processing method and device

Technical Field

The embodiment of the specification relates to the technical field of computers, in particular to a text processing method. One or more embodiments of the present specification also relate to a text processing apparatus, a computing device, and a computer-readable storage medium.

Background

With the progress of data processing technology and the rapid spread of the mobile internet, computer technology is widely applied in many fields of society. In a claim settlement project, when a recorder communicates with a user through an interview, a large amount of time is spent on recording the inquiry: the recorder must repeatedly confirm the collected information with the user and then manually fill it into a form of the claim settlement work system. As a result, the cost of manual recording is high, the project takes a long time to process, and processing efficiency is greatly reduced.

Disclosure of Invention

In view of this, the embodiments of the present specification provide a text processing method. One or more embodiments of the present specification also relate to a text processing apparatus, a computing device, and a computer-readable storage medium to address technical deficiencies in the prior art.

According to a first aspect of embodiments herein, there is provided a text processing method including:

receiving a text to be processed, inputting the text to be processed into an entity recognition model, and obtaining a candidate text with an entity recognition tag;

inputting the candidate text with the entity recognition tag into an entity discrimination model to obtain a candidate entity of the candidate text, and determining an alternative text based on the candidate entity;

constructing a relation knowledge graph based on the candidate entities and the candidate texts, calculating the similarity between the nodes of the relation knowledge graph, and determining a target entity relationship;

and determining a target entity text in a preset knowledge base based on the target entity relationship.

According to a second aspect of embodiments herein, there is provided a text processing apparatus including:

an entity recognition module configured to receive a text to be processed, input the text to be processed into an entity recognition model, and obtain a candidate text with an entity recognition tag;

an entity discrimination module configured to input the candidate text with the entity recognition tag into an entity discrimination model, obtain a candidate entity of the candidate text, and determine an alternative text based on the candidate entity;

an entity relationship determination module configured to construct a relation knowledge graph based on the candidate entities and the candidate texts, calculate the similarity between the nodes of the relation knowledge graph, and determine a target entity relationship;

and a target entity determination module configured to determine a target entity text in a preset knowledge base based on the target entity relationship.

According to a third aspect of embodiments herein, there is provided a computing device comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor implements the steps of the text processing method when executing the computer-executable instructions.

According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of any one of the text processing methods.

In one embodiment of this specification, a text to be processed is received and input into an entity recognition model to obtain a candidate text with an entity recognition tag; the candidate text with the entity recognition tag is input into an entity discrimination model to obtain a candidate entity of the candidate text, and an alternative text is determined based on the candidate entity; a relation knowledge graph is constructed based on the candidate entities and the candidate texts, the similarity between the nodes of the relation knowledge graph is calculated, and a target entity relationship is determined; and a target entity text is determined in a preset knowledge base based on the target entity relationship.

Specifically, the candidate text is determined by inputting the text to be processed into the entity recognition model, and the candidate entity is determined by inputting the candidate text into the entity discrimination model, which adds entities that are highly similar to the entities in the text to be processed. The target entity relationship is then determined by constructing a relation knowledge graph. This addresses both speech recognition errors and long-text dependence. The processed keyword information is displayed back in a floating form window of the inquiry record, which reduces the cost of manual filling and improves the efficiency of the overall video interview.

Drawings

FIG. 1 is a schematic interface diagram of a recorder filling in a form in an online claim settlement system to which the text processing method provided in an embodiment of this specification is applied;

FIG. 2 is a schematic flow diagram of a text processing method according to an embodiment of this specification;

FIG. 3 is a flowchart of a text processing method provided in an embodiment of this specification;

FIG. 4 is a schematic diagram of the processing performed by the entity recognition model of a text processing method according to an embodiment of this specification;

FIG. 5 is a schematic diagram illustrating a text to be scored in a text processing method according to an embodiment of this specification;

FIG. 6 is a schematic diagram of a relation knowledge graph between a text to be processed and entities in a text processing method according to an embodiment of this specification;

FIG. 7 is a schematic structural diagram of a text processing apparatus according to an embodiment of this specification;

FIG. 8 is a block diagram of a computing device according to an embodiment of this specification.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.

The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It will be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of this specification, a first may also be referred to as a second and, similarly, a second may also be referred to as a first. Depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to determining".

First, the terms involved in one or more embodiments of this specification are explained.

Insurance public estimation: the act of accepting the entrustment of a party to an insurance and independently carrying out evaluation, investigation, appraisal, loss estimation, calculation, and similar activities on the insured subject matter involved in an insured accident.

Interview survey: the investigator directly visits the respondents and listens to their opinions.

Video interview: an interview completed in the form of a remote video session.

Insurance video interview: a video interview applied to the insurance field, in which the investigator is an insurance public estimator and the respondent is an insurance applicant.

Intelligent summary: automatically extracts keyword information during the video interview session to help improve the efficiency of the whole online public estimation process.

ASR (Automatic Speech Recognition): a technology that converts a sound signal into a text signal.

NER (Named Entity Recognition): a technology that identifies entity nouns (address, time, disease, hospital, examination, etc.) in a piece of text.

Entity linking: associating an entity identified by NER with the corresponding entity in an existing knowledge base.

Relation extraction: identifying the relationships (treatment, location, etc.) between entities in a piece of text.

Text error correction: techniques to restore a noisy text (noise due to image recognition or speech recognition) to the correct text.

The text processing method provided by the embodiments of this specification can be applied to an insurance claim settlement scenario, an inquiry record scenario of a public security institution, or a hospital consultation scenario; the embodiments of this specification place no limitation on the application scenario. The method is described in detail below by taking the text processing of inquiry records in the insurance claim settlement scenario as an example.

In a scenario where an applicant applies for claim settlement, the traditional interview is conducted offline, and the recorder must travel to the location of the claiming user to carry it out, which is costly and inefficient. To reduce cost and improve efficiency, the recorder can follow up on the user's illness information through a video interview, which lowers the cost of offline public estimation investigation and improves overall claim settlement efficiency. However, because claim settlement projects are complex, a single remote video interview takes a long time. Throughout the interview the recorder communicates with the user and collects information, and a large amount of time is spent recording the inquiry: the recorder must frequently confirm details such as time, place, hospital, disease, and examination method with the user, and after confirmation manually fill them into the form of the claim settlement operation system.

However, because of the user's dialect and the network quality, the voice information received during the video interview produces considerable noise in the speech recognition module at the front of the pipeline, so the generated text contains errors and some words are misrecognized as words with similar pronunciation, such as "hospital" (医院, yīyuàn) and "music" (音乐, yīnyuè). For example, the user says "I was treated at Xiehe Hospital" (协和医院), and the ASR module recognizes it as "I was treated at Xiehe music", which adversely affects subsequent steps. In addition, the effective atomic information required by the business may be spread over multiple rounds of conversation. For example: A: "I went to ..."; B: "Which hospital did you go to?"; A: "The People's Hospital"; B: "Which People's Hospital?"; A: "The First". In this example, "... First People's Hospital" is the effective atomic information required by the business, and recognizing only the individual fragments is of no value to the business. However, this information does not exist in any single utterance; the information from all five utterances must be combined and spliced together.

The text processing method provided by the embodiments of this specification is based on a video interview intelligent summary system with ASR error correction (correction and retrieval) capability and long text parsing capability. Keyword information is automatically recognized from the real-time conversation text of the video interview and processed, which addresses speech recognition errors and long-text dependence. The processed keyword information is displayed back in a floating form window of the inquiry record, which reduces the cost of manual filling and improves the efficiency of the whole video interview.

In view of this, in the present specification, there is provided a text processing method, and the present specification simultaneously relates to a text processing apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.

Referring to FIG. 1, FIG. 1 shows a schematic diagram of the client display interface on which a recorder fills in keywords during an insurance claim interview, that is, a schematic interface diagram of a recorder filling in a form in an online claim settlement system to which the text processing method provided in an embodiment of this specification is applied.

FIG. 1 shows a recorder taking an interview record during a video interview between the recorder and a user. The left part A of FIG. 1 is the claim form to be completed during the video interview (containing information related to the insurance claim as spoken by the user). The upper right part B of FIG. 1 is the conversation record of the video interview after ASR converts the speech into text. The lower right part C of FIG. 1 is the intelligent summary keyword part, which shows the keyword information recognized from the current video interview conversation text (and which can be semi-automatically filled back into the claim form on the left).

When the recorder fills in form information in the online claim settlement system, the text converted from speech by ASR suffers from speech recognition errors and long-text dependence. To address this, this specification optimizes the noisy-text NER module, adds an ASR text error correction module and a long text parsing module, and processes the speech-converted text to improve the accuracy of keyword recognition. Referring to FIG. 2, FIG. 2 shows a flow diagram of a text processing method provided in an embodiment of this specification.

Step 202: the server converts the input speech signal into text information (including role information) in the ASR module.

Step 204: the server displays the converted text information in the conversation record, and performs question identification and topic detection on the conversation record.

Specifically, the text spoken by the recorder is input into the question identification model. If the model determines that the current text is a valid question, the topic detection module is called; the topic detection module determines which stage the current conversation has reached, such as asking about the life track or the work track. If the current text is determined not to be a valid question, the server stops the subsequent process.

Step 206: the server inputs the text information into the noisy-text NER module and identifies the entity information in the text information.

Specifically, the noisy-text NER model can identify entity information in the text even when the text contains a certain degree of error; the entities fall into five types: hospital, disease, address, time, and examination.

Step 208: the server passes the entity information through the ASR text error correction module to correct erroneous text to a certain degree.

Step 210: the server passes the text and the entity information through the long text parsing module and the relation extraction model to obtain the association relationships between pieces of entity information, and then uses the knowledge graph to screen out the correctly associated entity group.

Step 212: the server associates the associated entity group with entities in the existing knowledge base to determine the target entity information.

Specifically, this is the entity linking process, which associates an entity identified by NER (or an entity spliced together by the long text parsing module) with an entity in the existing knowledge base.

Step 214: the server displays the determined target entity information on the interview interface and, based on the identified question or topic, displays it in the text box to be filled.

Specifically, the intelligent summary system displays the identified entity keyword information at the lower right of the interface on which the recorder fills in the form, and prompts keyword information when the recorder is about to fill in the forms corresponding to different topics (because the question identification model detects questions under different topics, the keywords prompted in the form differ from topic to topic).

The text processing method provided by the embodiments of this specification determines the correct entity by correcting the entities identified in the text information, and extracts the relationships between entities from the semantic information of multiple rounds of conversation, which addresses both ASR misrecognition and long-text dependence.
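As an illustration only, the flow of steps 204 to 212 can be sketched as a chain of pluggable stages. This is a minimal sketch, not the implementation described in this specification: the function name `run_pipeline`, the `PipelineResult` type, and every callable passed in are hypothetical stand-ins for the question identification, topic detection, noisy-text NER, ASR error correction, and entity linking components, and the relation screening of step 210 is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PipelineResult:
    """Hypothetical container for what would be shown to the recorder (step 214)."""
    topic: Optional[str] = None
    entities: List[str] = field(default_factory=list)


def run_pipeline(
    text: str,
    is_valid_question: Callable[[str], bool],
    detect_topic: Callable[[str], Optional[str]],
    recognize_entities: Callable[[str], List[str]],
    correct_entity: Callable[[str], str],
    link_entity: Callable[[str], Optional[str]],
) -> Optional[PipelineResult]:
    """Chain steps 204-212 for one piece of ASR output (the result of step 202)."""
    # Step 204: stop early when the text is not a valid question.
    if not is_valid_question(text):
        return None
    topic = detect_topic(text)
    # Step 206: noisy-text NER.
    raw_entities = recognize_entities(text)
    # Step 208: ASR text error correction applied to each recognized entity.
    corrected = [correct_entity(entity) for entity in raw_entities]
    # Step 212: entity linking; entities missing from the knowledge base are dropped.
    linked = [hit for hit in (link_entity(entity) for entity in corrected) if hit is not None]
    return PipelineResult(topic=topic, entities=linked)
```

Passing the stages in as callables keeps the sketch independent of any particular model; in a real system each callable would wrap a trained model or a knowledge base lookup.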

The following will further describe the text processing method by taking the application of the text processing method provided in this specification to interview records as an example with reference to fig. 3. Fig. 3 is a flowchart illustrating a text processing method according to an embodiment of the present specification, and specifically includes the following steps.

Step 302: receiving a text to be processed, inputting the text to be processed into an entity recognition model, and obtaining a candidate text with an entity recognition tag.

In practical application, after the server receives the text to be processed, it needs to identify the entity information in that text. The text to be processed is therefore input into the entity recognition model, which identifies and marks the entities in the text, yielding the candidate text with an entity recognition tag.

The text to be processed can be obtained by processing the dialogue between the recorder and the user during the video interview; specifically, before receiving the text to be processed, the method further includes:

receiving voice dialogue information, recognizing the voice dialogue information through a speech recognition module, and determining the text to be processed.

The text to be processed can be understood as text information carrying role information and dialogue content.

In practical application, the recorder and the user establish a communication connection through the client. After receiving the voice dialogue information collected over this connection, the server inputs it into the speech recognition module, obtains the text information corresponding to the voice dialogue between the recorder and the user, and takes that text information as the text to be processed, so that it can be processed further and the accuracy of text recognition improved.

It should be noted that the speech recognition model is a model that recognizes voice information and obtains the corresponding text content, for example an ASR speech recognition model; this specification places no limitation on the speech recognition model.

According to the text processing method provided by the embodiments of this specification, the voice dialogue information is input into the speech recognition model for recognition, which facilitates subsequent processing of the recognized text content: entity keywords are obtained from the text content and can be quickly filled into the interview record while the recorder conducts the interview.
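One possible way to represent a text to be processed that carries both role information and dialogue content is sketched below. The `Utterance` type, the role names, and the `to_pending_text` helper are assumptions made for illustration; the specification does not prescribe a data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    role: str     # assumed role names, e.g. "recorder" or "user"
    content: str  # text produced by the speech recognition module


def to_pending_text(utterances: List[Utterance]) -> str:
    """Serialize utterances into one role-prefixed line per turn."""
    return "\n".join(f"{u.role}: {u.content}" for u in utterances)
```

Keeping the role next to each utterance lets later stages treat the recorder's questions and the user's answers differently.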

After the server obtains the text to be processed recognized by the speech recognition module, it can determine the topic type of the text through question identification and topic detection, so that different entity keyword information can subsequently be identified for different topic types; specifically, before the text to be processed is input into the entity recognition model, the method further includes:

inputting the text to be processed into a question identification model, and, when the question text in the text to be processed is determined to be valid, inputting the text to be processed into a topic detection model and determining the topic type corresponding to the text to be processed.

In practical application, the server inputs the text to be processed into the question identification model. When the question asked by the recorder in the text is determined to be valid, the topic detection model is called to perform topic detection on the text, and the topic type corresponding to the text is determined from the topic detection result. The topic detection model can judge which stage of the interview record form the current conversation has reached, such as asking about the life track or the work track, and thereby determines the topic type of the text to be processed.

It should be noted that the text to be processed by the server at one time is not the entire dialogue record between the recorder and the user, but the text information obtained at each stage within a certain period. For example, when the recorder is filling in a question about the place of medical treatment, the text to be processed may consist of five intercepted rounds of dialogue between the recorder and the user, from which the keywords for the place of medical treatment are subsequently identified. Determining the topic type of the text to be processed makes it easier to decide at which stage, or in which candidate box of the text box to be filled, the identified keyword information should be displayed.
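The idea of processing only the most recent rounds of dialogue, rather than the whole record, can be sketched with a bounded queue. The class name `DialogWindow` and the default of five rounds (taken from the example above) are illustrative assumptions.

```python
from collections import deque
from typing import Deque, List, Tuple

Round = Tuple[str, str]  # (recorder question, user answer)


class DialogWindow:
    """Keeps only the most recent rounds of dialogue as the text to be processed."""

    def __init__(self, max_rounds: int = 5) -> None:
        # A deque with maxlen silently discards the oldest round when full.
        self._rounds: Deque[Round] = deque(maxlen=max_rounds)

    def add_round(self, question: str, answer: str) -> None:
        self._rounds.append((question, answer))

    def pending_text(self) -> List[Round]:
        """Return the rounds currently inside the window, oldest first."""
        return list(self._rounds)
```

A fixed-size window is only one way to intercept a stage of the conversation; a system could equally cut the window when the detected topic changes.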

According to the text processing method provided by the embodiments of this specification, performing question identification and topic detection on the text to be processed determines the type of entity relation words that need to be identified in the text, so that keyword information corresponding to the interview question can be displayed quickly during the subsequent interview between the recorder and the user, improving interview efficiency.
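The gating behaviour described above, where topic detection runs only after a question is judged valid, can be sketched as follows. The keyword rules here are a deliberately simple stand-in for the trained question identification and topic detection models; the topic names, the keyword table, and the validity rule are all assumptions for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical topic keywords; a real system would use a trained topic detection model.
TOPIC_KEYWORDS: Dict[str, List[str]] = {
    "medical visit": ["hospital", "doctor", "treatment"],
    "work track": ["work", "company", "job"],
}


def is_valid_question(text: str) -> bool:
    """Toy validity rule: ends with a question mark and has at least three words."""
    stripped = text.strip()
    return stripped.endswith("?") and len(stripped.split()) >= 3


def detect_topic(text: str) -> Optional[str]:
    """Return the first topic whose keyword occurs in the text, if any."""
    lowered = text.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return None


def topic_of(text: str) -> Optional[str]:
    """Run topic detection only when the text is a valid question."""
    if not is_valid_question(text):
        return None
    return detect_topic(text)
```

The returned topic type could then select which form section, and which entity types, the later stages should fill.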

The text processing method provided by the embodiments of this specification trains the entity recognition model on noisy text to improve its recognition accuracy; specifically, the entity recognition model is obtained by training in the following way:

receiving a sample text to be processed, randomly determining a comparison sample text based on the sample text to be processed, and determining the sample text to be processed and the comparison sample text as a training sample set;

and training an entity recognition model based on the training sample set.

In practical applications, the entity recognition model must be able to obtain the target entity from noisy text, that is, from text containing errors. For example, given "I went to Xiehe music to see a doctor" (a misrecognition of "Xiehe Hospital", 协和医院), the entity recognition model needs to learn that "Xiehe music" is a hospital entity. If the target entity appears only in its correct form in all training data (such as "Xiehe Hospital"), the entity recognition model may fail to recognize the noisy text as the target entity. Based on this idea, the text processing method provided in this embodiment actively constructs noisy texts and indicates to the entity recognition model the position of the target entity in each noisy text, so that the model learns how to extract entities from noisy text. A comparison sample text can be determined at random in the knowledge base based on the sample text to be processed, and the sample text to be processed and the comparison sample text together form the training sample set used to train the entity recognition model.

For example, noisy enhanced data is constructed for the sentence "I went to Xiehe Hospital to see a doctor" as follows. First, the sentence is divided into two parts: the entity text "Xiehe Hospital" (协和医院) and the remaining text "I went to ... to see a doctor", which does not belong to any entity. No operation is performed on the non-entity text. The entity text is segmented into words, decomposing "Xiehe Hospital" into "Xiehe" (协和) and "Hospital" (医院). Next, the DimSim open source tool is used to randomly find a word that sounds similar to each segmented word; for example, a similar-sounding word for "Hospital" (医院, yīyuàn) is "music" (音乐, yīnyuè), and a similar-sounding word for "Xiehe" (协和, xiéhé) is "shoe box" (鞋盒, xiéhé). The similar-sounding words then replace the entity text in the original sentence, giving "I went to shoe box music to see a doctor". The position of the entity in the original sentence is retained, so "shoe box music" is still labeled as one entity. This completes one round of data enhancement. In practical application, each sentence in the training data is enhanced once, and the enhanced data is mixed with the original training data to form the noise-enhanced training data set.

In the text processing method provided by the embodiments of this specification, the training data of the entity recognition model is enhanced with noisy data before training, so that the model automatically learns the correspondence between semantic features and target entities, which improves the accuracy with which it recognizes text entities.
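The noise enhancement procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `SIMILAR_SOUND` table is a hypothetical stand-in for a lookup with a phonetic similarity tool such as DimSim, the Chinese strings are a back-translation assumed from the example in the text, and the entity is given pre-segmented rather than produced by a word segmenter.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, List[str], str]    # (text before entity, segmented entity, text after)
Labeled = Tuple[str, Tuple[int, int]]  # (sentence, [start, end) span of the entity)

# Hypothetical similar-sound table standing in for a phonetic similarity tool.
SIMILAR_SOUND: Dict[str, List[str]] = {
    "协和": ["鞋盒"],  # xiéhé -> xiéhé
    "医院": ["音乐"],  # yīyuàn -> yīnyuè
}


def augment(prefix: str, entity_words: List[str], suffix: str, rng: random.Random) -> Labeled:
    """Replace each entity word with a similar-sounding word and keep the entity span."""
    noisy_words = [rng.choice(SIMILAR_SOUND.get(word, [word])) for word in entity_words]
    noisy_entity = "".join(noisy_words)
    start = len(prefix)
    # The non-entity text is left untouched; only the entity span is rewritten.
    return prefix + noisy_entity + suffix, (start, start + len(noisy_entity))


def build_training_set(samples: List[Sample], seed: int = 0) -> List[Labeled]:
    """Mix each original sentence with one noise-enhanced copy of it."""
    rng = random.Random(seed)
    dataset: List[Labeled] = []
    for prefix, entity_words, suffix in samples:
        entity = "".join(entity_words)
        dataset.append((prefix + entity + suffix, (len(prefix), len(prefix) + len(entity))))
        dataset.append(augment(prefix, entity_words, suffix, rng))
    return dataset
```

For the sample `("我去", ["协和", "医院"], "看的病")`, the noisy copy still labels the replaced span as one entity, which is what lets the model learn to extract entities from erroneous text.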

After the entity recognition model is trained, it is used to recognize the text to be processed and determine the candidate text with an entity recognition tag; specifically, receiving a text to be processed, inputting the text to be processed into an entity recognition model, and obtaining a candidate text with an entity recognition tag includes:

receiving a text to be processed, inputting the text to be processed into a semantic recognition module of the entity recognition model, and obtaining a semantic vector of the text to be processed;

determining a pinyin vector of the text to be processed based on the semantic vector of the text to be processed, and inputting the semantic vector and the pinyin vector into a fully connected layer of the entity recognition model to calculate a loss function and obtain a loss value of the text to be processed;

and inputting the loss value of the text to be processed into a probability network layer of the entity recognition model to obtain a candidate text with an entity recognition tag.

Specifically, after receiving a text to be processed, the server inputs it into the trained entity recognition model. The first layer of the model, the semantic recognition module, produces the semantic vector of the text to be processed. The pinyin vector of the text is then determined based on the semantic vector, and the semantic vector and the pinyin vector are input together into the second layer, the fully connected layer, where the loss function is calculated to obtain the loss value of the text to be processed. Finally, the loss value is input into the third layer, the probability network layer, which labels the characters of the text to be processed and calculates the probability distribution of each label to determine the correct labels for the text, thereby determining an accurate candidate text with an entity recognition tag.

In practical application, referring to fig. 4, fig. 4 is a schematic processing process diagram of the entity recognition model of the text processing method provided by the embodiment of the present specification.

In FIG. 4, the text to be processed passes through the semantic recognition model to obtain its semantic vector; the semantic recognition model may be a BERT model used for feature extraction. The pinyin vector of the text to be processed is added on the basis of the obtained semantic vector, and the semantic vector and the pinyin vector are input into the fully connected layer to calculate the loss value. Finally, the calculated value is input into the probability network layer, which marks the entities identified in the text to be processed and assigns an entity label to each character: a character outside any entity is marked "O", the first character of an entity is marked "B", and the subsequent characters up to the last character of the entity are marked "I". For example, if the text to be processed is "went to Xiehe Hospital to see a doctor" (containing the entity "Xiehe Hospital", 协和医院), it is marked as "OBIIIO". The candidate text with entity recognition tags is thus obtained.
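The O/B/I labeling scheme described above can be sketched with two small helpers that convert between entity spans and per-character tags. The function names are illustrative; the tag semantics follow the description ("O" outside an entity, "B" for the first character, "I" for the remaining characters).

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # [start, end) character positions of one entity


def spans_to_bio(length: int, spans: List[Span]) -> List[str]:
    """Produce one tag per character from a list of entity spans."""
    tags = ["O"] * length
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover entity spans from a tag sequence; a stray 'I' with no 'B' is ignored."""
    spans: List[Span] = []
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))  # a new entity begins right after another
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))  # entity runs to the end of the text
    return spans
```

With a six-character text whose entity occupies positions 1 to 4, `spans_to_bio(6, [(1, 5)])` yields the tag string "OBIIIO" used in the example above.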

It should be noted that the pinyin vector comes from an open-source project that provides a two-dimensional encoding of pinyin, in which pinyin with similar pronunciation lie closer together in the two-dimensional space.

In the text processing method provided in the embodiments of the present specification, adding the pinyin vector to the semantic vector of the text to be processed exposes the phonetic similarity between an ASR error and the real text (for example, "music" and "hospital" are relatively close in pronunciation). As a result, text that other models might not regard as an entity word can still be recognized as the corresponding entity by the entity recognition model of this embodiment (for example, "collaborate with hospital" may be recognized as a hospital entity).
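One way to picture the combination of semantic and pinyin features is simple per-character concatenation. The sketch below assumes concatenation (the text only says the two vectors are input together), and the table `PINYIN_CODE`, its syllable names and its coordinates are invented for illustration; a real table would come from the open-source pinyin encoding mentioned above.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical 2-D pinyin codes. Names and coordinates are illustrative only.
PINYIN_CODE: Dict[str, Tuple[float, float]] = {
    "syl_a": (0.10, 0.20),
    "syl_b": (0.15, 0.25),   # assumed to sound similar to syl_a
    "syl_c": (0.90, 0.80),   # assumed to sound different from syl_a
}


def add_pinyin_features(semantic: Sequence[Sequence[float]],
                        syllables: Sequence[str]) -> List[List[float]]:
    """Append each character's 2-D pinyin code to its semantic vector."""
    if len(semantic) != len(syllables):
        raise ValueError("one syllable is required per character vector")
    return [list(vec) + list(PINYIN_CODE[syl])
            for vec, syl in zip(semantic, syllables)]


def pinyin_distance(a: str, b: str) -> float:
    """Euclidean distance between two syllables in the 2-D pinyin space."""
    return math.dist(PINYIN_CODE[a], PINYIN_CODE[b])


# Two characters with toy 2-D semantic vectors.
features = add_pinyin_features([[1.0, 2.0], [3.0, 4.0]], ["syl_a", "syl_c"])
print(features)
print(pinyin_distance("syl_a", "syl_b") < pinyin_distance("syl_a", "syl_c"))
```

Because similar-sounding syllables sit close together in the added dimensions, a misrecognized character still yields features near those of the intended character.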

Step 304: inputting the candidate text with the entity identification tag into an entity discrimination model to obtain a candidate entity of the candidate text, and determining an alternative text based on the candidate entity.

The candidate entity can be understood as an entity that corresponds to, and is similar to, the entity marked by the entity tag in the candidate text.

The alternative text can be understood as a text formed by embedding the determined candidate entity into the text to be processed.

In practical application, once the server has obtained the candidate text with the entity identification tag, it can determine which entities are present in the text to be processed, but the determined entities may be wrong. The candidate text with the entity identification tag therefore needs to be input into the entity discrimination model to determine the candidate entities of the candidate text, and the alternative text is then determined according to the candidate entities.

It should be noted that the entity discrimination model can be understood as an ASR text error correction module consisting mainly of two parts: first, recalling entities whose pronunciation is similar to that of a key entity (fuzzy-sound matches) and constructing a graph; second, correcting the key entities by using a deep model combined with the knowledge graph.

Further, the inputting the candidate text with the entity identification tag into an entity discrimination model to obtain a candidate entity of the candidate text includes:

inputting the candidate text with the entity identification tag into the entity discrimination model to determine an initial entity, converting the initial entity into an initial entity pinyin, and searching a preset knowledge base for alternative entity pinyin similar to the initial entity pinyin;

calculating the similarity between the initial entity pinyin and the alternative entity pinyin, sequencing based on the similarity and text attributes, and determining an entity sequence;

and determining candidate entities of the candidate texts according to a preset sequence threshold.

The initial entity may be understood as an entity corresponding to the entity tag in the candidate text identified according to the entity identification model.

Wherein, the alternative entity pinyin can be understood as the pinyin of an entity that is similar to the pinyin of the initial entity.

It should be noted that conventional error correction schemes generally use a language model, which corrects poorly when an error changes the length of the text or involves professional knowledge of a vertical field. In this embodiment, entities of the vertical field are introduced as an additional knowledge base and error correction is performed at entity granularity, which makes the scheme more robust to errors involving extra or missing characters and to errors in specific entities.

In a specific implementation, the server inputs the candidate text with the entity identification tag into the entity discrimination model to determine an initial entity. Because the accuracy of the initial entity is not high, the initial entity pinyin is determined from the initial entity, and alternative entity pinyin similar to the initial entity pinyin are searched for in the preset knowledge base. The similarity between the initial entity pinyin and each alternative entity pinyin is calculated, the entities are sorted based on the text attributes and the similarity to obtain an entity sequence, and the candidate entities of the candidate text are determined according to a preset sequence threshold.

In practical application, the server may use a full-text retrieval tool (Solr) to build an index over the pinyin of the entities in the preset knowledge base (e.g., "jinmen" - jingmen) for the subsequent recall of fuzzy-sound entities. The initial entity is converted to pinyin in the same way, and a retrieval interface is called with that pinyin to recall entities with similar pronunciation. Because some entities may recall several candidates, the recalled entities are then reordered: the server sorts them according to their pronunciation similarity to the initial entity and the degree of overlap between their context attributes and the dialogue context of the initial entity, and retains the candidate entities that fall within a given sequence threshold.
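The recall-and-rerank step can be sketched without a retrieval engine. In the sketch below, an in-memory dictionary stands in for the Solr index, pronunciation similarity is approximated by a normalised edit distance over pinyin strings, and ties are broken by attribute overlap; the names `KNOWLEDGE_BASE`, `recall_candidates`, the attribute labels and the choice of edit distance are all assumptions made for illustration, since the text does not specify how the two signals are combined.

```python
from typing import Dict, List, Set, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def pinyin_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the two pinyin strings are identical."""
    longest = max(len(a), len(b)) or 1
    return 1.0 - edit_distance(a, b) / longest


# Hypothetical knowledge base: entity name -> (pinyin, context attributes).
KNOWLEDGE_BASE: Dict[str, Tuple[str, Set[str]]] = {
    "kb_jingmen": ("jingmen", {"province_x"}),
    "kb_jinmen": ("jinmen", {"province_y"}),
    "kb_tianmen": ("tianmen", {"province_x"}),
}


def recall_candidates(initial_pinyin: str, context: Set[str],
                      kb: Dict[str, Tuple[str, Set[str]]],
                      top_k: int = 2) -> List[str]:
    """Rank entities by pinyin similarity, then context overlap; keep the top_k."""
    scored = []
    for name, (pinyin, attrs) in kb.items():
        sim = pinyin_similarity(initial_pinyin, pinyin)
        overlap = len(attrs & context)
        scored.append((sim, overlap, name))
    scored.sort(key=lambda t: (-t[0], -t[1], t[2]))
    return [name for _, _, name in scored[:top_k]]


print(recall_candidates("jinmen", {"province_x"}, KNOWLEDGE_BASE))
```

Here `top_k` plays the role of the preset sequence threshold. A weighted sum of the two scores would be an equally plausible reading of the ranking rule.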

In the text processing method provided in the embodiments of the present specification, alternative entity pinyin similar to the initial entity pinyin are found in the preset knowledge base, and the candidate entities most similar to the initial entity are selected by calculating pinyin similarity and context-attribute overlap. The initial entity can then be corrected by means of the candidate entities, which also facilitates the subsequent construction of the relationship graph for the initial entity.

In order to accurately correct the initial entity, an association can be established between the determined candidate entity and the initial entity, and the text with the corrected entity can then be determined. Specifically, determining the alternative text based on the candidate entity includes:

acquiring an initial entity of the candidate text, and determining an entity association relation between the initial entity and the candidate entity;

constructing a relationship graph based on the initial entity, the candidate entities and the entity incidence relation;

embedding the node entities of the relationship graph at the position of the initial entity in the text to be processed to determine texts to be scored, and determining the alternative text based on the texts to be scored.

Wherein, the relationship graph can be understood as a knowledge graph between the entities.

In practical application, after acquiring the initial entity of the candidate text, the server may determine the association attributes between the initial entity and the candidate entities and construct a relationship graph in which the initial entity and the candidate entities are the entity nodes and the associations between them are the entity edges. An entity edge may represent a relationship type between entities (similar pronunciation, province, city, etc.), and the relationship graph contains at least two entity nodes. Each node entity of the relationship graph is then embedded at the position of the initial entity in the text to be processed, which yields a plurality of texts to be scored, and the alternative text is determined from among them.
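A minimal sketch of the graph construction and substitution described above is given below. The helper names `build_relation_graph` and `texts_to_score`, the placeholder entities `ENT_A` to `ENT_C` and the sample sentence are assumptions for illustration; substitution is done with a plain string replacement, which replaces every occurrence of the initial entity.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (initial entity, candidate entity, relation type)


def build_relation_graph(initial: str,
                         candidates: List[Tuple[str, str]]) -> Tuple[List[str], List[Edge]]:
    """Nodes are the initial entity plus its candidates; edges carry the relation type."""
    nodes = [initial] + [name for name, _ in candidates]
    edges = [(initial, name, relation) for name, relation in candidates]
    return nodes, edges


def texts_to_score(text: str, initial: str, nodes: List[str]) -> Dict[str, str]:
    """Substitute each node entity for the initial entity to get one text per node."""
    return {node: text.replace(initial, node) for node in nodes}


nodes, edges = build_relation_graph(
    "ENT_A", [("ENT_B", "similar pronunciation"), ("ENT_C", "city")]
)
variants = texts_to_score("the visit took place in ENT_A last week", "ENT_A", nodes)
for entity, variant in variants.items():
    print(entity, "->", variant)
```

The initial entity is kept as a node so that the unmodified text is scored alongside the corrected variants.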

Further, the determining the alternative text based on the text to be scored includes:

inputting the text to be scored into a semantic recognition model for coding, and obtaining an initial entity vector of the text to be processed and a node entity vector of the relational graph;

and calculating the similarity of the initial entity vector and the node entity vector, and determining the alternative text.

In practical application, a pre-trained language model is used to encode the modified dialogue. During encoding, a vector representation of the entity to be corrected (the output at the [CLS] position of the model) and a vector representation of each candidate entity (the average of the word vectors of all characters in the entity) are obtained. The cosine similarity between each candidate entity vector and the vector of the entity to be corrected is calculated, and the entity with the highest score is taken as the correction result, thereby determining the alternative text.
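The scoring rule (mean-pooled candidate vector, cosine similarity, highest score wins) can be sketched as follows. The vectors are toy values standing in for the outputs of the pre-trained language model, and the names `mean_pool`, `cosine` and `pick_correction` are assumptions introduced for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def mean_pool(char_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-character vectors of one entity."""
    n = len(char_vectors)
    return [sum(column) / n for column in zip(*char_vectors)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_correction(target_vec: Sequence[float],
                    candidates: Dict[str, Sequence[Sequence[float]]]) -> Tuple[str, float]:
    """Return the candidate whose pooled vector is closest to the target vector."""
    scores = {name: cosine(target_vec, mean_pool(vectors))
              for name, vectors in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy vectors: target_vec stands in for the [CLS] output of the encoder.
target_vec = [1.0, 0.0]
candidate_vectors = {
    "cand_a": [[1.0, 0.0], [3.0, 0.0]],
    "cand_b": [[0.0, 1.0], [0.0, 2.0]],
}
print(pick_correction(target_vec, candidate_vectors))
```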

Refer to FIG. 5, which is a schematic diagram of scoring a text to be scored in the text processing method provided in an embodiment of the present specification.

In FIG. 5, the text to be scored is "Qianjiang and kayah, and the subsidiary of City A (region Qianjiang) is … …". Scoring assigns a sentence score of 0.1 to the text containing the initially recognized entity and a sentence score of 0.9 to the text containing the similar-sounding candidate entity (both are rendered as "Qianjiang" here). The candidate entity is therefore considered the more accurate one, and the entity in the text to be scored is corrected by replacing the initial entity with the candidate entity.

Step 306: and constructing a relation knowledge graph based on the candidate entities and the candidate texts, calculating the similarity between nodes of each relation knowledge graph based on the relation knowledge graph, and determining the relation of the target entities.

In practical application, after the server determines the candidate text of the text to be processed, a relationship knowledge graph can be constructed based on the candidate entity and the candidate text, the similarity between nodes of each relationship knowledge graph is calculated based on the relationship knowledge graph, and then the target entity relationship is determined.

Refer to FIG. 6, which illustrates a relationship knowledge graph between the text to be processed and its entities in the text processing method provided by an embodiment of the present specification.

Taking five sentences in the text to be processed as an example, with the entities "a place", "first hospital", "people hospital", and "collaboration and hospital", the relationship knowledge graph is constructed as follows. The five sentences are all connected to one another; in addition, sentence 1 is connected to "a place", sentence 5 to "first hospital", sentence 3 to "people hospital", and sentence 4 to "collaboration and hospital". It should be noted that the relationships shown in FIG. 6 are the initialized entity associations. When the similarity between the nodes of the relationship knowledge graph is subsequently calculated, further connections can be added, for example between "a place" and sentence 3 or sentence 2, so that the association weight between entity nodes can be determined, the degree of similarity between two entity nodes can be derived from it, and the target entity relationship can finally be determined.

In order to further determine the association between the entities, the similarity between entity nodes can be calculated, from which the target entity relationship of the text to be processed is determined. Specifically, calculating the similarity between the nodes of the relationship knowledge graph based on the relationship knowledge graph and determining the target entity relationship includes:

inputting the candidate entity and the candidate text in the relation knowledge graph into a semantic recognition model for coding to obtain a candidate entity vector and a candidate text vector;

inputting the candidate entity vector and the candidate text vector into a convolution algorithm model for feature extraction, and determining a candidate entity node vector and a candidate text node vector;

calculating the similarity of the candidate entity node vector and the candidate text node vector, and determining a target entity relationship based on a preset similarity threshold.

In practical application, each sentence and each entity can be regarded as a node, and each can be encoded with a BERT model to obtain the vector representation of its node. An edge can be connected between any two sentence nodes, and if an entity belongs to a sentence, an edge can be connected between that entity and that sentence. After the relationship knowledge graph is constructed, feature-extracted node vectors can be obtained with a graph convolution (GCN) algorithm. For each pair of nodes, the similarity between the node vectors is calculated; if the similarity is higher than a threshold (set manually according to experience), an association is considered to exist between the two entities, and this relationship is determined to be a target entity relationship.
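The propagate-then-compare procedure can be sketched with a single graph-convolution step in pure Python. This is an illustration under stated assumptions: it uses the common symmetric normalisation D^(-1/2)(A + I)D^(-1/2)X with no learned weight matrix and no non-linearity, the node features are toy values rather than BERT outputs, and the names `gcn_layer` and `related_pairs` and the threshold 0.9 are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def gcn_layer(features: Sequence[Sequence[float]],
              edges: Sequence[Tuple[int, int]]) -> List[List[float]]:
    """One propagation step: D^(-1/2) (A + I) D^(-1/2) X, without learned weights."""
    n, dim = len(features), len(features[0])
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # self-loops
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0                                       # undirected
    deg = [sum(row) for row in adj]
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if adj[i][j]:
                weight = adj[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(dim):
                    row[k] += weight * features[j][k]
        out.append(row)
    return out


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_pairs(vectors: Sequence[Sequence[float]], names: Sequence[str],
                  threshold: float) -> List[Tuple[str, str]]:
    """Return every pair of nodes whose cosine similarity exceeds the threshold."""
    return [(names[i], names[j])
            for i, j in combinations(range(len(names)), 2)
            if cosine(vectors[i], vectors[j]) > threshold]


# Two sentence nodes joined to each other, each joined to the entity it contains.
names = ["sentence_1", "sentence_2", "entity_a", "entity_b"]
toy_features = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1], [0.1, 0.9]]
toy_edges = [(0, 1), (0, 2), (1, 3)]
node_vectors = gcn_layer(toy_features, toy_edges)
print(related_pairs(node_vectors, names, threshold=0.9))
```

In practice the threshold would be set empirically, as the text notes, and a trained GCN would apply a weight matrix and an activation at each layer.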

Step 308: and determining a target entity text in a preset knowledge base based on the target entity relationship.

The preset knowledge base can be understood as a knowledge entity base accumulated according to different application scenarios, which is not limited in this specification.

In specific implementation, the server screens out an entity text matched with the target entity relation from a preset knowledge base based on the target entity relation determined in the relation knowledge graph, and further determines the matched entity text as the target entity text, so that the target entity text can be conveniently used as the keyword information of the text to be processed in the follow-up process.

For example, suppose a set of entity relationship strings "place A - people's hospital - first hospital" is determined in the relationship knowledge graph. The entities are spliced into the character string "place A people's hospital first hospital", which is then used to search the relationship knowledge graph. Suppose the search returns "the first affiliated people's hospital of place A transportation university". According to the knowledge in the relationship knowledge graph, a geographical location relationship string (China - place A - region - the first affiliated people's hospital of place A transportation university) can also be obtained. Each entity in the original relationship string (place A - people's hospital - first hospital) is then compared with the entities in the geographical location relationship string. If every entity finds a matching entity (e.g., "place A" matches "place A", and "people's hospital" and "first hospital" each match "the first affiliated people's hospital of place A transportation university"), the relationship string (place A - people's hospital - first hospital) is considered acceptable. Otherwise, the relationship string is regarded as an extraction error and is not output to subsequent modules. Any suitable matching method may be chosen (such as character string similarity or longest common subsequence length).
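The acceptance check can be sketched with longest-common-subsequence matching, one of the matching methods the text mentions. The names `lcs_length`, `entity_matches` and `is_acceptable`, the `min_ratio` parameter and the English placeholder strings are assumptions for illustration; with `min_ratio=1.0` an entity matches only if all of its characters appear, in order, in some knowledge-base entity.

```python
from typing import Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def entity_matches(entity: str, kb_entity: str, min_ratio: float = 1.0) -> bool:
    """True if enough of `entity` is recovered, in order, inside `kb_entity`."""
    if not entity:
        return False
    return lcs_length(entity, kb_entity) / len(entity) >= min_ratio


def is_acceptable(relation_string: Sequence[str], kb_chain: Sequence[str],
                  min_ratio: float = 1.0) -> bool:
    """Accept the relation string only if every entity matches some chain entity."""
    return all(any(entity_matches(entity, kb_entity, min_ratio) for kb_entity in kb_chain)
               for entity in relation_string)


# Placeholder chain standing in for the geographical location relationship string.
kb_chain = ["China", "place A", "region",
            "place A transportation university first affiliated people hospital"]
print(is_acceptable(["place A", "people hospital", "first hospital"], kb_chain))
print(is_acceptable(["place A", "Q7"], kb_chain))
```

Lowering `min_ratio` makes the check more tolerant of residual recognition errors at the cost of accepting more spurious relation strings.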

Further, after determining the target entity text based on the target entity relationship in the preset knowledge base, the method further includes:

determining a user to-be-input box based on the topic type corresponding to the to-be-processed text, and displaying the target entity text in the user to-be-input box.

In practical application, after the server identifies the corresponding topic type of the text to be processed, the user box to be input in the interview record of the recording personnel can be determined, and the target entity text obtained by the server is displayed in the user box to be input.

In summary, in the text processing method provided in the embodiments of the present specification, the noisy-text NER module gains the ability to recognize entities in noisy text through data enhancement and added pinyin features, so it can tolerate a certain degree of text recognition errors; in addition, some recognition errors of the ASR module can be recovered, so that the processed text is closer to the real text. Combining the two greatly increases the robustness of the system to ASR recognition errors. Furthermore, for the long-text dependence problem, the long-text analysis module can analyze the semantic information of multiple dialogue segments simultaneously and extract the relations between entities; by combining the information of the knowledge graph, the module can then splice together the entities that belong to the same atomic information, so that atomic information scattered across multiple dialogue segments is reassembled and the long-text dependence problem is solved.

Corresponding to the above method embodiment, this specification further provides a text processing apparatus embodiment, and fig. 7 shows a schematic structural diagram of a text processing apparatus provided in an embodiment of this specification. As shown in fig. 7, the apparatus includes:

an entity identification module 702 configured to receive a text to be processed, input the text to be processed into an entity identification model, and obtain a candidate text with an entity identification tag;

an entity discriminating module 704 configured to input the candidate text with the entity identification tag into an entity discriminating model, obtain a candidate entity of the candidate text, and determine an alternative text based on the candidate entity;

an entity relationship determination module 706 configured to construct a relationship knowledge graph based on the candidate entities and the candidate texts, calculate similarity between each relationship knowledge graph node based on the relationship knowledge graph, and determine a target entity relationship;

a target entity determination module 708 configured to determine a target entity text in a preset knowledge base based on the target entity relationship.

Optionally, the entity recognition model is obtained by training as follows:

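receiving a sample text to be processed, randomly determining a comparison sample text based on the sample text to be processed, and determining the sample text to be processed and the comparison sample text as a training sample set;

and training an entity recognition model based on the training sample set.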

Optionally, the entity identification module 702 is further configured to:

Optionally, the entity discriminating module 704 is further configured to:

Optionally, the entity relationship determination module 706 is further configured to:

Optionally, the target entity determining module 708 is further configured to:

determining a target entity in the relationship knowledge graph based on a target entity relationship, and splicing the target entity to obtain an entity character string;

searching in the relation knowledge graph based on the entity character string to obtain a matching entity text, and searching in a preset relation knowledge graph based on the entity character string to obtain a comparison entity text;

under the condition that the matching entity text is matched with the comparison entity text, determining the matching entity text as an entity text to be selected;

and associating the entity texts to be selected in a preset knowledge base, and taking the associated texts as target entity texts.

Optionally, the apparatus further comprises:

The text processing device provided by the embodiments of the present specification determines the candidate text by inputting the text to be processed into the entity recognition model, determines candidate entities by inputting the candidate text into the entity discrimination model so as to add entities highly similar to the entities in the text to be processed, and determines the target entity relationship by constructing the relationship knowledge graph. This addresses speech recognition errors and the long-text dependence problem; the processed keyword information is displayed back in the floating form window of the inquiry record, which reduces the cost of manual filling and improves the overall efficiency of video interviews.

The above is a schematic scheme of a text processing apparatus of the present embodiment. It should be noted that the technical solution of the text processing apparatus and the technical solution of the text processing method belong to the same concept, and details that are not described in detail in the technical solution of the text processing apparatus can be referred to the description of the technical solution of the text processing method.

FIG. 8 illustrates a block diagram of a computing device 800, according to one embodiment of the present description. The components of the computing device 800 include, but are not limited to, memory 810 and a processor 820. The processor 820 is coupled to the memory 810 via a bus 830, and the database 850 is used to store data.

Computing device 800 also includes access device 840, which enables computing device 800 to communicate via one or more networks 860. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. Access device 840 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)), whether wired or wireless, such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the present description, the above-described components of computing device 800, as well as other components not shown in FIG. 8, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 8 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.

Computing device 800 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 800 may also be a mobile or stationary server.

Wherein processor 820 is configured to execute computer-executable instructions, and the steps of the text processing method are implemented when the processor executes the computer-executable instructions.

The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the text processing method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the text processing method.

An embodiment of the present specification also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the text processing method.

The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the text processing method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the text processing method.

The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.

It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims

1. A text processing method, comprising:

receiving a text to be processed, inputting the text to be processed into an entity recognition model, and obtaining a candidate text with an entity recognition label;

inputting the candidate text with the entity recognition label into an entity discrimination model, obtaining a candidate entity of the candidate text, and determining an alternative text based on the candidate entity;

constructing a relational knowledge graph based on the candidate entity and the candidate text, calculating the similarity between the nodes of the relational knowledge graph based on the relational knowledge graph, and determining a target entity relationship;

determining a target entity text in a preset knowledge base based on the target entity relationship.

2. The text processing method according to claim 1, wherein the entity recognition model is obtained by training in the following manner:

Receive the sample text to be processed, randomly determine the comparison sample text based on the sample text to be processed, and determine the sample text to be processed and the comparison sample text as a training sample set;

The entity recognition model is trained based on the training sample set.

3. The text processing method according to claim 2, wherein the text to be processed is received, the text to be processed is input into the entity recognition model, and the candidate text with the entity recognition label is obtained, comprising:

Receive the text to be processed, input the text to be processed into the semantic recognition module of the entity recognition model, and obtain the semantic vector of the text to be processed;

Determine the pinyin vector of the to-be-processed text based on the semantic vector of the to-be-processed text, input the semantic vector and the pinyin vector into the fully connected layer of the entity recognition model to calculate the loss function, and obtain the loss value;

Input the loss value of the text to be processed into the probability network layer of the entity recognition model to obtain candidate texts with entity recognition labels.

4. The text processing method according to claim 3, wherein the input of the candidate text with the entity recognition label into an entity discrimination model to obtain the candidate entity of the candidate text comprises:

inputting the candidate text with the entity recognition label into the entity discrimination model to determine an initial entity, converting the initial entity into an initial entity pinyin, and searching the preset knowledge base for alternative entity pinyin similar to the initial entity pinyin;

Calculate the similarity between the initial entity pinyin and the alternative entity pinyin, and sort based on the similarity and text attributes to determine an entity sequence;

The candidate entity of the candidate text is determined according to a preset sequence threshold.

5. The text processing method according to any one of claims 1-4, wherein the determining the alternative text based on the candidate entity comprises:

obtaining the initial entity of the candidate text, and determining the entity association relationship between the initial entity and the candidate entity;

constructing a relationship graph based on the initial entity, the candidate entity and the entity association relationship;

embedding the node entities of the relationship graph at the position of the initial entity in the text to be processed to determine a text to be scored, and determining the alternative text based on the text to be scored.

6. The text processing method according to claim 5, wherein the determining the alternative text based on the text to be scored comprises:

inputting the text to be scored into the semantic recognition model for encoding, and obtaining the initial entity vector of the text to be processed and the node entity vector of the relationship graph;

calculating the similarity between the initial entity vector and the node entity vector, and determining the alternative text.

7. The text processing method according to claim 6, wherein calculating the similarity between each relational knowledge graph node based on the relational knowledge graph, and determining the target entity relationship, comprising:

inputting the candidate entity and the candidate text in the relational knowledge graph into a semantic recognition model for encoding to obtain a candidate entity vector and a candidate text vector;

inputting the candidate entity vector and the candidate text vector into a convolution algorithm model for feature extraction, and determining a candidate entity node vector and a candidate text node vector;

Calculate the similarity between the candidate entity node vector and the candidate text node vector, and determine the target entity relationship based on a preset similarity threshold.
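As a rough illustration of claim 7, the Python sketch below applies one simplified neighbourhood-averaging step in place of the convolution algorithm model and then thresholds cosine similarity. This is a stand-in under stated assumptions: a trained graph convolution model would also apply learned weights and a nonlinearity, and the claims specify neither the averaging rule nor cosine similarity. All names and data are hypothetical.

```python
import math


def graph_conv_step(features, adjacency):
    """Replace each node vector with the mean of itself and its neighbours."""
    updated = {}
    for node, vector in features.items():
        group = [vector] + [features[n] for n in adjacency.get(node, [])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def target_relations(entity_nodes, text_nodes, node_vectors, threshold):
    """Keep (entity, text) pairs whose node vectors reach the similarity threshold."""
    return [
        (entity, text)
        for entity in entity_nodes
        for text in text_nodes
        if cosine(node_vectors[entity], node_vectors[text]) >= threshold
    ]
```

Averaging over neighbours pulls connected entity and text nodes toward each other, so linked pairs tend to clear the threshold while unconnected ones do not.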

8. The text processing method according to claim 7, wherein the determining the target entity text in a preset knowledge base based on the target entity relationship comprises:

determining target entities in the relational knowledge graph based on the target entity relationship, and splicing the target entities to obtain an entity string;

searching the relational knowledge graph based on the entity string to obtain a matching entity text, and searching a preset relational knowledge graph based on the entity string to obtain a comparison entity text;

when it is determined that the matching entity text and the comparison entity text match, determining the matching entity text as a candidate entity text;

associating the candidate entity text in the preset knowledge base, and using the associated text as the target entity text.
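The lookup-and-compare flow of claim 8 can be sketched in Python as follows. The sketch assumes both graphs and the knowledge base can be queried as dictionaries keyed by string, and it treats a match as exact equality; the claims do not fix either choice. The function name, the `separator` parameter, and the sample data are hypothetical.

```python
def resolve_target_entity_text(target_entities, graph_index, preset_graph_index,
                               knowledge_base, separator=""):
    """Splice entities, look the string up in both graphs, and map a confirmed
    match to its associated knowledge-base text. Returns None if unconfirmed."""
    entity_string = separator.join(target_entities)
    matching_text = graph_index.get(entity_string)
    comparison_text = preset_graph_index.get(entity_string)
    # Only a text found in both graphs, with equal values, is accepted.
    if matching_text is None or matching_text != comparison_text:
        return None
    return knowledge_base.get(matching_text)
```

Returning `None` on a mismatch leaves it to the caller to decide whether to fall back to manual entry.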

9. The text processing method according to claim 1, further comprising, before the receiving the text to be processed:

receiving voice dialogue information, and recognizing the voice dialogue information by a voice recognition module to determine the text to be processed.

10. The text processing method according to claim 9, further comprising, before the text to be processed is input into the entity recognition model:

inputting the text to be processed into a question recognition model, and, when it is determined that the question text in the text to be processed is valid, inputting the text to be processed into a topic detection model to determine a topic type corresponding to the text to be processed.

11. The text processing method according to claim 10, further comprising, after the target entity text is determined in the preset knowledge base based on the target entity relationship:

determining, based on the topic type corresponding to the text to be processed, an input box to be filled in by the user, and displaying the target entity text in that input box.
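The form-filling step of claim 11 could be approximated by the Python sketch below. It assumes the form is a dictionary of input boxes and that a hypothetical `topic_to_box` mapping associates each topic type with one box. The topic types and box names shown are invented for illustration.

```python
def fill_input_box(form, topic_to_box, topic_type, target_entity_text):
    """Return a copy of the form with the target entity text placed in the box
    selected by the topic type; the copy is unchanged if no box is mapped."""
    updated = dict(form)
    box = topic_to_box.get(topic_type)
    if box is not None:
        updated[box] = target_entity_text
    return updated
```

Working on a copy keeps the original form intact, so a recorder can still review the proposed value before it is accepted.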

12. A text processing apparatus, comprising:

an entity recognition module, configured to receive a text to be processed, input the text to be processed into an entity recognition model, and obtain a candidate text with an entity recognition label;

an entity discrimination module, configured to input the candidate text with the entity recognition label into an entity discrimination model, obtain a candidate entity of the candidate text, and determine the candidate text based on the candidate entity;

an entity relationship determination module, configured to construct a relational knowledge graph based on the candidate entity and the candidate text, calculate the similarity between nodes of the relational knowledge graph based on the relational knowledge graph, and determine a target entity relationship;

a target entity determination module, configured to determine a target entity text in a preset knowledge base based on the target entity relationship.
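The four modules of claim 12 can be pictured as a simple pipeline. The Python sketch below is only a structural illustration: each module is represented by a caller-supplied function, and the class name, field names, and data shapes passed between stages are assumptions rather than anything defined in the specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TextProcessingApparatus:
    # Each field stands in for one module; real modules would wrap trained models.
    recognize: Callable[[str], str]
    discriminate: Callable[[str], Tuple[List[str], List[str]]]
    relate: Callable[[List[str], List[str]], List[Tuple[str, str]]]
    resolve: Callable[[List[Tuple[str, str]]], str]

    def process(self, text: str) -> str:
        """Run the four stages in the order listed in the claim."""
        labelled_text = self.recognize(text)
        entities, texts = self.discriminate(labelled_text)
        relations = self.relate(entities, texts)
        return self.resolve(relations)
```

Keeping the stages as separate callables mirrors the modular wording of the claim and lets any one stage be swapped without touching the others.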

13. A computing device, comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, wherein, when the processor executes the computer-executable instructions, the steps of the text processing method of any one of claims 1-11 are implemented.

14. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the text processing method of any one of claims 1-11.